elastic / elasticsearch-rs

Official Elasticsearch Rust Client
https://www.elastic.co/guide/en/elasticsearch/client/rust-api/current/index.html
Apache License 2.0

[DOCS] How to use the Scroll API? #67

Closed DevQps closed 4 years ago

DevQps commented 4 years ago

Please describe the documentation: Currently the Scroll API documentation does not provide any examples. It would be great if some could be provided, so that the API is easier for beginners to use.

Describe where the documentation should go: The Scroll struct should be documented with examples of how to use it.

russcam commented 4 years ago

Here's an example of how to use the scroll API:

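use elasticsearch::{ClearScrollParts, ScrollParts, SearchParts};
use serde_json::{json, Value};

// Print the id, source and score of each hit in a batch.
// Note: the unwraps panic if a hit has no string _id or no numeric _score.
fn print_hits(hits: &[Value]) {
    for hit in hits {
        println!(
            "id: {}, source: '{:?}', score: {}",
            hit["_id"].as_str().unwrap(),
            hit["_source"],
            hit["_score"].as_f64().unwrap()
        );
    }
}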

// The time out for how long Elasticsearch will keep the scroll alive.
// Set to a reasonable amount that gives you enough time to process
// a scroll response
let scroll = "1m";

let mut response = client
    .search(SearchParts::Index(&["tweets"]))
    .scroll(scroll)
    .body(json!({
        "query": {
            "match": {
                "message": "Elasticsearch rust client"
            }
        }
    }))
    .send()
    .await?;

let mut response_body = response.read_body::<Value>().await?;
let mut scroll_id = response_body["_scroll_id"].as_str().unwrap();
let mut hits = response_body["hits"]["hits"].as_array().unwrap();

print_hits(hits);

// While hits are returned, keep asking for the next batch
while !hits.is_empty() {
    response = client
        // The scroll_id can be passed as part of the parts or with scroll_id,
        // both of which end up in the URI, or in the .body() as demonstrated here
        .scroll(ScrollParts::None)
        .body(json!({
            "scroll": scroll,
            "scroll_id": scroll_id
        }))
        .send()
        .await?;

    response_body = response.read_body::<Value>().await?;
    // get the scroll_id from this response
    scroll_id = response_body["_scroll_id"].as_str().unwrap();
    hits = response_body["hits"]["hits"].as_array().unwrap();
    print_hits(hits);
}

// tell Elasticsearch that we're finished with this scroll.
// The scroll would be cleared when the scroll timeout is reached,
// but it's better to free it as soon as it's no longer needed
response = client
    .clear_scroll(ClearScrollParts::None)
    .body(json!({
        "scroll_id": [scroll_id]
    }))
    .send()
    .await?;
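
The control flow above (first page from the search, keep scrolling with the latest scroll_id until an empty page comes back, then clear) can be sketched without a cluster. This is only a minimal sketch using the standard library: Page, drain_scroll and the fake backend in main are hypothetical names for illustration and are not part of the client.

```rust
// Hypothetical stand-in for the parts of a scroll response the loop needs.
struct Page {
    scroll_id: String,
    hits: Vec<String>,
}

// Drives a scroll to completion: handle the current page, then fetch the
// next one with the most recent scroll_id, until an empty page is returned.
// Returns the last scroll_id so the caller can clear the scroll.
fn drain_scroll<F, H>(first: Page, mut next_page: F, mut handle: H) -> String
where
    F: FnMut(&str) -> Page,
    H: FnMut(&[String]),
{
    let mut page = first;
    while !page.hits.is_empty() {
        handle(&page.hits);
        page = next_page(&page.scroll_id);
    }
    page.scroll_id
}

fn main() {
    // Fake backend (an assumption for the demo): three documents, two per page.
    let docs: Vec<String> = (1..=3).map(|i| format!("doc-{}", i)).collect();
    let mut offset: usize = 2;
    let first = Page {
        scroll_id: "scroll-0".to_string(),
        hits: docs[..2].to_vec(),
    };

    let mut seen: Vec<String> = Vec::new();
    let last_id = drain_scroll(
        first,
        |_scroll_id| {
            // A real implementation would send the scroll request here.
            let end = (offset + 2).min(docs.len());
            let hits = docs[offset..end].to_vec();
            offset = end;
            Page { scroll_id: format!("scroll-{}", offset), hits }
        },
        |hits| seen.extend_from_slice(hits),
    );

    println!("processed {} hits, scroll to clear: {}", seen.len(), last_id);
}
```

Returning the last scroll_id keeps the clear step explicit at the call site, mirroring the clear_scroll call in the example above.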

Similar to the reasoning in #66, #64 will generate Rust examples for all the examples in the elastic.co reference documentation. Once these doc examples are implemented, I think we could look at weaving them into docs.rs, tackling the problem more generally.

russcam commented 4 years ago

There are plans to wrap this functionality up into a scroll_helper in https://github.com/elastic/elasticsearch-rs/issues/63, which would also be able to slice a scroll, allowing for concurrent scrolling.
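
Until such a helper exists, a scroll can be sliced by hand: Elasticsearch accepts a "slice" object with "id" and "max" in the initial search body, and each slice then gets its own scroll_id that can be scrolled independently. As a rough sketch only, the per-slice bodies could be built like this. sliced_search_body is a hypothetical function, not part of the client, and it assumes the query is already serialized JSON; in practice the json! macro would do this job.

```rust
// Builds the initial search body for one slice of a sliced scroll.
// `query_json` is assumed to be an already serialized JSON query object.
fn sliced_search_body(slice_id: u32, max_slices: u32, query_json: &str) -> String {
    assert!(slice_id < max_slices, "slice id must be below max");
    format!(
        r#"{{"slice":{{"id":{},"max":{}}},"query":{}}}"#,
        slice_id, max_slices, query_json
    )
}

fn main() {
    let query = r#"{"match":{"message":"Elasticsearch rust client"}}"#;
    // One body per slice; each would be sent as its own search with .scroll(..)
    // and then scrolled separately, for example one task per slice.
    for id in 0..2 {
        println!("{}", sliced_search_body(id, 2, query));
    }
}
```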

DevQps commented 4 years ago

Thanks a lot for your response! I will try this example again tomorrow. Great to hear about #64; I think that will improve the docs a lot!