ropensci / elastic

R client for the Elasticsearch HTTP API

Scroll returns hits with scroll id before scrolling #229

Closed Jensxy closed 5 years ago

Jensxy commented 5 years ago

Scroll is returning hits on initial search, where I was expecting it to only return a _scroll_id.

When I execute res <- elastic::Search(index = "my_index", body = body, time_scroll = "3m", size = 1000), I expect length(res$hits$hits) to be zero. Since the package update, however, it is no longer zero.

Before the update, length(res$hits$hits) was zero and I had to scroll first to get hits. See the following code.

scrollID <- res$`_scroll_id`
res    <- scroll(scroll_id = scrollID)

What can I do so that scroll is not returning hits on initial search? Is that even possible? Otherwise I have to rewrite my complete code.

sckott commented 5 years ago

thx for the issue @Jensxy

if you try the examples in the elasticsearch docs you do get hits on the initial call with _search (same as what's called in elastic::Search()). So I think that's what's supposed to happen.

> What can I do so that scroll is not returning hits on initial search?

i don't think it's possible. I'm not sure why you'd want this?

Jensxy commented 5 years ago

> I'm not sure why you'd want this?

With ES 2.4 and the old elastic package, the initial search returned no hits by default. I have since upgraded ES to 6.3.

I know that the default settings have changed. I thought there might be a way to get the same behavior (no hits on the initial search) in ES 6.3 as well.

Also, the following example from the package does not return all results.

# Get all results - one approach is to use a while loop
res <- Search(index = 'shakespeare', q="a*", time_scroll="5m",
  body = '{"sort": ["_doc"]}')
out <- list()
hits <- 1
while(hits != 0){
  res <- scroll(res$`_scroll_id`)
  hits <- length(res$hits$hits)
  if(hits > 0)
    out <- c(out, res$hits$hits)
}

This example does not return all results: the hits from the initial search are never added to out, so the example needs to be fixed.
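For illustration, here is a minimal sketch of a loop that also keeps the first page of hits. So that it runs without a cluster, fake_search() and fake_scroll() are hypothetical stand-ins for Search(..., time_scroll = "5m") and scroll(); they are not part of the elastic package, and the 25 documents and page size of 10 are made up. The only change that matters is seeding out with the hits from the initial search instead of an empty list.

```r
# Hypothetical stand-ins for elastic::Search() and elastic::scroll(), so the
# loop can be run without an Elasticsearch cluster. They page through 25
# made-up documents, 10 at a time, and mimic the shape of a search response.
docs <- lapply(seq_len(25), function(i) list(`_id` = as.character(i)))
page_size <- 10

fake_page <- function(from) {
  idx <- seq_along(docs)
  idx <- idx[idx > from & idx <= from + page_size]
  list(
    `_scroll_id` = as.character(from + page_size),
    hits = list(total = length(docs), hits = docs[idx])
  )
}
fake_search <- function() fake_page(0)
fake_scroll <- function(scroll_id) fake_page(as.integer(scroll_id))

# Seed `out` with the hits from the initial search, then keep scrolling
# until a page comes back empty.
res <- fake_search()
out <- res$hits$hits
hits <- length(out)
while (hits != 0) {
  res <- fake_scroll(res$`_scroll_id`)
  hits <- length(res$hits$hits)
  if (hits > 0) {
    out <- c(out, res$hits$hits)
  }
}

# Number of collected hits; compare against res$hits$total.
length(out)
```

Against a real cluster, the same pattern would presumably apply with the two stand-ins swapped for the Search() and scroll() calls from the example above.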


Okay, I see this example is fixed in the new version. Is v0.9 already available?

sckott commented 5 years ago

@Jensxy sorry, v0.9 isn't there yet, you can follow progress on the v0.9 milestone though

you can install from github to get the latest though, so you don't have to wait for new cran version.

sckott commented 5 years ago

closing for now since this seems sorted in dev version