kickbox closed this issue 7 years ago
@kickbox Do you know about verbose errors? See the errors param in connect()
@sckott here is the result with verbose() and with errors="complete" in connect()
> res <- scroll(scroll_id = q$`_scroll_id`, config=c(progress(),verbose()), raw=T)
-> POST /_search/scroll?scroll=1m HTTP/1.1
-> Host: xxx
-> Authorization: Basic xxx
-> User-Agent: libcurl/7.51.0 r-curl/2.3 httr/1.2.1
-> Accept-Encoding: gzip, deflate
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Length: 236
->
>> $$28$$$$YxgBfq58WeYOJfMf67QZSPFNXxY=c2Nhbjs0OzQ2MTA3Nzo2cDUyZ3JBX1JRQ0NfYTJ4Ny1JX0h3OzU2MTQ1MTE6QXNjMHVuem5Sci1MS1VHVFRJb2ZuQTs1NjE0NTEyOkFzYzB1bnpuUnItTEtVR1RUSW9mbkE7MTU3NjI5NzpJR2JMVVVhZ1JfQzJtSThabjJ0MHRROzE7dG90YWxfaGl0czozNjExODs=
<- HTTP/1.1 404 Not Found
<- Content-Type: application/json; charset=UTF-8
<- Content-Length: 749
<-
|========================================================================================================| 100%
Error: 404 - error
ES stack trace:
_scroll_id: $$28$$$$JY0PW6q3N3iX_gBjE2UQCiB517c=c2NhbjswOzE7dG90YWxfaGl0czozNjExODs=
took: 5
timed_out: FALSE
_shards.total: 4
_shards.successful: 0
_shards.failed: 4
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [461077]
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [5614511]
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [5614512]
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [1576297]
hits.total: 36118
hits.max_score: 0
@kickbox this thread seems very relevant: https://discuss.elastic.co/t/searchcontextmissingexception-during-long-scroll-scan-operations/23775 - are you using the same scroll ID for each request? You need to pass the scroll ID returned by request 1 to the next request (request 2), and so on.
@sckott thanks. But I am not reusing the old scroll ID. The scroll ID for the second request is
"$$28$$$$7mxCMQEIJ2_YNqJuZWxCcoytpLc=c2Nhbjs0OzQ4NzcxNjo2cDUyZ3JBX1JRQ0NfYTJ4Ny1JX0h3OzU2NTA5OTY6QXNjMHVuem5Sci1MS1VHVFRJb2ZuQTs1NjUwOTk3OkFzYzB1bnpuUnItTEtVR1RUSW9mbkE7MzE4MzEzNzpjOEpiYUU1VlJiQ2tKemQxRGwxeTRBOzE7dG90YWxfaGl0czozNjIwNTs="
However, I think this doesn't match the scroll ID in the request sent from R, even though I passed the same value to scroll(). Please see the full session below; the scroll ID in the request seems to be different from the one above
_scroll_id: $$28$$$$CmZQSMcbrrZ5CE9SfuefyJjQ-wc=c2NhbjswOzE7dG90YWxfaGl0czozNjIwNTs=
> q$`_scroll_id`
[1] "$$28$$$$7mxCMQEIJ2_YNqJuZWxCcoytpLc=c2Nhbjs0OzQ4NzcxNjo2cDUyZ3JBX1JRQ0NfYTJ4Ny1JX0h3OzU2NTA5OTY6QXNjMHVuem5Sci1MS1VHVFRJb2ZuQTs1NjUwOTk3OkFzYzB1bnpuUnItTEtVR1RUSW9mbkE7MzE4MzEzNzpjOEpiYUU1VlJiQ2tKemQxRGwxeTRBOzE7dG90YWxfaGl0czozNjIwNTs="
> scrollId <- q$`_scroll_id`
> res <- scroll(scroll_id = scrollId, config=c(progress(),verbose()), raw=T)
-> POST /_search/scroll?scroll=1m HTTP/1.1
-> Host: xxx
-> Authorization: Basic xxx
-> User-Agent: libcurl/7.51.0 r-curl/2.3 httr/1.2.1
-> Accept-Encoding: gzip, deflate
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Length: 236
->
>> $$28$$$$7mxCMQEIJ2_YNqJuZWxCcoytpLc=c2Nhbjs0OzQ4NzcxNjo2cDUyZ3JBX1JRQ0NfYTJ4Ny1JX0h3OzU2NTA5OTY6QXNjMHVuem5Sci1MS1VHVFRJb2ZuQTs1NjUwOTk3OkFzYzB1bnpuUnItTEtVR1RUSW9mbkE7MzE4MzEzNzpjOEpiYUU1VlJiQ2tKemQxRGwxeTRBOzE7dG90YWxfaGl0czozNjIwNTs=
<- HTTP/1.1 404 Not Found
<- Content-Type: application/json; charset=UTF-8
<- Content-Length: 749
<-
|========================================================================================================| 100%
Error: 404 - error
ES stack trace:
_scroll_id: $$28$$$$CmZQSMcbrrZ5CE9SfuefyJjQ-wc=c2NhbjswOzE7dG90YWxfaGl0czozNjIwNTs=
took: 6
timed_out: FALSE
_shards.total: 4
_shards.successful: 0
_shards.failed: 4
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [487716]
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [5650996]
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [5650997]
_shards.failures.shard: -1
_shards.failures.reason.type: search_context_missing_exception
_shards.failures.reason.reason: No search context found for id [3183137]
hits.total: 36205
hits.max_score: 0
@sckott I think I have found the solution. I had to specify scroll_time not only on the initial Search() but also on each subsequent scroll() call. That fixes it. Thanks.
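For anyone landing here later, a minimal sketch of the loop shape this implies: every follow-up request sends the keep-alive again and uses the scroll ID from the most recent response. `scroll_all()` and `fetch_page()` are hypothetical names, not part of the elastic package; `fetch_page(scroll_id, keep_alive)` is assumed to wrap one scroll() call and return the parsed response. The stub below only stands in for a server so the sketch runs on its own.

```r
# Hypothetical helper (not part of the elastic package).
# fetch_page(scroll_id, keep_alive) is assumed to perform ONE scroll request
# and return a parsed response with `_scroll_id` and `hits$hits`.
scroll_all <- function(first_page, fetch_page, keep_alive = "1m") {
  pages <- list(first_page$hits$hits)
  scroll_id <- first_page$`_scroll_id`
  repeat {
    # The keep-alive is passed on every request, not only the first one.
    page <- fetch_page(scroll_id, keep_alive)
    if (length(page$hits$hits) == 0) break
    pages[[length(pages) + 1]] <- page$hits$hits
    # Always continue with the ID from the latest response.
    scroll_id <- page$`_scroll_id`
  }
  do.call(c, pages)
}

# Stub standing in for the server: three canned responses, the last one empty.
responses <- list(
  list(`_scroll_id` = "id2", hits = list(hits = list(1, 2))),
  list(`_scroll_id` = "id3", hits = list(hits = list(3))),
  list(`_scroll_id` = "id4", hits = list(hits = list()))
)
seen <- character()
fake_fetch <- function(scroll_id, keep_alive) {
  seen <<- c(seen, scroll_id)   # record which ID each request used
  responses[[length(seen)]]
}

first <- list(`_scroll_id` = "id1", hits = list(hits = list()))
all_hits <- scroll_all(first, fake_fetch, keep_alive = "5m")
```

With the real package, `fetch_page` would forward `keep_alive` through whatever scroll-time argument the installed version of scroll() accepts; that argument name is deliberately left out here because it is an assumption.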
Ah, so the scroll time isn't being carried over. Maybe we can carry it over somehow, but still allow the user to override it by setting a new scroll time when calling scroll(). Thoughts?
My thoughts :) That could be a way to go. But as a package design choice, I would suggest keeping the same defaults as the official Elasticsearch client API, to be consistent with your other defaults.
I fell into this problem by blindly following the example in your package's documentation. So maybe this behaviour could be mentioned explicitly in the "scroll" example. That would be one way to prevent it.
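The carry-over idea discussed above could look roughly like the sketch below: the first result is tagged with the scroll time that was used, and a follow-up call prefers an explicit value, then the carried-over one, then a fallback. Both function names and the "scroll_time" attribute are hypothetical and only illustrate the design; this is not how the package is known to work.

```r
# Hypothetical: tag the initial search result with the scroll time it used.
tag_scroll_time <- function(res, time) {
  attr(res, "scroll_time") <- time
  res
}

# Hypothetical: pick the scroll time for a follow-up request.
# Order of precedence: explicit argument, carried-over value, fallback.
resolve_scroll_time <- function(res, time = NULL, fallback = "1m") {
  if (!is.null(time)) return(time)
  carried <- attr(res, "scroll_time", exact = TRUE)
  if (!is.null(carried)) carried else fallback
}

first <- tag_scroll_time(list(`_scroll_id` = "abc"), "5m")

carried  <- resolve_scroll_time(first)          # uses the carried-over value
override <- resolve_scroll_time(first, "30s")   # explicit value wins
default  <- resolve_scroll_time(list())         # nothing carried, use fallback
```

Keeping the explicit argument first preserves the option to shorten or extend the keep-alive for a single request.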
The first scroll works well. I get the following error on the second run.
My code is adapted from the nice scrolling example in your documentation.
My question
Scroll does not return any object when an error occurs. If there is an error like this, I would like to check the return value and stop processing.
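One possible way to get a checkable return value, assuming the failure surfaces as an ordinary R error (the "Error: 404 - error" line in the output suggests it does): wrap the request in tryCatch() and return NULL on failure. `safe_page()` is a hypothetical helper, and the two stub functions only simulate a failing and a successful request.

```r
# Hypothetical wrapper: run one request and return NULL instead of raising.
# `request` is a zero-argument function that performs the scroll call.
safe_page <- function(request) {
  tryCatch(
    request(),
    error = function(e) {
      message("scroll failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stubs simulating a failed and a successful request.
failing <- function() stop("404 - error")
ok <- function() list(hits = list(hits = list(1)))

res <- safe_page(failing)
if (is.null(res)) {
  message("no result returned, stopping the scroll loop here")
}
```

In a loop, the `is.null(res)` check is the place to `break` out and stop processing.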