Closed Freemanlabs closed 3 years ago
First I get `RequestError: 400`... If I try it again, then I see a `ConnectionTimeoutError`.
Please try to increase the timeout for elasticsearch via:
`es_store = ElasticsearchDocumentStore(..., timeout=3000)`
Are you running this on Colab? Elasticsearch might be very slow there as Colab only provides a single CPU core...
I see. I will try that. Although, my Runtime type is GPU, meaning I am using a GPU on Colab...
Also, do you have any response to why I get this `RequestError: 400` initially?
> Although, my Runtime type is GPU
Elasticsearch cannot benefit from a GPU and will always run on CPU only.
> do you have any response to why I get this RequestError: 400...
Can you please provide more context / a script to reproduce this error? My first intuition would be that Elasticsearch is probably still starting up, or still busy indexing the added documents / embeddings.
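For what it's worth, one quick way to check whether Elasticsearch is reachable and still busy is its cluster-health endpoint. A minimal sketch, assuming the default localhost:9200 setup used in this thread:

```python
# Minimal sketch: poll Elasticsearch's cluster health endpoint to see
# whether the node is reachable and what state it is in.
# Assumes the default localhost:9200 setup used in this thread.
import requests

health = requests.get("http://localhost:9200/_cluster/health").json()
print(health["status"])                   # "green" / "yellow" / "red"
print(health["number_of_pending_tasks"])  # > 0 while the cluster is still busy
```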
This is the line causing issues:
`prediction = finder.get_answers(question=que["question"], top_k_retriever=10, top_k_reader=5)`
This is the error I get when I run the above line for the first time after starting and connecting to my server:
RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error')
Then after I get this error, I simply run the cell again, and then I get this:
ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))
I did as you directed (`timeout=3000`), still no luck.
There is a possibility that one of the following may cause it (https://github.com/elastic/elasticsearch/issues/8084): a `null` in the filter (this is unlikely, as the call to `get_answers` doesn't have any filter parameter). Could you please share end-to-end logs so we can debug further?
Logs: This is all I see (from Colab):
```
timeout                                   Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    420                 # Otherwise it looks like a bug in the code.
--> 421                 six.raise_from(e, None)
    422             except (SocketTimeout, BaseSSLError, SocketError) as e:

21 frames

timeout: timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError                          Traceback (most recent call last)
ReadTimeoutError: HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300)

During handling of the above exception, another exception occurred:

ConnectionTimeout                         Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py in perform_request(self, method, url, params, body, timeout, ignore, headers)
    255                 raise SSLError("N/A", str(e), e)
    256             if isinstance(e, ReadTimeoutError):
--> 257                 raise ConnectionTimeout("TIMEOUT", str(e), e)
    258             raise ConnectionError("N/A", str(e), e)
    259

ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))
```
@Freemanlabs Thank you for the update. Could you please expand these 21 frames and share the full stack trace?

> ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))

Also, in some code flow your timeout is still 300, not 3000.
Also be aware that Google Colab has a disk limit of 108 GB, of which only about 75 GB is available to the user. https://neptune.ai/blog/google-colab-dealing-with-files#:~:text=Also%2C%20Colab%20has%20a%20disk,like%20image%20or%20video%20data.
> Could you please expand these 21 frames and share the full stack trace?
Expanded frames:
```
/usr/local/lib/python3.6/dist-packages/urllib3/packages/six.py in raise_from(value, from_value)

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    415             try:
--> 416                 httplib_response = conn.getresponse()
    417             except BaseException as e:

/usr/lib/python3.6/http/client.py in getresponse(self)
   1372             try:
-> 1373                 response.begin()
   1374             except ConnectionError:

/usr/lib/python3.6/http/client.py in begin(self)
    310         while True:
--> 311             version, status, reason = self._read_status()
    312             if status != CONTINUE:

/usr/lib/python3.6/http/client.py in _read_status(self)
    271     def _read_status(self):
--> 272         line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
    273         if len(line) > _MAXLINE:

/usr/lib/python3.6/socket.py in readinto(self, b)
    585             try:
--> 586                 return self._sock.recv_into(b)
    587             except timeout:

timeout: timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError                          Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py in perform_request(self, method, url, params, body, timeout, ignore, headers)
    245             response = self.pool.urlopen(
--> 246                 method, url, body, retries=Retry(False), headers=request_headers, **kw
    247             )

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    719             retries = retries.increment(
--> 720                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    721             )

/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    375                 # Disabled, indicate to re-raise the error.
--> 376                 raise six.reraise(type(error), error, _stacktrace)
    377

/usr/local/lib/python3.6/dist-packages/urllib3/packages/six.py in reraise(tp, value, tb)
    734                 raise value.with_traceback(tb)
--> 735             raise value
    736         finally:

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    671                 headers=headers,
--> 672                 chunked=chunked,
    673             )

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    422             except (SocketTimeout, BaseSSLError, SocketError) as e:
--> 423                 self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
    424             raise

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _raise_timeout(self, err, url, timeout_value)
    330             raise ReadTimeoutError(
--> 331                 self, url, "Read timed out. (read timeout=%s)" % timeout_value
    332             )

ReadTimeoutError: HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300)

During handling of the above exception, another exception occurred:

ConnectionTimeout                         Traceback (most recent call last)
```
Another question I have: why is DPR giving issues? BM25 and TFIDF work okay.
Well, BM25 and TFIDF do not use a `script_score` query. But DPR needs `query_by_embedding`, and hence it uses the similarity function to compare document vectors.
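For illustration, this is roughly the shape of the request such an embedding query sends to Elasticsearch. A sketch only: the exact painless script depends on the Haystack version and the configured similarity, and the field name `embedding` and the placeholder vector are assumptions here:

```python
# Rough shape of an embedding (script_score) query against Elasticsearch.
# Sketch only: the exact script source depends on the Haystack version
# and similarity setting; 'embedding' as the field name is an assumption.
query_emb = [0.1] * 768  # stand-in for the DPR query embedding

query_body = {
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                # cosineSimilarity is a built-in painless function for dense_vector fields
                "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                "params": {"query_vector": query_emb},
            },
        }
    }
}
# The script runs against every candidate document, which is why this is far
# heavier than a plain BM25/TFIDF query and can run into the read timeout.
```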
Can you manually increase the timeout at this place https://github.com/deepset-ai/haystack/blob/master/haystack/document_store/elasticsearch.py#L574 and test with that code?
If increasing the timeout does not solve your issue either, then you can try DPR with the FAISS document store. Otherwise try to use a custom plugin, but in order to use it you would need to make a few changes to elasticsearch.py in haystack.
@tholor I think we need to add timeout customisation at the following place as well, instead of hardcoding it: https://github.com/deepset-ai/haystack/blob/master/haystack/document_store/elasticsearch.py#L574
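Roughly the kind of change meant here, as a sketch of an excerpt (not standalone code: `self`, `index`, and `body` come from the surrounding method; keeping the constructor's timeout around as `self.timeout` is an assumption):

```python
# Excerpt sketch from ElasticsearchDocumentStore.query_by_embedding()
# (haystack/document_store/elasticsearch.py, around the linked line).
#
# Before (roughly), the read timeout is hardcoded:
#     result = self.client.search(index=index, body=body, request_timeout=300)["hits"]["hits"]
#
# After (sketch): reuse a configurable value instead of the literal 300.
result = self.client.search(index=index, body=body, request_timeout=self.timeout)["hits"]["hits"]
```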
Thank you for your response...
> Can you manually increase the timeout at this place https://github.com/deepset-ai/haystack/blob/master/haystack/document_store/elasticsearch.py#L574 and test with that code?
I do not know how to locate this file to change it manually. Please assist.
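One way to locate the installed copy of that file from inside Colab, as a minimal sketch (assuming the pip-installed package layout):

```python
# Minimal sketch: locate the installed elasticsearch.py inside Colab.
import haystack.document_store.elasticsearch as es_ds

print(es_ds.__file__)
# e.g. /usr/local/lib/python3.6/dist-packages/haystack/document_store/elasticsearch.py
# You can then edit that file in place and restart the runtime so the
# change is picked up.
```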
> If increasing the timeout does not solve your issue either, then you can try DPR with the FAISS document store.
I am trying this FAISSDocumentStore. If I do `faiss_document_store.get_document_count()` I see 874 (which is the total number of documents I have). But when I pass it to the retriever like so, `dpr_retriever = DensePassageRetriever(document_store=faiss_document_store)`, and try to get answers from the finder, I get this INFO:

```
12/03/2020 08:26:05 - INFO - haystack.finder - Got 0 candidates from retriever
12/03/2020 08:26:05 - INFO - haystack.finder - Retriever did not return any documents. Skipping reader ...
```
When I inspect the retriever like so:
`dpr_retriever.retrieve(query="I would expect the remaining TFC protection to remain protected in both")`
all I see is an empty array:

```
Creating Embeddings: 100%|██████████| 1/1 [00:00<00:00, 4.84 Batches/s]
[]
```
Too much frustration as a first-time user of Haystack, I must say. My supervisor feels I am not doing anything, because what he should be getting is results, not issues.
Sorry to hear that you're having trouble. From what I see, the main issues were around your usage on Colab (mounting problem, connection timeout ...). We will do our best to simplify the experience there, but as mentioned above, Colab has some severe resource limitations when running heavy external services like Elasticsearch or FAISS.
> All I see is an empty array
Did you call `document_store.update_embeddings(retriever)` as described in this tutorial?
You can also jump on a quick call with one of our engineers if you need further help, or share your Colab notebook here.
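For reference, a minimal sketch of the intended FAISS + DPR flow, following the Haystack 0.x tutorials; `docs` stands in for your own 874 documents:

```python
# Minimal sketch of the FAISS + DPR flow from the Haystack 0.x tutorials.
from haystack.document_store.faiss import FAISSDocumentStore
from haystack.retriever.dense import DensePassageRetriever

# Placeholder documents; in the thread these are the 874 real ones.
docs = [{"text": "TFC protection remains in place for existing members ..."}]

document_store = FAISSDocumentStore()
document_store.write_documents(docs)

retriever = DensePassageRetriever(document_store=document_store)

# Without this step the FAISS index holds no vectors, so retrieve()
# returns an empty list -- matching the "Got 0 candidates" log above.
document_store.update_embeddings(retriever)

results = retriever.retrieve(query="I would expect the remaining TFC protection to remain protected in both")
```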
Thanks for pointing out that resource. I was practically following the Haystack documentation. All seems well now.
Great! Closing this now. Feel free to re-open if the problem comes up again...
With the same settings for Elasticsearch, I can successfully retrieve prediction answers with BM25 and TFIDF. However, when I try with DPR, I get a `ConnectionTimeoutError`. How do I resolve this?