opensearch-project / opensearch-py

Python Client for OpenSearch
https://opensearch.org/docs/latest/clients/python/
Apache License 2.0

ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='', port=443): Read timed out. (read timeout=10)) #723

Closed HGPai closed 7 months ago

HGPai commented 7 months ago

What is the bug?

I've been trying to replicate the solution in this repository on AWS: https://github.com/aws-samples/unlocking-the-potential-of-generative-ai-in-industrial-operations/tree/main?tab=readme-ov-file

While working with OpenSearch Serverless in the app_bedrock.py code (line 230), I encountered the above error.

How can one reproduce the bug?

    aoss_host = f"{os.path.basename(aoss_collection_arn)}.{region}.aoss.amazonaws.com"  # <Insert AOSS host here>

    client = OpenSearch(
        hosts=[{'host': aoss_host, 'port': 443}],
        http_auth=auth,
        use_ssl=False,
        verify_certs=False,
        pool_maxsize=20,
        connection_class=RequestsHttpConnection,
    )

What is the expected behavior?

The OpenSearch client should return the expected response to the query.

What is your host/environment?

AWS EC2

Do you have any screenshots?

image

Do you have any additional context?

saimedhi commented 7 months ago

I'll attempt to reproduce the issue. There is an example guide for OpenSearch Serverless usage that might help you.

dblock commented 7 months ago

I am 100% sure the client works with AOSS, here's a demo you can run: https://github.com/dblock/opensearch-python-client-demo. I would ensure that the host that ends up being used is correct, and that connectivity is established. Then maybe open an issue in https://github.com/aws-samples/unlocking-the-potential-of-generative-ai-in-industrial-operations/?
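
One way to check both points before involving the client at all is sketched below. It builds the host the same way the snippet in the report does and then tries a plain TCP connection. It assumes the collection ARN ends in `collection/<id>`; `aoss_host_from_arn` and `can_connect` are hypothetical helper names, and the ARN in the usage comment is a made-up placeholder.

```python
import os
import socket


def aoss_host_from_arn(collection_arn: str, region: str) -> str:
    """Derive the AOSS host from a collection ARN, as the reported snippet does."""
    collection_id = os.path.basename(collection_arn)
    if not collection_id or not region:
        # An empty piece would silently produce a host like ".aoss.amazonaws.com".
        raise ValueError("collection ARN or region is empty; host cannot be built")
    return f"{collection_id}.{region}.aoss.amazonaws.com"


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    if not host:
        raise ValueError("host is empty")
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage (placeholder ARN, not a real collection):
#   host = aoss_host_from_arn(
#       "arn:aws:aoss:us-east-1:123456789012:collection/abc123", "us-east-1")
#   print(host, can_connect(host))
```

If `can_connect` returns False, the problem is likely in networking (VPC endpoint, security group, DNS) rather than in the client.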

I'm going to close this, but feel free to ask more questions if you're stuck.

ksajan commented 6 months ago

@dblock Sorry for hijacking this thread, but I am also facing this issue. Whenever the load increases even a little, we receive this error. Please tell me if there is anything I can do to mitigate this or improve the client code.

    self.client = OpenSearch(
        hosts=[{"host": host, "port": port}],
        http_auth=(username, password),
        use_ssl=True,
        verify_certs=False,
        ssl_assert_hostname=False,
        ssl_show_warn=False,
    )

ER-TECH-0003 Error in searching in ES ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='vpc-norge-production-df4b7ilqgpt2ykdkzu25p4fc5q.ap-south-1.es.amazonaws.com', port=443): Read timed out. (read timeout=10))

Traceback (most recent call last):
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/http/client.py", line 1423, in getresponse
    response.begin()
  File "/usr/local/lib/python3.12/http/client.py", line 331, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/http/client.py", line 292, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/socket.py", line 707, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/ssl.py", line 1252, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/ssl.py", line 1104, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dcx/.local/lib/python3.12/site-packages/opensearchpy/connection/http_urllib3.py", line 249, in perform_request
    response = self.pool.urlopen(
               ^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/util/retry.py", line 525, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 469, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/home/dcx/.local/lib/python3.12/site-packages/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='vpc-norge-production-df4b7ilqgpt2ykdkzu25p4fc5q.ap-south-1.es.amazonaws.com', port=443): Read timed out. (read timeout=10)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dcx/app/api/database/elasticsearch/client.py", line 188, in search
    response = self.client.search(index=collection_name, body=query_body)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/opensearchpy/client/utils.py", line 177, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/opensearchpy/client/__init__.py", line 1544, in search
    return self.transport.perform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/ddtrace/contrib/elasticsearch/patch.py", line 244, in _perform_request
    result = next(coro)
             ^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/ddtrace/contrib/elasticsearch/patch.py", line 204, in _perform_request
    result = yield func(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/opensearchpy/transport.py", line 407, in perform_request
    raise e
  File "/home/dcx/.local/lib/python3.12/site-packages/opensearchpy/transport.py", line 368, in perform_request
    status, headers_response, data = connection.perform_request(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dcx/.local/lib/python3.12/site-packages/opensearchpy/connection/http_urllib3.py", line 263, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
opensearchpy.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='vpc-norge-production-df4b7ilqgpt2ykdkzu25p4fc5q.ap-south-1.es.amazonaws.com', port=443): Read timed out. (read timeout=10))

dblock commented 6 months ago

@ksajan is the request taking longer than 10 seconds? You can pass timeout=60 to increase it.
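
As a rough sketch of where that setting goes: `timeout` can be passed when constructing the client, and an individual call can override it with `request_timeout`. `max_retries` and `retry_on_timeout` are optional transport settings that may help with occasional slow responses. The values below are illustrative assumptions, not recommendations, and `client_kwargs` and `make_client` are hypothetical helper names.

```python
def client_kwargs(host, port, username, password, timeout=60):
    """Build constructor arguments for opensearchpy.OpenSearch."""
    return {
        "hosts": [{"host": host, "port": port}],
        "http_auth": (username, password),
        "use_ssl": True,
        "timeout": timeout,        # read timeout in seconds (the traceback shows 10 in effect)
        "max_retries": 3,          # assumption: retry a few times before giving up
        "retry_on_timeout": True,  # retry a request that timed out
    }


def make_client(**kwargs):
    # Imported lazily so client_kwargs can be used without the package installed.
    from opensearchpy import OpenSearch

    return OpenSearch(**kwargs)


# Usage (needs a reachable cluster; "my-index" is a placeholder):
#   client = make_client(**client_kwargs(host, port, username, password))
#   client.search(index="my-index", body=query_body, request_timeout=30)
```

Raising the timeout only hides slow requests, so it is worth pairing with a look at why they are slow.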

ksajan commented 6 months ago

@dblock Is there a way to improve the performance so that the request doesn't take 10 seconds or longer?

dblock commented 6 months ago

@ksajan Hopefully :) I think the first thing to do is to narrow it down to the root cause. What part of the request is taking this long? Try making the request with curl, does it take longer than 10 seconds?
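
The same comparison can be made from Python. A minimal sketch: time the call on the client side and compare it with the `took` field in the search response, which reports the milliseconds the cluster spent on the search. Roughly, a large gap between the two points at the network or response transfer, while a large `took` points at the query itself. `timed_search` is a hypothetical helper, and `search_fn` stands for something like `client.search`.

```python
import time


def timed_search(search_fn, **search_kwargs):
    """Call search_fn and return (response, client_ms, server_ms)."""
    start = time.perf_counter()
    response = search_fn(**search_kwargs)
    client_ms = (time.perf_counter() - start) * 1000.0
    # "took" is reported by the cluster in milliseconds; None if the key is missing.
    server_ms = response.get("took")
    return response, client_ms, server_ms


# Usage (needs a reachable cluster):
#   response, client_ms, server_ms = timed_search(
#       client.search, index=collection_name, body=query_body)
#   print(f"client: {client_ms:.0f} ms, server took: {server_ms} ms")
```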