toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License

Read timed out error #502

Open nsupegemini opened 8 months ago

nsupegemini commented 8 months ago

PGSync version: 2.5.0

Postgres version: 12.10

Elasticsearch version: 7.17.6

Redis version: Redis server v=7.0.11

Python version: Python 3.9.5

Problem Description: It looks like ELASTICSEARCH_TIMEOUT is not used in the bulk update: https://github.com/toluaina/pgsync/blob/95116702c4b314d8b97696ef857cfe116241e236/pgsync/search_client.py#L188

Error Message (if any):

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.9/http/client.py", line 1345, in getresponse
    response.begin()
  File "/usr/local/lib/python3.9/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.9/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/lib/python3.9/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/lib/python3.9/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/http_urllib3.py", line 251, in perform_request
    response = self.pool.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 525, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.9/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 451, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 340, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='vpc-search-v2-dev-c7n7texcie5q5kscz6axrv3u6q.us-east-1.es.amazonaws.com', port='443'): Read timed out. (read timeout=10)
2023-11-03 15:42:20.387:ERROR:pgsync.search_client: Exception ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='vpc-search-v2-dev-c7n7texcie5q5kscz6axrv3u6q.us-east-1.es.amazonaws.com', port='443'): Read timed out. (read timeout=10))
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.9/http/client.py", line 1345, in getresponse
    response.begin()
  File "/usr/local/lib/python3.9/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.9/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
Exception in poll_redis() for thread Thread-242: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='vpc-search-v2-dev-c7n7texcie5q5kscz6axrv3u6q.us-east-1.es.amazonaws.com', port='443'): Read timed out. (read timeout=10))
Exiting...
  File "/usr/local/lib/python3.9/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/lib/python3.9/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/http_urllib3.py", line 251, in perform_request
    response = self.pool.urlopen(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 525, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.9/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 451, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 340, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='vpc-search-v2-dev-c7n7texcie5q5kscz6axrv3u6q.us-east-1.es.amazonaws.com', port='443'): Read timed out. (read timeout=10)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pgsync/search_client.py", line 133, in bulk
    self._bulk(
  File "/usr/local/lib/python3.9/site-packages/pgsync/search_client.py", line 188, in _bulk
    for _ in self.parallel_bulk(
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 472, in parallel_bulk
    for result in pool.imap(
  File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 870, in next
    raise value
  File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 144, in _helper_reraises_exception
    raise ex
  File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 388, in _guarded_task_generation
    for i, x in enumerate(iterable):
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 155, in _chunk_actions
    for action, data in actions:
  File "/usr/local/lib/python3.9/site-packages/pgsync/sync.py", line 938, in _payloads
    yield from self.sync(
  File "/usr/local/lib/python3.9/site-packages/pgsync/sync.py", line 1011, in sync
    doc = next(self._plugins.transform([doc]))
  File "/usr/local/lib/python3.9/site-packages/pgsync/plugin.py", line 77, in transform
    doc["_source"] = plugin.transform(
  File "/pgsync/plugins/products_index_update_product_creators_attribute_plugin.py", line 32, in transform
    contract_obj_id = self.get_old_contract_obj_id(doc)
  File "/pgsync/plugins/products_index_update_product_creators_attribute_plugin.py", line 91, in get_old_contract_obj_id
    response = s.execute().to_dict()
  File "/usr/local/lib/python3.9/site-packages/elasticsearch_dsl/search.py", line 715, in execute
    self, es.search(index=self._index, body=self.to_dict(), **self._params)
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/client/utils.py", line 168, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/client/__init__.py", line 1670, in search
    return self.transport.perform_request(
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/transport.py", line 415, in perform_request
    raise e
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/transport.py", line 381, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/http_urllib3.py", line 265, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='vpc-search-v2-dev-c7n7texcie5q5kscz6axrv3u6q.us-east-1.es.amazonaws.com', port='443'): Read timed out. (read timeout=10))
NiraliSupe commented 8 months ago

After investigating, I found the issue was elsewhere. PGSync is passing the timeout properly. Please feel free to close this.

lukeajtodd commented 7 months ago

@NiraliSupe Sorry to dig this back up, but I figured I would ask how you resolved this before posting another issue. I seem to be having a very similar problem.

nsupegemini commented 7 months ago

In one of my plugins, I was querying Elasticsearch directly to fetch an existing record, and that query had no timeout set. I went through the pgsync code, and the timeout is being passed to the bulk update function. I needed to set the environment variable ELASTICSEARCH_TIMEOUT. Hope this helps.
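
For anyone hitting the same thing, here is a minimal sketch of that kind of fix. It assumes the plugin builds its query with elasticsearch_dsl (as the traceback suggests) and that ELASTICSEARCH_TIMEOUT holds a number of seconds. The helper names `plugin_request_timeout` and `with_timeout` are hypothetical and not part of pgsync. The fallback of 10 seconds is taken from the `read timeout=10` in the traceback.

```python
import os

# Assumption: fall back to 10 seconds, the value shown as "read timeout=10"
# in the traceback above.
DEFAULT_TIMEOUT_SECONDS = 10.0


def plugin_request_timeout(env=os.environ, default=DEFAULT_TIMEOUT_SECONDS):
    """Return the timeout in seconds for the plugin's own queries.

    Reads ELASTICSEARCH_TIMEOUT from `env`; falls back to `default` when the
    variable is unset, empty, non-numeric, or not positive.
    """
    raw = env.get("ELASTICSEARCH_TIMEOUT")
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def with_timeout(search, timeout=None):
    """Attach a per-request timeout to an elasticsearch_dsl Search object.

    `search` only needs a `.params(**kwargs)` method that returns the updated
    search, which is how elasticsearch_dsl's Search behaves.
    """
    if timeout is None:
        timeout = plugin_request_timeout()
    return search.params(request_timeout=timeout)
```

Inside the plugin, the query that timed out (`s.execute()` in the traceback) would then become something like `with_timeout(s).execute()`, so the direct query uses the same timeout as the rest of the sync.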