DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

LimitedTimeoutException during `post_deploy`, data.terra.bio read timed out #5766

Open dsotirho-ucsc opened 10 months ago

dsotirho-ucsc commented 10 months ago

(possible dup of https://github.com/DataBiosphere/azul/issues/5227)

Deploy job 50074 failed during `post_deploy` on GitLab prod:

…
2023-12-07 01:15:39,164 WARNING ThreadPoolExecutor-0_0 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=2, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/9310d6e4-b692-4446-8fe9-ae115f5064a8
2023-12-07 01:15:45,315 WARNING ThreadPoolExecutor-0_2 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=2, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/f6819b48-e41b-494a-badb-b540beec91d3
2023-12-07 01:15:53,915 WARNING ThreadPoolExecutor-0_5 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=1, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/760d16c2-47a7-435f-9605-d56d01021c2c
2023-12-07 01:15:54,311 WARNING ThreadPoolExecutor-0_1 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=2, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/8ce6c5c2-8818-4a1d-847a-a1d332f4aee9
2023-12-07 01:15:54,339 WARNING ThreadPoolExecutor-0_6 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=1, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/f1a6793e-36f0-431e-b22e-4841d4ac3a0f
2023-12-07 01:15:54,744 WARNING ThreadPoolExecutor-0_7 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=1, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/c70bbcab-e2d4-4b5e-9c48-1774ab7e35c3
…
Traceback (most recent call last):
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/http/client.py", line 1378, in getresponse
    response.begin()
  File "/usr/local/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/http/client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/socket.py", line 706, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/ssl.py", line 1311, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/ssl.py", line 1167, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 469, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/ucsc/azul/src/azul/http.py", line 166, in urlopen
    return super().urlopen(method, url, retries=retry, timeout=timeout, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/http.py", line 42, in urlopen
    return self._inner.urlopen(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/google/auth/transport/urllib3.py", line 385, in urlopen
    response = self.http.urlopen(
               ^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/http.py", line 42, in urlopen
    return self._inner.urlopen(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/http.py", line 68, in urlopen
    response = super().urlopen(method, url, body=body, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/http.py", line 42, in urlopen
    return self._inner.urlopen(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/poolmanager.py", line 376, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 827, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 827, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 827, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='data.terra.bio', port=443): Max retries exceeded with url: /api/repository/v1/snapshots/760d16c2-47a7-435f-9605-d56d01021c2c (Caused by ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)"))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/ucsc/azul/scripts/post_deploy_tdr.py", line 101, in <module>
    main()
  File "/builds/ucsc/azul/scripts/post_deploy_tdr.py", line 96, in main
    verify_sources()
  File "/builds/ucsc/azul/scripts/post_deploy_tdr.py", line 61, in verify_sources
    raise e
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/scripts/post_deploy_tdr.py", line 65, in verify_source
    source = tdr.lookup_source(source_spec)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/terra.py", line 415, in lookup_source
    source = self._lookup_source(source_spec)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/terra.py", line 444, in _lookup_source
    return self._retrieve_source(SourceRef(id=source_id, spec=source))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/terra.py", line 428, in _retrieve_source
    response = self._request('GET', endpoint)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/terra.py", line 334, in _request
    response = self._http_client.request(method,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/request.py", line 77, in request
    return self.request_encode_url(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/build/.venv/lib/python3.11/site-packages/urllib3/request.py", line 99, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builds/ucsc/azul/src/azul/http.py", line 168, in urlopen
    raise LimitedTimeoutException(url, timeout)
azul.http.LimitedTimeoutException: No response from https://data.terra.bio/api/repository/v1/snapshots/760d16c2-47a7-435f-9605-d56d01021c2c within 20 seconds
make: *** [Makefile:103: auto_deploy] Error 1
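
For triage, the warning lines above can be summarized per snapshot. The following is a minimal sketch, not part of Azul: the regular expression is derived only from the log lines quoted in this issue, the function name `remaining_read_retries` is hypothetical, and it assumes that the `read=` field printed by `LimitedRetry` is the remaining read-retry budget, as it is for urllib3's `Retry`.

```python
import re

# Pattern built solely from the warning lines quoted in this issue; other
# log formats are not handled.
RETRY_LINE = re.compile(
    r"Retrying \(LimitedRetry\(.*?\bread=(?P<read>\d+)\b.*?\)\)"
    r".*?: /api/repository/v1/snapshots/(?P<snapshot>[0-9a-f-]{36})\s*$"
)

# Two of the warning lines from the job log, copied verbatim.
SAMPLE_LOG = """\
2023-12-07 01:15:39,164 WARNING ThreadPoolExecutor-0_0 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=2, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/9310d6e4-b692-4446-8fe9-ae115f5064a8
2023-12-07 01:15:53,915 WARNING ThreadPoolExecutor-0_5 urllib3.connectionpool: Retrying (LimitedRetry(total=None, connect=2, read=1, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/760d16c2-47a7-435f-9605-d56d01021c2c
"""


def remaining_read_retries(log_lines):
    """Map each snapshot ID to the lowest `read` counter logged for it."""
    lowest = {}
    for line in log_lines:
        match = RETRY_LINE.search(line)
        if match is None:
            continue  # not a retry warning (e.g. a traceback line)
        snapshot = match['snapshot']
        read = int(match['read'])
        lowest[snapshot] = min(read, lowest.get(snapshot, read))
    return lowest


if __name__ == '__main__':
    for snapshot, read in remaining_read_retries(SAMPLE_LOG.splitlines()).items():
        print(snapshot, 'read retries left:', read)
```

Run against the full job log, the snapshots with the lowest remaining counter are the ones closest to raising `MaxRetryError`, such as `760d16c2-47a7-435f-9605-d56d01021c2c` in the traceback above.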

achave11-ucsc commented 10 months ago

This happened again in anvilprod (https://gitlab.prod.anvil.gi.ucsc.edu/ucsc/azul/-/jobs/17765#L1174) and hammerbox (https://gitlab.prod.anvil.gi.ucsc.edu/ucsc/azul/-/jobs/17775#L1520).

achave11-ucsc commented 10 months ago

This also happened during a prod deploy job: https://gitlab.azul.data.humancellatlas.org/ucsc/azul/-/jobs/50439#L1408

achave11-ucsc commented 10 months ago

This happened again during an anvilbox deploy job: https://gitlab.anvil.gi.ucsc.edu/ucsc/azul/-/jobs/30214#L1006
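
Since the timeout recurs across deployments, one possible mitigation is to retry each per-source check with a growing delay and to collect failures rather than abort on the first one. The sketch below only illustrates that idea and is not how `post_deploy_tdr.py` is implemented. The names `call_with_backoff`, `verify_all` and `verify_one` are hypothetical, and `retry_on` defaults to the built-in `TimeoutError` as a stand-in; real code would need to pass whichever exception the client actually raises.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_with_backoff(fn, *args, attempts=3, base_delay=1.0,
                      retry_on=(TimeoutError,), sleep=time.sleep):
    """Call fn(*args), retrying on `retry_on` with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn(*args)
        except retry_on:
            if attempt == attempts - 1:
                raise  # retry budget exhausted, propagate the last error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... base_delay


def verify_all(verify_one, sources, max_workers=8, **backoff):
    """Run verify_one for every source; return {source: exception} for failures."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            source: executor.submit(call_with_backoff, verify_one, source, **backoff)
            for source in sources
        }
    # Leaving the `with` block waits for all submitted tasks to finish.
    failures = {}
    for source, future in futures.items():
        error = future.exception()
        if error is not None:
            failures[source] = error
    return failures
```

Returning the failures as a mapping would let the caller report every snapshot that timed out in one run, at the cost of a longer job when the remote service is slow.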