DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Deindexing large snapshots with reindex.py causes a TimeoutError #6291

Open achave11-ucsc opened 5 months ago

achave11-ucsc commented 5 months ago

Deleting snapshot T2T_CHRY (currently the largest, with 309,979 sub-graphs) in anvilprod took longer than 60s to execute, causing an Elasticsearch client timeout.

Running …

❯ python scripts/reindex.py --deindex --catalogs anvil6 anvil6-it --sources 'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'

… outputted:

2024-05-24 08:12:06,585   DEBUG MainThread __main__: Source glob 'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3' matched sources ['tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'] in catalog 'anvil6'
2024-05-24 08:12:06,587   DEBUG MainThread __main__: Source glob 'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3' matched sources ['tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'] in catalog 'anvil6-it'
2024-05-24 08:12:06,593    INFO MainThread botocore.credentials: Found credentials in shared credentials file: ~/.aws/credentials
2024-05-24 08:12:06,626    INFO MainThread azul.deployment: Allocated new Boto3 client for 'secretsmanager' with ID 4416954896
2024-05-24 08:12:07,451    INFO MainThread azul.terra: Making GET request to 'https://data.terra.bio/api/repository/v1/snapshots?filter=ANVIL_T2T_CHRY_20240301_ANV5_202403040508&limit=2'
2024-05-24 08:12:07,451   DEBUG MainThread azul.terra: … without request body
2024-05-24 08:12:08,434    INFO MainThread azul.terra: Got 200 response after 0.983s from GET to https://data.terra.bio/api/repository/v1/snapshots?filter=ANVIL_T2T_CHRY_20240301_ANV5_202403040508&limit=2
2024-05-24 08:12:08,434   DEBUG MainThread azul.terra: … with response headers HTTPHeaderDict({'Date': 'Fri, 24 May 2024 15:12:08 GMT', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'Access-Control-Allow-Headers': 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization,Accept,Referer,X-App-Id,Origin', 'Access-Control-Allow-Methods': 'GET,POST,DELETE,PUT,PATCH,OPTIONS,HEAD', 'X-Content-Type-Options': 'nosniff', 'Strict-Transport-Security': 'max-age=31536000;includeSubDomains', 'Cache-Control': 'no-cache,no-store,must-revalidate', 'X-Request-ID': 'aErjmGxe', 'Content-Type': 'application/json', 'Content-Length': '891', 'Vary': 'Accept-Encoding,Origin', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
2024-05-24 08:12:08,434   DEBUG MainThread azul.terra: … with response body b'{"total":1881,"filteredTotal":1,"items":[{"id":"f4accfc6-d9e4-49b1-a590-6a580b4d305f","name":"ANVIL_T2T_CHRY_20240301_ANV5_20...'
2024-05-24 08:12:08,435    INFO MainThread azul.terra: Making GET request to 'https://data.terra.bio/api/repository/v1/snapshots/f4accfc6-d9e4-49b1-a590-6a580b4d305f'
2024-05-24 08:12:08,435   DEBUG MainThread azul.terra: … without request body
2024-05-24 08:12:09,363    INFO MainThread azul.terra: Got 200 response after 0.927s from GET to https://data.terra.bio/api/repository/v1/snapshots/f4accfc6-d9e4-49b1-a590-6a580b4d305f
2024-05-24 08:12:09,363   DEBUG MainThread azul.terra: … with response headers HTTPHeaderDict({'Date': 'Fri, 24 May 2024 15:12:08 GMT', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'Access-Control-Allow-Headers': 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization,Accept,Referer,X-App-Id,Origin', 'Access-Control-Allow-Methods': 'GET,POST,DELETE,PUT,PATCH,OPTIONS,HEAD', 'X-Content-Type-Options': 'nosniff', 'Strict-Transport-Security': 'max-age=31536000;includeSubDomains', 'Cache-Control': 'no-cache,no-store,must-revalidate', 'X-Request-ID': 'pozGvk8g', 'Content-Type': 'application/json', 'Content-Length': '37926', 'Vary': 'Accept-Encoding,Origin', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
2024-05-24 08:12:09,364   DEBUG MainThread azul.terra: … with response body b'{"id":"f4accfc6-d9e4-49b1-a590-6a580b4d305f","name":"ANVIL_T2T_CHRY_20240301_ANV5_202403040508","description":"Full view snap...'
2024-05-24 08:12:09,381    INFO MainThread azul.deployment: Allocated new Boto3 client for 'es' with ID 4425261456
2024-05-24 08:12:09,943   DEBUG MainThread azul.es: Creating ES client [vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443]
2024-05-24 08:12:09,951    INFO MainThread azul.deployment: Allocated new Boto3 client for 'sts' with ID 4425729680
2024-05-24 08:12:09,961    INFO MainThread botocore.credentials: Found credentials in environment variables.
2024-05-24 08:12:09,962    INFO MainThread azul.azulclient: Deindexing sources {'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'} from catalog 'anvil6'
2024-05-24 08:12:09,962   DEBUG MainThread azul.azulclient: Using query: {'query': {'bool': {'should': [{'terms': {'sources.id.keyword': ['f4accfc6-d9e4-49b1-a590-6a580b4d305f']}}, {'terms': {'source.id.keyword': ['f4accfc6-d9e4-49b1-a590-6a580b4d305f']}}]}}}
2024-05-24 08:12:09,962    INFO MainThread elasticsearch: Making POST request to https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil6_activities,azul_v2_anvilprod_anvil6_activities_aggregate,azul_v2_anvilprod_anvil6_biosamples,azul_v2_anvilprod_anvil6_biosamples_aggregate,azul_v2_anvilprod_anvil6_bundles,azul_v2_anvilprod_anvil6_bundles_aggregate,azul_v2_anvilprod_anvil6_datasets,azul_v2_anvilprod_anvil6_datasets_aggregate,azul_v2_anvilprod_anvil6_diagnoses,azul_v2_anvilprod_anvil6_diagnoses_aggregate,azul_v2_anvilprod_anvil6_donors,azul_v2_anvilprod_anvil6_donors_aggregate,azul_v2_anvilprod_anvil6_files,azul_v2_anvilprod_anvil6_files_aggregate,azul_v2_anvilprod_anvil6_replica/_delete_by_query?slices=auto
2024-05-24 08:12:09,962    INFO MainThread elasticsearch: … with request body b'{"query":{"bool":{"should":[{"terms":{"sources.id.keyword":["f4accfc6-d9e4-49b1-a590-6a580b4d305f"]}},{"terms":{"source.id.ke...'
2024-05-24 08:13:10,328 WARNING MainThread elasticsearch: Got no response after 60.366s from POST to https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil6_activities,azul_v2_anvilprod_anvil6_activities_aggregate,azul_v2_anvilprod_anvil6_biosamples,azul_v2_anvilprod_anvil6_biosamples_aggregate,azul_v2_anvilprod_anvil6_bundles,azul_v2_anvilprod_anvil6_bundles_aggregate,azul_v2_anvilprod_anvil6_datasets,azul_v2_anvilprod_anvil6_datasets_aggregate,azul_v2_anvilprod_anvil6_diagnoses,azul_v2_anvilprod_anvil6_diagnoses_aggregate,azul_v2_anvilprod_anvil6_donors,azul_v2_anvilprod_anvil6_donors_aggregate,azul_v2_anvilprod_anvil6_files,azul_v2_anvilprod_anvil6_files_aggregate,azul_v2_anvilprod_anvil6_replica/_delete_by_query?slices=auto
Traceback (most recent call last):
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/http/client.py", line 1395, in getresponse
    response.begin()
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/http/client.py", line 325, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/http/client.py", line 286, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/socket.py", line 706, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/ssl.py", line 1314, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/ssl.py", line 1166, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 255, in perform_request
    response = self.pool.urlopen(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/src/azul/es.py", line 198, in urlopen
    return self._inner.urlopen(method, url, body, headers=request.headers, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 525, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 469, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com', port=443): Read timed out. (read timeout=60)
2024-05-24 08:13:10,341 WARNING MainThread elasticsearch: … without response body
Traceback (most recent call last):
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 462, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/http/client.py", line 1395, in getresponse
    response.begin()
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/http/client.py", line 325, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/http/client.py", line 286, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/socket.py", line 706, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/ssl.py", line 1314, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/.pyenv/versions/3.11.9/lib/python3.11/ssl.py", line 1166, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 255, in perform_request
    response = self.pool.urlopen(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/src/azul/es.py", line 198, in urlopen
    return self._inner.urlopen(method, url, body, headers=request.headers, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 525, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/packages/six.py", line 770, in reraise
    raise value
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 469, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 358, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com', port=443): Read timed out. (read timeout=60)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/achave11/Pycharm/Azul/azul.stable/scripts/reindex.py", line 206, in <module>
    main(sys.argv[1:])
  File "/Users/achave11/Pycharm/Azul/azul.stable/scripts/reindex.py", line 170, in main
    azul.deindex(catalog, sources)
  File "/Users/achave11/Pycharm/Azul/azul.stable/src/azul/azulclient.py", line 441, in deindex
    response = es_client.delete_by_query(index=indices, body=query, slices='auto')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 347, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 738, in delete_by_query
    return self.transport.perform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 466, in perform_request
    raise e
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 427, in perform_request
    status, headers_response, data = connection.perform_request(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/src/azul/es.py", line 78, in perform_request
    return super().perform_request(method, url, params, body, timeout, ignore, headers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 279, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com', port=443): Read timed out. (read timeout=60))

Retrying this command shortly after the first run returned 409 responses for each of the indices in Elasticsearch:

2024-05-24 08:13:51,015    INFO MainThread elasticsearch: Making POST request to https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil6_activities,azul_v2_anvilprod_anvil6_activities_aggregate,azul_v2_anvilprod_anvil6_biosamples,azul_v2_anvilprod_anvil6_biosamples_aggregate,azul_v2_anvilprod_anvil6_bundles,azul_v2_anvilprod_anvil6_bundles_aggregate,azul_v2_anvilprod_anvil6_datasets,azul_v2_anvilprod_anvil6_datasets_aggregate,azul_v2_anvilprod_anvil6_diagnoses,azul_v2_anvilprod_anvil6_diagnoses_aggregate,azul_v2_anvilprod_anvil6_donors,azul_v2_anvilprod_anvil6_donors_aggregate,azul_v2_anvilprod_anvil6_files,azul_v2_anvilprod_anvil6_files_aggregate,azul_v2_anvilprod_anvil6_replica/_delete_by_query?slices=auto
2024-05-24 08:13:51,015    INFO MainThread elasticsearch: … with request body b'{"query":{"bool":{"should":[{"terms":{"sources.id.keyword":["f4accfc6-d9e4-49b1-a590-6a580b4d305f"]}},{"terms":{"source.id.ke...'
2024-05-24 08:13:52,234 WARNING MainThread elasticsearch: Got 409 response after 1.218s from POST to https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil6_activities,azul_v2_anvilprod_anvil6_activities_aggregate,azul_v2_anvilprod_anvil6_biosamples,azul_v2_anvilprod_anvil6_biosamples_aggregate,azul_v2_anvilprod_anvil6_bundles,azul_v2_anvilprod_anvil6_bundles_aggregate,azul_v2_anvilprod_anvil6_datasets,azul_v2_anvilprod_anvil6_datasets_aggregate,azul_v2_anvilprod_anvil6_diagnoses,azul_v2_anvilprod_anvil6_diagnoses_aggregate,azul_v2_anvilprod_anvil6_donors,azul_v2_anvilprod_anvil6_donors_aggregate,azul_v2_anvilprod_anvil6_files,azul_v2_anvilprod_anvil6_files_aggregate,azul_v2_anvilprod_anvil6_replica/_delete_by_query?slices=auto
2024-05-24 08:13:52,235 WARNING MainThread elasticsearch: … with response body '{"took":261,"timed_out":false,"total":215899,"deleted":0,"batches":1,"version_conflicts":1000,"noops":0,"retries":{"bulk":0,"se…'
Traceback (most recent call last):
  File "/Users/achave11/Pycharm/Azul/azul.stable/scripts/reindex.py", line 206, in <module>
    main(sys.argv[1:])
  File "/Users/achave11/Pycharm/Azul/azul.stable/scripts/reindex.py", line 170, in main
    azul.deindex(catalog, sources)
  File "/Users/achave11/Pycharm/Azul/azul.stable/src/azul/azulclient.py", line 441, in deindex
    response = es_client.delete_by_query(index=indices, body=query, slices='auto')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 347, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 738, in delete_by_query
    return self.transport.perform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 466, in perform_request
    raise e
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 427, in perform_request
    status, headers_response, data = connection.perform_request(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/src/azul/es.py", line 78, in perform_request
    return super().perform_request(method, url, params, body, timeout, ignore, headers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 291, in perform_request
    self._raise_error(response.status, raw_data)
  File "/Users/achave11/Pycharm/Azul/azul.stable/.venv/lib/python3.11/site-packages/elasticsearch/connection/base.py", line 328, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.ConflictError: ConflictError(409, '{"took":261,"timed_out":false,"total":215899,"deleted":0,"batches":1,"version_conflicts":1000,"noops":0,"retries":{"bulk":0,"search":0},"throttled_millis":0,"requests_per_second":-1.0,"throttled_until_millis":0,"failures":[{"index":"azul_v2_anvilprod_anvil6_files_aggregate","type":"_doc","id":"60b15ba7-1ad7-40f9-9ef9-e79ec57310d6","cause":{"type":"version_conflict_engine_exception","reason":"[60b15ba7-1ad7-40f9-9ef9-e79ec57310d6]: version conflict, required seqNo [2029195], primary term [1]. but no document was found","index_uuid":"cbM2JoAUSUa6jHuNA5mAmA","shard":"0","index":"azul_v2_anvilprod_anvil6_files_aggregate"},"status":409},{"index":"azul_v2_anvilprod_anvil6_files_aggregate","type":"_doc","id":"5c20fd4d-5e3d-46e7-9aed-cf925159e36a","cause":{"type":"version_conflict_engine_exception","reason":"[5c20fd4d-5e3d-46e7-9aed-cf925159e36a]: version conflict, required seqNo [2029196], primary term [1]. but no document was found","index_uuid":"cbM2JoAUSUa6jHuNA5mAmA","shard":"0","index":"azul_v2_anvilprod_anvil6_files_aggregate"},"status":409},{"index":"azul_v2_anvilprod_anvil6_files_aggregate","type":"_doc","id":"60b3a527-cd46-4dbf-9bd6-d85549754369","cause":{"type":"version_conflict_engine_exception","reason":"[60b3a527-cd46-4dbf-9bd6-d85549754369]: version conflict, required seqNo [2029197], primary term [1]. but no document was found","index_uuid":"cbM2JoAUSUa6jHuNA5mAmA","shard":"0","index":"azul_v2_anvilprod_anvil6_files_aggregate"},"status":409},{"index":"azul_v2_anvilprod_anvil6_files_aggregate","type":"_doc","id":"5ee75279-6373-4ee0-b148-b39baa8b40b5","cause":{"type":"version_conflict_engine_exception","reason":"[5ee75279-6373-4ee0-b148-b39baa8b40b5]: version conflict, required seqNo [2029198], primary term [1]. 
but no document was found","index_uuid":"cbM2JoAUSUa6jHuNA5mAmA","shard":"0","index":"azul_v2_anvilprod_anvil6_files_aggregate"},"status":409},{"index":"azul_v2_anvilprod_anvil6_files_aggregate","type":"_doc","id":"601cd728-fed4-4336-9c4d-2caa91172d0c","cause":{"type":"version_conflict_engine_exception","reason":"[601cd728-fed4-4336-9c4d-2caa91172d0c]: version conflict, required seqNo [2029199], primary term [1]. but no document was found","index_uuid":"cbM2JoAUSUa6jHuNA5mAmA","shard":"0","index":"azul_v2_anvilprod_anvil6_files_aggregate"},"status":409},{"index":"azul_v2_anvilprod_anvil6_files_aggregate","type":"_doc","id":"5c28569a-eb57-484b-9363-5c7adb8b5eb5","cause":{"type":"version_conflict_engine_exception","reason":"[5c28569a-eb57-484b-9363-5c7adb8b5eb5]: version conflict, required seqNo [2029200], primary term [1]. but no document was found","index_uuid":"cbM2JoAUSUa6jHuNA5mAmA","shard":"0","index":"azul_v2_anvilprod_anvil6_files_aggregate"},"status":409},…'))
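
The 409 above is what delete-by-query returns under its default `conflicts=abort` behaviour when a document changes between the search and the delete. A minimal sketch of a retry that tolerates such conflicts is shown here, assuming an elasticsearch-py 7.x client like the one in the traceback. The helper name `delete_tolerating_conflicts` is hypothetical and not part of Azul. Whether proceeding past conflicts is acceptable for deindexing is a judgment call for the assignee.

```python
def delete_tolerating_conflicts(es_client, indices, query):
    """Run a delete-by-query that counts version conflicts instead of aborting.

    Hypothetical helper, not part of Azul. `es_client` is assumed to be an
    elasticsearch-py 7.x client.
    """
    response = es_client.delete_by_query(index=indices,
                                         body=query,
                                         slices='auto',
                                         # Default is 'abort', which yields the 409.
                                         conflicts='proceed')
    # Both counters are part of the delete-by-query response body.
    return response['deleted'], response['version_conflicts']
```

A non-zero second value means some matching documents were skipped, so the caller would still want to re-run the query until both counters reach zero.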
achave11-ucsc commented 5 months ago

Re-running the command hours after the second attempt succeeded:

❯ python scripts/reindex.py --deindex --catalogs anvil6 --sources 'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'
2024-05-24 14:00:17,352   DEBUG MainThread __main__: Source glob 'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3' matched sources ['tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'] in catalog 'anvil6'
2024-05-24 14:00:17,358    INFO MainThread botocore.credentials: Found credentials in shared credentials file: ~/.aws/credentials
2024-05-24 14:00:17,394    INFO MainThread azul.deployment: Allocated new Boto3 client for 'secretsmanager' with ID 4379854800
2024-05-24 14:00:18,191    INFO MainThread azul.terra: Making GET request to 'https://data.terra.bio/api/repository/v1/snapshots?filter=ANVIL_T2T_CHRY_20240301_ANV5_202403040508&limit=2'
2024-05-24 14:00:18,192   DEBUG MainThread azul.terra: … without request body
2024-05-24 14:00:22,369    INFO MainThread azul.terra: Got 200 response after 4.177s from GET to https://data.terra.bio/api/repository/v1/snapshots?filter=ANVIL_T2T_CHRY_20240301_ANV5_202403040508&limit=2
2024-05-24 14:00:22,369   DEBUG MainThread azul.terra: … with response headers HTTPHeaderDict({'Date': 'Fri, 24 May 2024 21:00:22 GMT', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'Access-Control-Allow-Headers': 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization,Accept,Referer,X-App-Id,Origin', 'Access-Control-Allow-Methods': 'GET,POST,DELETE,PUT,PATCH,OPTIONS,HEAD', 'X-Content-Type-Options': 'nosniff', 'Strict-Transport-Security': 'max-age=31536000;includeSubDomains', 'Cache-Control': 'no-cache,no-store,must-revalidate', 'X-Request-ID': 'pV5Mb5bB', 'Content-Type': 'application/json', 'Content-Length': '891', 'Vary': 'Accept-Encoding,Origin', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
2024-05-24 14:00:22,370   DEBUG MainThread azul.terra: … with response body b'{"total":1737,"filteredTotal":1,"items":[{"id":"f4accfc6-d9e4-49b1-a590-6a580b4d305f","name":"ANVIL_T2T_CHRY_20240301_ANV5_20...'
2024-05-24 14:00:22,371    INFO MainThread azul.terra: Making GET request to 'https://data.terra.bio/api/repository/v1/snapshots/f4accfc6-d9e4-49b1-a590-6a580b4d305f'
2024-05-24 14:00:22,371   DEBUG MainThread azul.terra: … without request body
2024-05-24 14:00:42,374 WARNING MainThread urllib3.connectionpool: Retrying (_LimitedRetry(total=None, connect=2, read=2, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/f4accfc6-d9e4-49b1-a590-6a580b4d305f
2024-05-24 14:01:02,554 WARNING MainThread urllib3.connectionpool: Retrying (_LimitedRetry(total=None, connect=2, read=1, redirect=0, status=2)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='data.terra.bio', port=443): Read timed out. (read timeout=20)")': /api/repository/v1/snapshots/f4accfc6-d9e4-49b1-a590-6a580b4d305f
2024-05-24 14:01:20,613    INFO MainThread azul.terra: Got 200 response after 58.242s from GET to https://data.terra.bio/api/repository/v1/snapshots/f4accfc6-d9e4-49b1-a590-6a580b4d305f
2024-05-24 14:01:20,613   DEBUG MainThread azul.terra: … with response headers HTTPHeaderDict({'Date': 'Fri, 24 May 2024 21:01:19 GMT', 'Server': 'Apache', 'X-Frame-Options': 'SAMEORIGIN', 'Access-Control-Allow-Headers': 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization,Accept,Referer,X-App-Id,Origin', 'Access-Control-Allow-Methods': 'GET,POST,DELETE,PUT,PATCH,OPTIONS,HEAD', 'X-Content-Type-Options': 'nosniff', 'Strict-Transport-Security': 'max-age=31536000;includeSubDomains', 'Cache-Control': 'no-cache,no-store,must-revalidate', 'X-Request-ID': 'a8q1P7JK', 'Content-Type': 'application/json', 'Content-Length': '37926', 'Vary': 'Accept-Encoding,Origin', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
2024-05-24 14:01:20,614   DEBUG MainThread azul.terra: … with response body b'{"id":"f4accfc6-d9e4-49b1-a590-6a580b4d305f","name":"ANVIL_T2T_CHRY_20240301_ANV5_202403040508","description":"Full view snap...'
2024-05-24 14:01:20,632    INFO MainThread azul.deployment: Allocated new Boto3 client for 'es' with ID 4381189520
2024-05-24 14:01:21,197   DEBUG MainThread azul.es: Creating ES client [vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443]
2024-05-24 14:01:21,205    INFO MainThread azul.deployment: Allocated new Boto3 client for 'sts' with ID 4381725392
2024-05-24 14:01:21,215    INFO MainThread botocore.credentials: Found credentials in environment variables.
2024-05-24 14:01:21,215    INFO MainThread azul.azulclient: Deindexing sources {'tdr:datarepo-e5b16a5a:snapshot/ANVIL_T2T_CHRY_20240301_ANV5_202403040508:/3'} from catalog 'anvil6'
2024-05-24 14:01:21,215   DEBUG MainThread azul.azulclient: Using query: {'query': {'bool': {'should': [{'terms': {'sources.id.keyword': ['f4accfc6-d9e4-49b1-a590-6a580b4d305f']}}, {'terms': {'source.id.keyword': ['f4accfc6-d9e4-49b1-a590-6a580b4d305f']}}]}}}
2024-05-24 14:01:21,216    INFO MainThread elasticsearch: Making POST request to https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil6_activities,azul_v2_anvilprod_anvil6_activities_aggregate,azul_v2_anvilprod_anvil6_biosamples,azul_v2_anvilprod_anvil6_biosamples_aggregate,azul_v2_anvilprod_anvil6_bundles,azul_v2_anvilprod_anvil6_bundles_aggregate,azul_v2_anvilprod_anvil6_datasets,azul_v2_anvilprod_anvil6_datasets_aggregate,azul_v2_anvilprod_anvil6_diagnoses,azul_v2_anvilprod_anvil6_diagnoses_aggregate,azul_v2_anvilprod_anvil6_donors,azul_v2_anvilprod_anvil6_donors_aggregate,azul_v2_anvilprod_anvil6_files,azul_v2_anvilprod_anvil6_files_aggregate,azul_v2_anvilprod_anvil6_replica/_delete_by_query?slices=auto
2024-05-24 14:01:21,216    INFO MainThread elasticsearch: … with request body b'{"query":{"bool":{"should":[{"terms":{"sources.id.keyword":["f4accfc6-d9e4-49b1-a590-6a580b4d305f"]}},{"terms":{"source.id.ke...'
2024-05-24 14:01:21,782    INFO MainThread elasticsearch: Got 200 response after 0.566s from POST to https://vpc-azul-index-anvilprod-ggipah4skn2ftt47u4xgvydzqm.us-east-1.es.amazonaws.com:443/azul_v2_anvilprod_anvil6_activities,azul_v2_anvilprod_anvil6_activities_aggregate,azul_v2_anvilprod_anvil6_biosamples,azul_v2_anvilprod_anvil6_biosamples_aggregate,azul_v2_anvilprod_anvil6_bundles,azul_v2_anvilprod_anvil6_bundles_aggregate,azul_v2_anvilprod_anvil6_datasets,azul_v2_anvilprod_anvil6_datasets_aggregate,azul_v2_anvilprod_anvil6_diagnoses,azul_v2_anvilprod_anvil6_diagnoses_aggregate,azul_v2_anvilprod_anvil6_donors,azul_v2_anvilprod_anvil6_donors_aggregate,azul_v2_anvilprod_anvil6_files,azul_v2_anvilprod_anvil6_files_aggregate,azul_v2_anvilprod_anvil6_replica/_delete_by_query?slices=auto
2024-05-24 14:01:21,782    INFO MainThread elasticsearch: … with response body '{"took":42,"timed_out":false,"total":0,"deleted":0,"batches":0,"version_conflicts":0,"noops":0,"retries":{"bulk":0,"search":0},…'
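
The final response reports "total":0, which may mean that the deletion started by the first request finished on the server even though the client gave up after 60s. If so, one option is to not hold the HTTP request open at all. The sketch below submits the delete-by-query as a server-side task and polls it, assuming an elasticsearch-py 7.x client and that the tasks API is reachable on this domain. The names `delete_in_background`, `poll_interval` and `sleep` are hypothetical.

```python
import time


def delete_in_background(es_client, indices, query,
                         poll_interval=5.0, sleep=time.sleep):
    """Start a delete-by-query as a task and poll until it completes.

    Hypothetical helper, not part of Azul.
    """
    # With wait_for_completion=False, Elasticsearch returns a task ID
    # right away instead of responding only once the deletion is done.
    submitted = es_client.delete_by_query(index=indices,
                                          body=query,
                                          slices='auto',
                                          wait_for_completion=False)
    task_id = submitted['task']
    while True:
        status = es_client.tasks.get(task_id=task_id)
        if status['completed']:
            if 'error' in status:
                raise RuntimeError(f'Task {task_id} failed: {status["error"]}')
            # The usual delete-by-query summary is stored under 'response'.
            return status['response']
        sleep(poll_interval)
```

Each polling request is short, so the 60s client timeout no longer depends on the size of the snapshot.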
dsotirho-ucsc commented 5 months ago

@hannes-ucsc: "The solution is most likely to partition the deletion requests so that no request takes longer than 30 seconds, which is a safe margin away from the client timeout of one minute. There may be other solutions. Assignee to consider those. At the moment, the work-around is to retry until the request returns a 200."
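
One way to read "partition the deletion requests" is sketched below: delete from one index at a time and cap each request with `max_docs`, so no single request has to remove the whole snapshot. This is only an illustration under assumptions, not the planned fix. `deindex_in_batches`, `batch_size` and `max_rounds` are hypothetical names, `indices` is assumed to be an iterable of index names, and the batch size would need tuning before it could guarantee the 30 second target.

```python
def deindex_in_batches(es_client, indices, query,
                       batch_size=10_000, max_rounds=1_000):
    """Delete matching documents one index and one bounded batch at a time.

    Hypothetical helper, not part of Azul. Returns the number of deletions.
    """
    deleted = 0
    for index in indices:
        for _ in range(max_rounds):
            response = es_client.delete_by_query(index=index,
                                                 body=query,
                                                 # Cap the work done per request.
                                                 max_docs=batch_size,
                                                 # Make deletions visible to the
                                                 # next batch's search.
                                                 refresh=True)
            deleted += response['deleted']
            if response['deleted'] < batch_size:
                # Fewer deletions than the cap: this index is done.
                break
        else:
            raise RuntimeError(
                f'{index}: still deleting after {max_rounds} batches')
    return deleted
```

Because the batches run sequentially, a failed or timed-out batch can be retried on its own without restarting the whole deindex.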