ucldc / rikolti

calisphere harvester 2.0
BSD 3-Clause "New" or "Revised" License

[bug] Intermittent `409 Client Error` during the final `create_stage_index` (harvest DAG) and `publish_collection` (publish DAG) tasks; re-running the task in Airflow succeeds #1095

Open gamontoya opened 3 months ago

gamontoya commented 3 months ago

Example error message:

Link to the log for the first attempt:

==

Some notes: Gabriela initiated batches of 100 collections from the Registry. Most ran successfully, but some failed at the final step. Re-running the final task from Airflow succeeded. This applies to both the Harvest (to -stage) DAG and the Publish (to -prod) DAG.

Another note: Gabriela also initiated just 3 collections from the Registry to publish to -prod and hit the same error, so this does not seem to be only a large-batch issue. (Re-running the final task in Airflow works fine, though.)

amywieliczka commented 1 month ago

This is an issue with OpenSearch not quite indexing documents fast enough, but as long as the counts are the same, it is harmless.
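If the cause is that newly indexed documents are not yet visible to search when the delete runs, one possible mitigation is to refresh the index before calling `_delete_by_query`, and to pass `conflicts=proceed` so that version conflicts are counted rather than aborting the request. The following is only a sketch under that assumption, not what rikolti does: the function name `refresh_then_delete_by_query` and the injected `http` parameter are hypothetical, and whether an explicit refresh is acceptable for this cluster would need to be checked.

```python
import json


def refresh_then_delete_by_query(http, endpoint, index, query):
    """Refresh `index`, then run a delete-by-query that tolerates conflicts.

    `http` is any object with a requests-style `post` method (for example
    a `requests.Session`). It is a parameter here (an assumption of this
    sketch) so the function can be exercised without a live cluster.
    """
    # Make recently indexed documents visible to search before deleting.
    refresh = http.post(f"{endpoint}/{index}/_refresh")
    refresh.raise_for_status()

    # conflicts=proceed: count version conflicts instead of aborting on them.
    resp = http.post(
        f"{endpoint}/{index}/_delete_by_query",
        params={"conflicts": "proceed"},
        headers={"Content-Type": "application/json"},
        data=json.dumps(query),
    )
    resp.raise_for_status()
    return resp.json()
```

With `conflicts=proceed`, the caller would still want to inspect `version_conflicts` and `deleted` in the returned body, since conflicts are then reported in the counts rather than raised as an error.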

Adding some more logging output here for clarification:

[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - 
----------------------------------------
Indexed 16 records to index `rikolti-stg-2024-07-11-t15_35_50` from page `28306/vernacular_metadata_2024-09-30T23:17:42/mapped_metadata_2024-09-30T23:17:54/with_content_urls_2024-09-30T23:18:16/data/0.jsonl`
     all 16 records had is_shown_by field removed
     all 16 records had thumbnail_source field removed
     all 16 records had thumbnail.from-cache field removed
     all 16 records had media_source field removed
     all 16 records had is_shown_at field removed
     all 16 records had item_count field removed
     all 16 records had thumbnail.component_content_harvest_metadata field removed

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - 
----------------------------------------
> Deleting 16 outdated record(s) from collection 28306 in `rikolti-stg-2024-07-11-t15_35_50` index.
 records: outdated versions
      16: 28306/vernacular_metadata_2024-08-06T00:17:50/mapped_metadata_2024-08-06T00:18:02/with_content_urls_2024-08-06T00:18:23
New indexed documents have version: 28306/vernacular_metadata_2024-09-30T23:17:42/mapped_metadata_2024-09-30T23:17:54/with_content_urls_2024-09-30T23:18:16
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - ERROR @ delete_by_query from /usr/local/airflow/dags/rikolti/record_indexer/index_collection.py
[2024-10-01, 20:55:44 UTC] {{logging_mixin.py:150}} INFO - {'batches': 1,
 'deleted': 0,
 'failures': [{'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb05926072]: version conflict, '
                                   'required seqNo [2180119], primary term '
                                   '[1]. current document has seqNo [2613171] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb05926072',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb18554166]: version conflict, '
                                   'required seqNo [2180120], primary term '
                                   '[1]. current document has seqNo [2613172] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb18554166',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb3493655f]: version conflict, '
                                   'required seqNo [2180121], primary term '
                                   '[1]. current document has seqNo [2613173] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb3493655f',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb3869088g]: version conflict, '
                                   'required seqNo [2180122], primary term '
                                   '[1]. current document has seqNo [2613174] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb3869088g',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb47905951]: version conflict, '
                                   'required seqNo [2180123], primary term '
                                   '[1]. current document has seqNo [2613175] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb47905951',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb54731950]: version conflict, '
                                   'required seqNo [2180124], primary term '
                                   '[1]. current document has seqNo [2613176] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb54731950',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb6633617w]: version conflict, '
                                   'required seqNo [2180125], primary term '
                                   '[1]. current document has seqNo [2613177] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb6633617w',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb67360079]: version conflict, '
                                   'required seqNo [2180126], primary term '
                                   '[1]. current document has seqNo [2613178] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb67360079',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb7282085z]: version conflict, '
                                   'required seqNo [2180127], primary term '
                                   '[1]. current document has seqNo [2613179] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb7282085z',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb7452736r]: version conflict, '
                                   'required seqNo [2180128], primary term '
                                   '[1]. current document has seqNo [2613180] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb7452736r',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb79646854]: version conflict, '
                                   'required seqNo [2180129], primary term '
                                   '[1]. current document has seqNo [2613181] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb79646854',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb82718564]: version conflict, '
                                   'required seqNo [2180130], primary term '
                                   '[1]. current document has seqNo [2613182] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb82718564',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb85107671]: version conflict, '
                                   'required seqNo [2180131], primary term '
                                   '[1]. current document has seqNo [2613183] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb85107671',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb8817938x]: version conflict, '
                                   'required seqNo [2180132], primary term '
                                   '[1]. current document has seqNo [2613184] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb8817938x',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb89544588]: version conflict, '
                                   'required seqNo [2180133], primary term '
                                   '[1]. current document has seqNo [2613185] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb89544588',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409},
              {'cause': {'index': 'rikolti-stg-2024-07-11-t15_35_50',
                         'index_uuid': '000011TTU_vGWRRS6QJ-OAmVUzeA',
                         'reason': '[ark:/20775/bb9773574k]: version conflict, '
                                   'required seqNo [2180134], primary term '
                                   '[1]. current document has seqNo [2613186] '
                                   'and primary term [1]',
                         'shard': '0',
                         'type': 'version_conflict_engine_exception'},
               'id': 'ark:/20775/bb9773574k',
               'index': 'rikolti-stg-2024-07-11-t15_35_50',
               'status': 409}],
 'noops': 0,
 'requests_per_second': -1.0,
 'retries': {'bulk': 0, 'search': 0},
 'throttled_millis': 0,
 'throttled_until_millis': 0,
 'timed_out': False,
 'took': 117,
 'total': 16,
 'version_conflicts': 16}
[2024-10-01, 20:55:44 UTC] {{taskinstance.py:1824}} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/decorators/base.py", line 220, in execute
    return_value = super().execute(context)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 181, in execute
    return_value = self.execute_callable()
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/operators/python.py", line 198, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/dags/rikolti/dags/shared_tasks/indexing_tasks.py", line 123, in stage_collection_task
    index_collection_task("rikolti-stg", collection, version_pages, context)
  File "/usr/local/airflow/dags/rikolti/dags/shared_tasks/indexing_tasks.py", line 23, in index_collection_task
    raise e
  File "/usr/local/airflow/dags/rikolti/dags/shared_tasks/indexing_tasks.py", line 20, in index_collection_task
    index_collection(alias, collection_id, version_pages)
  File "/usr/local/airflow/dags/rikolti/record_indexer/index_collection.py", line 31, in index_collection
    delete_collection_records_from_index(collection_id, index, version_path)
  File "/usr/local/airflow/dags/rikolti/record_indexer/index_collection.py", line 164, in delete_collection_records_from_index
    r = delete_by_query(index, data)
  File "/usr/local/airflow/dags/rikolti/record_indexer/index_collection.py", line 95, in delete_by_query
    r.raise_for_status()
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: https://search-rikolti-2-xxbcriyfw5iqysaj7p3fhhscae.us-west-2.es.amazonaws.com/rikolti-stg-2024-07-11-t15_35_50/_delete_by_query
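
In the response printed above, `total`, `version_conflicts`, and the number of entries in `failures` are all 16, and every failure has status 409 with type `version_conflict_engine_exception`. A small check like the one below could separate this case from other delete failures before deciding whether to raise. It is a sketch only: `is_only_version_conflicts` is a hypothetical helper, not part of rikolti, and it assumes the response body has the shape shown in the log.

```python
def is_only_version_conflicts(result):
    """Return True when every reported failure is a 409 version conflict.

    `result` is the parsed JSON body of a `_delete_by_query` response,
    assumed to have the shape shown in the log above.
    """
    failures = result.get("failures") or []
    if not failures:
        # Nothing failed, so there is nothing to classify.
        return False
    every_failure_is_conflict = all(
        f.get("status") == 409
        and f.get("cause", {}).get("type") == "version_conflict_engine_exception"
        for f in failures
    )
    # The conflict counter should account for every listed failure.
    return every_failure_is_conflict and result.get("version_conflicts") == len(failures)
```

A caller could log a warning and continue when this returns `True`, and keep raising on any other failure.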