DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Rare `KeyError` while reading aggregates #3384

Open jessebrennan opened 3 years ago

jessebrennan commented 3 years ago
[WARNING] 2021-08-27T17:00:03.486Z 84490e63-5d43-5a06-b1bb-57c04110efe6 Failed to aggregate tallies: dict_values([[DocumentTally(entity=CataloguedEntityReference(entity_type='files', entity_id='c55da389-1589-49f2-b748-8974bf39fcb6', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='files', entity_id='f0a547f1-c4a6-4834-bc7b-ad8ef49b8f90', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='cell_suspensions', entity_id='2b12c961-33f0-496d-a4c2-ecacbcdc1087', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='samples', entity_id='00a5be43-cc47-49d1-ae78-8ded0a3ad18f', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='bundles', entity_id='1b163269-4e17-4376-89b9-e2b0ea9954b3', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='files', entity_id='58d289bd-a4d0-4ed6-bfd8-1c992da7ce1e', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='files', entity_id='e97b5980-8d41-4bd2-999a-2fd81ac79a20', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='bundles', entity_id='8886f59c-dbe9-4431-8aa3-2e39d95d1d5a', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='files', entity_id='0610eb4a-1f87-49cb-90ad-ccc57c43a8f4', catalog='dcp8'), num_contributions=1, attempts=1)], [DocumentTally(entity=CataloguedEntityReference(entity_type='files', entity_id='53738b66-334d-4250-8ab7-030b26035c18', catalog='dcp8'), num_contributions=1, attempts=1)]])

Traceback (most recent call last):
  File "/var/task/azul/indexer/index_controller.py", line 229, in aggregate
    self.index_service.aggregate(tallies)
  File "/var/task/azul/indexer/index_service.py", line 236, in aggregate
    old_aggregates = self._read_aggregates(tallies)
  File "/var/task/azul/indexer/index_service.py", line 307, in _read_aggregates
    return {a.coordinates.entity: a for a in aggregates()}
  File "/var/task/azul/indexer/index_service.py", line 307, in <dictcomp>
    return {a.coordinates.entity: a for a in aggregates()}
  File "/var/task/azul/indexer/index_service.py", line 299, in aggregates
    if doc['found']:
KeyError: 'found'

melainalegaspi commented 3 years ago

Spike to confer with ES team about whether this is expected.

jessebrennan commented 3 years ago

https://discuss.elastic.co/t/rarely-the-found-key-is-missing-from-the-multi-get-response/283584?u=jesseb

jessebrennan commented 3 years ago

Once we fix #3312, we will be able to search through the logs and find the response.
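
Until the actual response is recovered from the logs, a defensive read could at least capture the offending entry instead of failing with a bare `KeyError`. The following is a minimal sketch, not the code in `index_service.py`: the helper name `partition_docs` is hypothetical, and it assumes only that each multi-get entry is a dict that normally carries a boolean `found` key, as the traceback above implies. The `error` entry in the usage example is made up for illustration; what the malformed entries really look like is still unknown.

```python
import logging
from typing import Any, Dict, List, Sequence, Tuple

log = logging.getLogger(__name__)

Doc = Dict[str, Any]


def partition_docs(docs: Sequence[Doc]) -> Tuple[List[Doc], List[Doc], List[Doc]]:
    """
    Split multi-get response entries into found, missing and malformed ones.

    Hypothetical helper: an entry without the `found` key is logged in full
    and set aside so that the caller can decide whether to retry or fail.
    """
    found: List[Doc] = []
    missing: List[Doc] = []
    malformed: List[Doc] = []
    for doc in docs:
        try:
            is_found = doc['found']
        except KeyError:
            # Log the whole entry so the unexpected shape can be inspected later
            log.warning('Multi-get response entry lacks the `found` key: %r', doc)
            malformed.append(doc)
        else:
            (found if is_found else missing).append(doc)
    return found, missing, malformed


if __name__ == '__main__':
    # Illustrative entries only; the third one imitates an entry without `found`
    docs = [
        {'_id': 'a', 'found': True},
        {'_id': 'b', 'found': False},
        {'_id': 'c', 'error': {'type': 'hypothetical_failure'}},
    ]
    found, missing, malformed = partition_docs(docs)
    print(len(found), len(missing), len(malformed))
```

Keeping malformed entries separate from missing ones matters here, since treating them as not found could make the aggregation silently overwrite an existing aggregate.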