Closed jbothma closed 1 year ago
Seen in logs after following docker-compose instrutions at https://www.opensanctions.org/docs/yente/deploy/
yente-tut-app-1 | 2023-06-27T10:16:35.283539Z [error ] Indexing error: BulkIndexError('37 document(s) failed to index.') [yente.search.indexer] dataset=default entities_url=https://data.opensanctions.org/datasets/20230627/default/entities.ftm.json index=yente-entities-default-00320230627063008 yente-tut-app-1 | Traceback (most recent call last): yente-tut-app-1 | File "/app/yente/search/indexer.py", line 115, in index_entities yente-tut-app-1 | await async_bulk(es, docs, yield_ok=False, stats_only=True, chunk_size=1000) yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/_async/helpers.py", line 337, in async_bulk yente-tut-app-1 | async for ok, item in async_streaming_bulk( yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/_async/helpers.py", line 252, in async_streaming_bulk yente-tut-app-1 | async for data, (ok, info) in azip( # type: ignore yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/_async/helpers.py", line 156, in azip yente-tut-app-1 | yield tuple([await x.__anext__() for x in aiters]) yente-tut-app-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/_async/helpers.py", line 156, in <listcomp> yente-tut-app-1 | yield tuple([await x.__anext__() for x in aiters]) yente-tut-app-1 | ^^^^^^^^^^^^^^^^^^^ yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/_async/helpers.py", line 127, in _process_bulk_chunk yente-tut-app-1 | for item in gen: yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/helpers/actions.py", line 274, in _process_bulk_chunk_success yente-tut-app-1 | raise BulkIndexError(f"{len(errors)} document(s) failed to index.", errors) yente-tut-app-1 | elasticsearch.helpers.BulkIndexError: 37 document(s) failed to index.
matching still works fine
two indexes were loading at the same time, then I think the first completed and the second failed?
more logs from just before
yente-tut-app-1 | 2023-06-27T10:16:28.123512Z [info ] Index: 335000 entities... [yente.search.indexer] index=yente-entities-default-00320230627063008 yente-tut-app-1 | 2023-06-27T10:16:28.315509Z [info ] Index: 2157000 entities... [yente.search.indexer] index=yente-entities-default-00320230626183008 yente-tut-app-1 | 2023-06-27T10:16:29.612188Z [info ] Index: 2158000 entities... [yente.search.indexer] index=yente-entities-default-00320230626183008 yente-tut-app-1 | 2023-06-27T10:16:30.304867Z [info ] Index: 336000 entities... [yente.search.indexer] index=yente-entities-default-00320230627063008 yente-tut-app-1 | 2023-06-27T10:16:31.307114Z [info ] Index: 2159000 entities... [yente.search.indexer] index=yente-entities-default-00320230626183008 yente-tut-app-1 | 2023-06-27T10:16:32.216707Z [info ] Index: 337000 entities... [yente.search.indexer] index=yente-entities-default-00320230627063008 yente-tut-app-1 | 2023-06-27T10:16:32.329842Z [info ] Index is now aliased to: yente-entities [yente.search.indexer] index=yente-entities-default-00320230626183008 yente-tut-app-1 | 2023-06-27T10:16:32.339328Z [info ] Delete other index [yente.search.indexer] index=yente-entities-default-00320230627063008 index | {"@timestamp":"2023-06-27T10:16:32.341Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/S79Og3JaRmqF7Eh0Rzu-AQ] deleting index", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataDeleteIndexService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} yente-tut-app-1 | 2023-06-27T10:16:32.476391Z [info ] Index update complete. [yente.search.indexer] changed=True yente-tut-app-1 | 2023-06-27T10:16:32.476725Z [info ] Closing elasticsearch client [yente.search.base] yente-tut-app-1 | 2023-06-27T10:16:34.133746Z [info ] Index: 338000 entities... [yente.search.indexer] index=yente-entities-default-00320230627063008 index | {"@timestamp":"2023-06-27T10:16:34.184Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008] creating index, cause [auto(bulk api)], templates [], shards [1]/[1]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataCreateIndexService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.528Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] create_mapping", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.579Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.626Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.688Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.730Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.778Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.818Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.863Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:34.903Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:35.069Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:35.140Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} index | {"@timestamp":"2023-06-27T10:16:35.184Z", "log.level": "INFO", "message":"[yente-entities-default-00320230627063008/PhUtJZRXS_W4tl_kC81NxA] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[index][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"hOqHptlAT-ezXFwQK84TfQ","elasticsearch.node.id":"ZrtQG-U2S1miZBgej2ZZYQ","elasticsearch.node.name":"index","elasticsearch.cluster.name":"opensanctions-index"} yente-tut-app-1 | 2023-06-27T10:16:35.283539Z [error ] Indexing error: BulkIndexError('37 document(s) failed to index.') [yente.search.indexer] dataset=default entities_url=https://data.opensanctions.org/datasets/20230627/default/entities.ftm.json index=yente-entities-default-00320230627063008 yente-tut-app-1 | Traceback (most recent call last): yente-tut-app-1 | File "/app/yente/search/indexer.py", line 115, in index_entities yente-tut-app-1 | await async_bulk(es, docs, yield_ok=False, stats_only=True, chunk_size=1000) yente-tut-app-1 | File "/venv/lib/python3.11/site-packages/elasticsearch/_async/helpers.py", line 337, in async_bulk yente-tut-app-1 | async for ok, item in async_streaming_bulk(
will keep it running and see if the next update indexes or also has an issue
it later tried again and successfully indexed the new dataset.
Nice catch on this :)
Seen in logs after following docker-compose instrutions at https://www.opensanctions.org/docs/yente/deploy/
matching still works fine
two indexes were loading at the same time, then I think the first completed and the second failed?
more logs from just before