hubmapconsortium / search-api

HuBMAP search service and associated pieces to create an index
https://search.api.hubmapconsortium.org
MIT License

Troubleshooting Elasticsearch issue with TEST indices #825

Closed yuanzhou closed 1 month ago

yuanzhou commented 2 months ago

All of a sudden, the full reindex on TEST failed to write documents to the hm_test_ indices; only a small number of documents were indexed initially. The search-api logging indicates that the indexing process is still in progress, but the counts in the OpenSearch console have not updated, even after a few hours. As a result, portal-ui on TEST was basically useless.

Screenshot 2024-06-27 at 11 24 44 PM Screenshot 2024-06-27 at 11 26 02 PM

I tried the following, but the issue persisted:

None of them made a difference.

Screenshot 2024-06-27 at 11 11 48 PM

yuanzhou commented 2 months ago

Also tried the following:

yuanzhou commented 2 months ago

I submitted a help ticket and chatted with AWS tech support. We made configuration updates to bring the cluster from Yellow to Green. Their internal team verified that the cluster's data nodes are fine and that there are no unassigned shards.

I investigated and debugged further to rule out any causes on my end, and I finally figured out the root cause: bad data in our database, which caused an infinite loop. That also explains why only a small number of documents got indexed and no more were added to the Elasticsearch indices after that.

Dataset 421007293469db7b528ce6478c00348d has itself as its parent. This caused the indexing procedure to loop through this node endlessly, so it never got to the other entities.
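For illustration, here is a minimal sketch of how an ancestor traversal can guard against this kind of self-referencing data. This is not the actual search-api code: the `parents_of` mapping, the function names, and the sample IDs are all hypothetical, and the real indexer reads its relationships from the database rather than from a dict.

```python
from collections import deque


def collect_ancestors(entity_id, parents_of):
    """Return every ancestor of entity_id, visiting each node at most once.

    parents_of is a hypothetical dict mapping an entity id to the list of
    its direct parent ids. The `seen` set is what prevents an endless loop
    when an entity lists itself (or one of its descendants) as a parent.
    """
    seen = {entity_id}
    ancestors = []
    queue = deque(parents_of.get(entity_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            # Already handled (or it is the starting entity): skip it.
            continue
        seen.add(current)
        ancestors.append(current)
        queue.extend(parents_of.get(current, []))
    return ancestors


def find_self_parents(parents_of):
    """Return the ids of entities that list themselves as a direct parent."""
    return sorted(e for e, parents in parents_of.items() if e in parents)


# Hypothetical data: "dataset-a" wrongly lists itself as a parent.
parents_of = {
    "dataset-a": ["dataset-a", "sample-b"],
    "sample-b": ["donor-c"],
    "donor-c": [],
}

print(collect_ancestors("dataset-a", parents_of))
print(find_self_parents(parents_of))
```

A check like `find_self_parents` could run before a full reindex so that bad entities are reported up front instead of stalling the whole job.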

Screenshot 2024-06-28 at 8 03 53 PM

I deleted the Activity node (5987bb5d5b7783878448fc4cf3150634) and its input/output relationships, then recreated them using the correct direct ancestor, which is Sample ee5c22a10c313e58fbfbd11aa2892cf6.