scientist-softserv / adventist_knapsack

Apache License 2.0

Spike: nested indexing verification #338

Open jillpe opened 1 year ago

jillpe commented 1 year ago

Summary

https://assaydepot.slack.com/archives/C0311DN2YCA/p1691511731241009

While testing the jobs that clean and reimport PDFs, it looks like Adventist may be doing nested indexing when you run work.child_works&.each { |child| child.destroy(eradicate: true) }. The logging contained references to the nested indexer, and the run was very slow.

It seems to be a bug in the graph indexer, and it could become more urgent as we work to clean up their data in the future. There is a LOT of data that needs to be cleaned up in the ADL tenant.

It's also unclear how to verify that the behavior is wrong, but the references to the nested indexer in the logging suggest that it is a problem.
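One rough way to check whether the nested indexer shows up while the destroy loop runs is to count the log lines that mention it. The sketch below is only an illustration: the helper name, the regular expression, and the sample log lines are all assumptions made up for this example, not taken from the actual Adventist logs.

```ruby
# Hypothetical pattern: matches "NestedIndexer", "nested indexing", "nested_index", etc.
# Adjust it to whatever the real log messages look like.
NESTED_INDEXER_PATTERN = /nested.?index/i

# Hypothetical helper: returns the lines that mention the nested indexer.
# Accepts any Enumerable of strings (an Array, or File.foreach(path)).
def nested_indexer_mentions(lines, pattern: NESTED_INDEXER_PATTERN)
  lines.grep(pattern)
end

# Invented sample lines, for illustration only.
sample_log = [
  "INFO  destroying child work abc123",
  "DEBUG NestedIndexer reindexing relationships for abc123",
  "DEBUG nested indexing queued for parent xyz789",
  "INFO  destroyed child work abc123"
]

mentions = nested_indexer_mentions(sample_log)
puts "#{mentions.size} of #{sample_log.size} lines mention the nested indexer"
```

Against a real log, something like nested_indexer_mentions(File.foreach("log/production.log")) could be compared before and after the destroy loop (the log path here is also an assumption). A non-zero count only shows that the nested indexer was referenced; it does not show how much work it actually did.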

How Jeremy implemented graph indexing:

The v2 graph indexer was written with reference to the backport of the Hyrax v3 graph indexer, so it may well have missed at least one case. However, it could be logging things without doing much work. It is hard to say without debugging that behavior in the graph indexer.

Acceptance Criteria