CDLUC3 / ezid

MIT License

[MAINTENANCE] Refactor SearchIdentifier table and OpenSearch to reduce duplication #655

Open adambuttrick opened 2 weeks ago


Describe the current state/issue

As described in https://github.com/CDLUC3/ezid/issues/640, our current OpenSearch implementation writes to both the SearchIdentifier table and the OpenSearch index.

Describe the desired state/solution

Eventually, we should remove the SearchIdentifier table or otherwise reduce its scope. To do so, we would need to either move the data populated by the link checker, along with the modified dates, back into the Identifier table, or limit SearchIdentifier to only these and any other necessary fields. These values must be preserved so that the OpenSearch index can be recreated in the event of failure. Either change would also involve updating our reindexing logic/script to match the revised table's final state, so that it combines data from both the revised table and the Identifier table.
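
To make the second option more concrete, here is a rough sketch of how a revised reindexing step might combine the two sources, assuming SearchIdentifier has been trimmed to just the link checker and modified-date values. The field names (`identifier`, `link_is_broken`, `link_check_time`, `metadata_update_time`) and both functions are hypothetical placeholders, not the current schema or script, and rows are modeled as plain dicts rather than Django model instances.

```python
from typing import Any, Iterable, Iterator, Mapping, Optional

# Hypothetical: the only fields a slimmed-down SearchIdentifier would keep.
SEARCH_ONLY_FIELDS = ("link_is_broken", "link_check_time", "metadata_update_time")


def build_index_doc(
    identifier_row: Mapping[str, Any],
    search_row: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Merge one Identifier row with its (optional) SearchIdentifier row."""
    doc = dict(identifier_row)
    for field in SEARCH_ONLY_FIELDS:
        # Identifiers with no SearchIdentifier row get explicit None values,
        # so every index document has the same shape.
        doc[field] = search_row.get(field) if search_row else None
    return doc


def iter_index_docs(
    identifier_rows: Iterable[Mapping[str, Any]],
    search_rows: Iterable[Mapping[str, Any]],
) -> Iterator[dict]:
    """Yield one index document per Identifier row."""
    # Assumes both tables share an "identifier" key (hypothetical name).
    by_id = {row["identifier"]: row for row in search_rows}
    for row in identifier_rows:
        yield build_index_doc(row, by_id.get(row["identifier"]))
```

In a real script, the documents yielded here would be passed to whatever bulk-indexing call the reindexer already uses. Building the lookup in memory is only reasonable for a sketch; a production version would more likely join the tables in the database and stream the results in batches.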

Additional notes

Our ability to undertake the above depends on the state of production monitoring for OpenSearch. We should be conservative about table changes until that monitoring is fully implemented. This will give us the most latitude to fall back to the previous search implementation in the event of failure or unforeseen performance issues in production.