AtlasOfLivingAustralia / specieslist-webapp

Species lists and traits tool
https://lists.ala.org.au
Mozilla Public License 2.0

Failure when rematching a large number of lists #304

Closed adam-collins closed 1 month ago

adam-collins commented 4 months ago

The admin "rematch all" button can fail on smaller machines when there are a large number of lists. This should not happen. A workaround is implemented, but it is not ideal because it involves restarting Tomcat.

During a rematch all, existing name matches are first removed, then the items are iterated, with batches sent to the name matching service.
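For readers unfamiliar with the flow, the two steps described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the webapp's actual code: `rematch_all`, `batched`, `match_batch`, and the dictionary-based item shape are all hypothetical names chosen for the example, and `match_batch` stands in for a call to the name matching service.

```python
from itertools import islice
from typing import Callable, Dict, Iterator, List, Optional

Item = Dict[str, Optional[str]]  # hypothetical item shape: {"name", "taxonConceptID"}


def batched(items: List[Item], size: int) -> Iterator[List[Item]]:
    """Yield consecutive batches of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def rematch_all(
    items: List[Item],
    match_batch: Callable[[List[str]], List[Optional[str]]],
    batch_size: int = 100,
) -> int:
    """Clear all matches, then rematch in batches. Returns the number of batches sent."""
    # Step 1: remove every existing name match up front.
    for item in items:
        item["taxonConceptID"] = None

    # Step 2: iterate the items and send them to the matcher batch by batch.
    calls = 0
    for batch in batched(items, batch_size):
        results = match_batch([item["name"] for item in batch])
        for item, taxon_id in zip(batch, results):
            item["taxonConceptID"] = taxon_id
        calls += 1
    return calls
```

Note that in this shape, a failure partway through step 2 leaves every not-yet-processed item without a match, because step 1 has already cleared them all.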

For this issue, make changes to rematch all.

qifeng-bai commented 3 months ago

288

qifeng-bai commented 3 months ago

Q: Iterate over lists before individual list items. This is instead of iterating through list items only.
A: nameExplorerService iterates over species items, not lists.

Q: Perform an update to existing matches only after a diff. This is instead of the removal of matches at the beginning of the rematch all.
A: Got an answer from Simon Sherrin and Mahmoud: yes, the taxonConceptID changes when the Taxonomic Backbone is updated. The general rule is that you can't rely on taxonConceptIDs to be persistent across updates to the Taxonomic Backbone (BIE reindex).

In this case, the taxonConceptID will change anyway. When rematching, we don't need to do a diff; we just need to update the matches directly.
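A minimal sketch of what "update directly" could look like, assuming items can be streamed and overwritten one batch at a time. This is not the change that was actually made for this issue; `rematch_in_place`, `_apply`, `match_batch`, and the item dictionaries are hypothetical names used only for illustration, with `match_batch` standing in for the name matching service.

```python
from typing import Callable, Dict, Iterable, List, Optional

Item = Dict[str, Optional[str]]  # hypothetical item shape: {"name", "taxonConceptID"}
Matcher = Callable[[List[str]], List[Optional[str]]]


def _apply(batch: List[Item], match_batch: Matcher) -> int:
    """Overwrite each item's match with the fresh result; no comparison with the old value."""
    results = match_batch([item["name"] for item in batch])
    for item, taxon_id in zip(batch, results):
        item["taxonConceptID"] = taxon_id
    return len(batch)


def rematch_in_place(items: Iterable[Item], match_batch: Matcher, batch_size: int = 100) -> int:
    """Stream items and overwrite matches batch by batch. Returns the number of items updated."""
    updated = 0
    batch: List[Item] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            updated += _apply(batch, match_batch)
            batch = []  # only one batch is held in memory at a time
    if batch:  # flush the final partial batch
        updated += _apply(batch, match_batch)
    return updated
```

Because there is no up-front removal and no diff in this sketch, an interrupted run leaves unprocessed items with their previous match rather than with none.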