Closed alexdunnjpl closed 5 months ago
Benchmarking against sbnpsi shows a speed-up from 5m30s to 4m20s due to inherent speed improvements, but sbnpsi has only ~250 non-singleton products out of 1.5M total.
Results are likely to be significantly more impressive when the change is actually avoiding a large number of unnecessary db writes.
Rebased on #100 - consider only commit 3f94d339edc53d38098b01117c291655bab90266
🗒️ Summary
Implements #92
Modifies behaviour in that now, the latest version of a product will be assigned
"ops:Provenance/ops:superseded_by": null
rather than not having the attribute assigned at all.
Implements software-version-based reprocessing avoidance, as already exists for repairkit and ancestry.
Reads all documents, builds a version chain for each distinct LID, drops all singleton products (as no links exist), builds links while tainting any product whose successor data has changed, then produces updates, skipping up-to-date records unless they have been tainted.
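For reviewers, the flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: `Record`, `split_lidvid`, `compute_updates`, `SWEEPER_VERSION` and the `processed_version` field are hypothetical names, and it assumes LIDVIDs of the form `<lid>::<major>.<minor>`.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

SWEEPER_VERSION = 2  # hypothetical software version stamp, assumed for illustration


@dataclass
class Record:
    lidvid: str                        # e.g. "urn:x::1.0" (assumed format)
    superseded_by: Optional[str]       # value currently stored in the db
    processed_version: Optional[int]   # version that last wrote it; None if never processed


def split_lidvid(lidvid: str) -> Tuple[str, Tuple[int, int]]:
    """Split a LIDVID into its LID and a numerically sortable (major, minor) version."""
    lid, vid = lidvid.split("::")
    major, minor = vid.split(".")
    return lid, (int(major), int(minor))


def compute_updates(records: Iterable[Record]) -> Dict[str, Optional[str]]:
    """Return {lidvid: superseded_by} for only the records that need a write."""
    # Build one version chain per distinct LID.
    chains = defaultdict(list)
    for rec in records:
        lid, _ = split_lidvid(rec.lidvid)
        chains[lid].append(rec)

    updates: Dict[str, Optional[str]] = {}
    for chain in chains.values():
        if len(chain) < 2:
            continue  # singleton: no links exist, so it is dropped
        chain.sort(key=lambda r: split_lidvid(r.lidvid)[1])
        # Each version links to the next one; the latest version links to None (null).
        successors = [r.lidvid for r in chain[1:]] + [None]
        for rec, successor in zip(chain, successors):
            tainted = rec.superseded_by != successor  # successor data has changed
            up_to_date = (
                rec.processed_version is not None
                and rec.processed_version >= SWEEPER_VERSION
            )
            if tainted or not up_to_date:
                updates[rec.lidvid] = successor
    return updates
```

In this sketch a record is written only if it was last processed by an older version or its stored successor no longer matches the freshly built chain, which is where the avoided db writes would come from.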
⚙️ Test Data and/or Report
Functional tests pass, but none are relevant to provenance, per #13.
Manually tested by comparing the updates produced before and after the change.
♻️ Related Issues
fixes #92