loculus-project / loculus

An open-source software package to power microbial genomic databases
https://loculus.org
GNU Affero General Public License v3.0

When deleting unreleased sequence entries, the entry remains in the processed data table, blocking pipeline upgrades #3250

Open corneliusroemer opened 2 days ago

corneliusroemer commented 2 days ago

I bumped the processing pipeline version on virus2 to v2 and noticed the backend didn't upgrade. Inspecting the db showed that 15 sequences hadn't been processed by v2, and they weren't in processing or any similar state either.

Inspection of the audit logs showed that the 15 sequences that were not getting processed had been deleted by a user while they were not yet released.

Because we don't properly use foreign key constraints (we really should, see #3201), sequence entries can disappear while their processed data remains, without the db complaining. With constraints in place, the db would have made us notice this bug much earlier, and probably more bugs like it.
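To illustrate the difference a constraint makes, here is a minimal sketch using an in-memory SQLite database. The table and column names (`sequence_entries`, `processed_data`, `accession`, `pipeline_version`) are hypothetical stand-ins, not the actual Loculus schema, and the real database engine may behave differently in its details.

```python
import sqlite3


def delete_entry(with_fk: bool) -> str:
    """Delete a sequence entry that still has processed data attached.

    Table and column names are hypothetical, chosen only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE sequence_entries (accession TEXT PRIMARY KEY)")
    fk = "REFERENCES sequence_entries(accession)" if with_fk else ""
    conn.execute(
        "CREATE TABLE processed_data ("
        f"accession TEXT NOT NULL {fk}, "
        "pipeline_version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO sequence_entries VALUES ('A1')")
    conn.execute("INSERT INTO processed_data VALUES ('A1', 1)")
    try:
        conn.execute("DELETE FROM sequence_entries WHERE accession = 'A1'")
    except sqlite3.IntegrityError:
        # With the constraint, the db refuses to leave processed rows behind.
        return "rejected"
    orphans = conn.execute("SELECT COUNT(*) FROM processed_data").fetchone()[0]
    return f"deleted, {orphans} orphaned processed row(s)"


if __name__ == "__main__":
    print(delete_entry(with_fk=False))
    print(delete_entry(with_fk=True))
```

Without the constraint, the delete succeeds silently and leaves an orphaned processed row. With it, the db raises an error at the moment of deletion, which is the kind of early warning described above.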

The fix is probably to delete all processed rows associated with a deleted unreleased entry. There might well be edge cases that remain tricky, and until we have foreign key constraints I won't trust us to be bug-free there.
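A rough sketch of what such a fix could look like, again on SQLite with hypothetical table and column names (`sequence_entries`, `processed_data`, `released`) rather than the real schema: one function deletes an unreleased entry together with its processed rows in a single transaction, and another cleans up rows that are already orphaned.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequence_entries ("
        "accession TEXT PRIMARY KEY, released INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE processed_data ("
        "accession TEXT NOT NULL, pipeline_version INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO sequence_entries VALUES (?, ?)", [("A1", 0), ("A2", 1)]
    )
    # 'GONE' simulates processed data whose entry was deleted earlier.
    conn.executemany(
        "INSERT INTO processed_data VALUES (?, ?)",
        [("A1", 1), ("A2", 1), ("GONE", 1)],
    )
    conn.commit()
    return conn


def delete_unreleased_entry(conn: sqlite3.Connection, accession: str) -> bool:
    """Delete an unreleased entry and its processed rows in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "DELETE FROM sequence_entries WHERE accession = ? AND released = 0",
            (accession,),
        )
        if cur.rowcount == 0:
            return False  # entry is missing or already released; touch nothing
        conn.execute("DELETE FROM processed_data WHERE accession = ?", (accession,))
        return True


def delete_orphaned_processed_rows(conn: sqlite3.Connection) -> int:
    """Remove processed rows whose sequence entry no longer exists."""
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_data WHERE NOT EXISTS ("
            "SELECT 1 FROM sequence_entries e "
            "WHERE e.accession = processed_data.accession)"
        )
    return cur.rowcount
```

The one-off cleanup would address rows like the 15 found here, while the transactional delete is meant to stop new orphans from appearing. Neither replaces a real foreign key constraint, which would catch code paths this sketch does not anticipate.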

We haven't noticed this because we never tried to upgrade the pipeline version after someone had deleted something, which is kind of hard to test using our current setup. That's why foreign key constraints would be so useful. @chaoran-chen likely thought such constraints were in place, at least in terms of the outcome of deletions, so the case where entries are deleted but their processed data is still there wasn't considered.