loculus-project / loculus

An open-source software package to power microbial genomic databases
https://loculus.org
GNU Affero General Public License v3.0
37 stars 2 forks source link

fix(backend): delete processed entries as well when deleting unreleased sequence entries #3252

Open corneliusroemer opened 2 days ago

corneliusroemer commented 2 days ago

Fixes #3250

preview URL: https://delete-processed.loculus.org

Generated partially with help from Copilot, see #3251

Summary

When deleting entries from sequence_entries table, we also now delete from sequence_entries_preprocessed_data table.

Followup

We should still add foreign key constraints - all sorts of race conditions are still possible until then, but this is better than nothing.

Testing

I've manually tested that if I delete something from sequence_entries it also disappears from sequence_entries_preprocessed_data.

I've looked into whether we can easily test programmatically that rows have disappeared from processed data but I couldn't find good ways of doing this, we don't seem to be doing anything like that so far, or maybe I missed it. @fengelniederhammer any ideas on how to assert that after deletion there are no more corresponding entries in processed data table? I guess foreign key constraints are the best way to ensure, then almost no need to test at all.

Screenshot

PR Checklist

anna-parker commented 2 days ago

It might be easier to write a test that the explicit edge case that we ran into does not happen - e.g. we have GIVEN preprocessing pipeline version 2 WHEN deleting all sequences THEN can start with version 1 again we should also add GIVEN preprocessing pipeline version 1 WHEN some sequences deleted THEN can update pipeline to version2 -> I just created a test for this - I will add in another branch

fengelniederhammer commented 2 days ago

I've looked into whether we can easily test programmatically that rows have disappeared from processed data but I couldn't find good ways of doing this, we don't seem to be doing anything like that so far, or maybe I missed it. @fengelniederhammer any ideas on how to assert that after deletion there are no more corresponding entries in processed data table? I guess foreign key constraints are the best way to ensure, then almost no need to test at all.

Our test concept in the backend is that we only test what we get from the endpoints. That doesn't quite cover what you want to test. I would try to go in the direction that @anna-parker proposed and come up with a scenario where you will see from the outside that it worked.

chaoran-chen commented 2 days ago

Thanks for identifying the bug! Should we maybe just add a foreign key now instead? Should be similar easy and I'm happy to create a PR.