loculus-project / loculus

An open-source software package to power microbial genomic databases
https://loculus.org
GNU Affero General Public License v3.0

How to maintain curation changes across ingest revisions #3085

Open corneliusroemer opened 3 weeks ago

corneliusroemer commented 3 weeks ago

If we curate GenBank sequences, we have to deal with the following problem: how do we keep the curation changes when ingest later revises the sequence?

One solution is to have a patch file that identifies a sequence (not a specific version) and lists the curation changes applied to it.

When ingest revises a curated sequence, a curation bot could then reapply the patch file (potentially after curator approval).
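To make the patch-file idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration, not existing Loculus code: the `CurationPatch` structure, the `reapply_patch` helper, the field names, and the idea of storing the upstream value the curator saw next to the curated value. Storing both lets a bot reapply a change automatically when upstream has not touched the field, and flag it for curator approval otherwise.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CurationPatch:
    # Hypothetical structure: keyed by sequence identifier, not by version.
    sequence_id: str
    # field name -> (upstream value the curator saw, curated value)
    changes: dict[str, tuple[Optional[str], str]] = field(default_factory=dict)


def reapply_patch(patch, revised_metadata):
    """Return (patched metadata, fields that need curator review)."""
    patched = dict(revised_metadata)
    needs_review = []
    for name, (seen_upstream, curated) in patch.changes.items():
        current = revised_metadata.get(name)
        if current == seen_upstream or current == curated:
            # Upstream is unchanged (or already agrees): safe to reapply.
            patched[name] = curated
        else:
            # Upstream changed this field since curation: do not overwrite silently.
            needs_review.append(name)
    return patched, needs_review


patch = CurationPatch(
    sequence_id="SEQ_1",  # placeholder identifier
    changes={
        "country": ("USA", "United States"),
        "host": (None, "Homo sapiens"),
    },
)
revised = {"country": "USA", "host": "human", "collection_date": "2024-01-02"}
patched, needs_review = reapply_patch(patch, revised)
```

In this example the `country` change is reapplied because the revised value still matches what the curator saw, while `host` ends up in `needs_review` because upstream now supplies a value that was absent at curation time.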

anna-parker commented 2 weeks ago

I've started working on handling this programmatically in ingest: https://github.com/loculus-project/loculus/pull/3112.

I do this by having ingest calculate the diff between versions and keep the part of the diff that comes from a curator. However, this is quite inefficient, and I realized I also need to do the same to check whether the sequence has changed.
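As a rough illustration of the diff-based approach described above (not the code in the linked PR), the sketch below diffs flat metadata dictionaries and carries over only the curator's changes. The helper names `diff_fields`, `carry_over_curation`, and `sequence_digest` are hypothetical, as are the field names. The digest is one possible way to check cheaply whether a sequence has changed; treating upper and lower case as equal is an assumption.

```python
import hashlib


def diff_fields(old, new):
    """Map each field whose value differs to its (old, new) pair."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }


def carry_over_curation(ingest_old, curated, ingest_new):
    """Reapply curator changes on top of a new ingest revision.

    Curator changes are skipped for fields that upstream also changed,
    so those fields can be reviewed by hand.
    """
    curator_changes = diff_fields(ingest_old, curated)
    upstream_changes = diff_fields(ingest_old, ingest_new)
    merged = dict(ingest_new)
    for key, (_, curated_value) in curator_changes.items():
        if key not in upstream_changes:
            merged[key] = curated_value
    return merged


def sequence_digest(sequence):
    """Fingerprint used to compare sequences without a full diff."""
    return hashlib.sha256(sequence.upper().encode("ascii")).hexdigest()


ingest_v1 = {"country": "USA", "host": "human"}
curated_v2 = {"country": "United States", "host": "human"}
ingest_v3 = {"country": "USA", "host": "Homo sapiens"}

merged = carry_over_curation(ingest_v1, curated_v2, ingest_v3)
sequence_changed = sequence_digest("ACGT") != sequence_digest("ACGA")
```

Here `merged` keeps the curated `country` and takes the new upstream `host`. Storing a digest per version would mean the sequence comparison only needs two short strings rather than the full sequences.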