loculus-project / loculus

An open-source software package to power microbial genomic databases
https://loculus.org
GNU Affero General Public License v3.0

Automated flagging of dubious sequences #2346

Open theosanderson opened 3 months ago

theosanderson commented 3 months ago

We want to be able to flag questionable sequences.

          We also may not have captured in issues @emmahodcroft's (and the rest of our) concept of flagging "dubious" sequences

Originally posted by @theosanderson in https://github.com/loculus-project/loculus/issues/1272#issuecomment-1986219631

After handling manual flagging of sequences in https://github.com/loculus-project/loculus/issues/813, we should move on to this more automated flagging, which would probably come from the preprocessing pipeline.

emmahodcroft commented 3 months ago

The idea here is that if a sequence is suspected to be, e.g.:

One cannot 'fix' it, but one should be able to curate-flag it so that it is excluded from searches by default and is marked as 'dubious' when people view sequences.
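To make the behaviour described above concrete, here is a rough sketch of how a flag could keep entries out of default searches while still leaving them viewable. Everything in it is an assumption for illustration: `SequenceEntry`, `flag_if_dubious`, `search`, and the `include_dubious` parameter are hypothetical names, not existing Loculus code, and the "too many N bases" check is only a placeholder, since the thread does not specify which criteria an automated check would use.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceEntry:
    """Hypothetical record; not the actual Loculus data model."""
    accession: str
    sequence: str
    dubious_reasons: list[str] = field(default_factory=list)

    @property
    def is_dubious(self) -> bool:
        # An entry counts as dubious once at least one reason is attached.
        return bool(self.dubious_reasons)


def flag_if_dubious(entry: SequenceEntry, max_n_fraction: float = 0.5) -> SequenceEntry:
    """Placeholder automated check (an assumption, not a criterion from this thread):
    flag the entry when the fraction of 'N' bases exceeds max_n_fraction."""
    if entry.sequence:
        n_fraction = entry.sequence.upper().count("N") / len(entry.sequence)
        if n_fraction > max_n_fraction:
            entry.dubious_reasons.append(f"ambiguous base fraction {n_fraction:.2f}")
    return entry


def search(entries: list[SequenceEntry], include_dubious: bool = False) -> list[SequenceEntry]:
    """Exclude flagged entries by default; callers can opt in to see them."""
    return [e for e in entries if include_dubious or not e.is_dubious]
```

With this shape, a flagged entry is never modified or deleted. It only drops out of `search(entries)` unless the caller passes `include_dubious=True`, and the stored reasons could be shown next to the sequence when it is viewed.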