loculus-project / loculus

An open-source software package to power microbial genomic databases
https://loculus.org
GNU Affero General Public License v3.0

Stop prepro from getting into an infinite loop #2570

Open anna-parker opened 3 months ago

anna-parker commented 3 months ago

As seen in https://github.com/loculus-project/loculus/issues/2562, if the prepro pipeline doesn't clean up a sequence or its metadata as expected and sends it to the backend, the backend rejects the submission, but the prepro pipeline keeps retrying in a loop. As a result, users see their sequences stuck in the preprocessing state.

Ideally, the prepro pipeline should always submit data in the format the backend expects. Still, we should have another way to stop the prepro loop if, after 5 minutes or so, the pipeline has not succeeded in uploading preprocessed data to the backend. We could then let users know that they need to contact us to find out what went wrong with their uploaded data.
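
To make the proposal concrete, here is a rough sketch of a retry loop with a deadline. It is not the actual prepro code: `submit_with_deadline`, the `submit` callable, and the `deadline_s` / `pause_s` parameters are all hypothetical names, and the 5-minute default simply mirrors the figure suggested above.

```python
import time
from typing import Any, Callable


def submit_with_deadline(
    submit: Callable[[Any], bool],  # hypothetical: returns True if the backend accepted the data
    payload: Any,
    deadline_s: float = 300.0,  # assumed ~5 minute budget
    pause_s: float = 10.0,  # assumed wait between attempts
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `submit` until it succeeds or the deadline passes.

    Returns True on success and False once the deadline is reached,
    so the caller can mark the entries as failed instead of looping forever.
    """
    give_up_at = clock() + deadline_s
    while True:
        if submit(payload):
            return True
        if clock() >= give_up_at:
            return False
        sleep(pause_s)
```

On a `False` result, the caller could flag the affected sequences with an error state, so that users see a failure rather than an endless "processing" status.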

rneher commented 3 months ago

I understand that the pipeline submits in batches and that the entire batch is rejected if something bad happens.

What about resubmitting the sequences of a failed batch one by one? For a sequence that fails when submitted by itself, don't retry, but report the error to the user.
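
Roughly, the fallback could look like the sketch below. The names are hypothetical (`submit_batch_with_fallback`, and a `submit_batch` callable that returns `True` when the backend accepts a list of entries); the real pipeline's interfaces may differ.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def submit_batch_with_fallback(
    entries: Sequence[T],
    submit_batch: Callable[[List[T]], bool],  # hypothetical: True if the backend accepted the list
) -> List[T]:
    """Submit a batch; if it is rejected, resubmit each entry on its own.

    Returns the entries that were rejected even when submitted alone.
    These should be reported to the user rather than retried.
    """
    if submit_batch(list(entries)):
        return []
    failed: List[T] = []
    for entry in entries:
        # A single-entry batch isolates the entry that causes the rejection.
        if not submit_batch([entry]):
            failed.append(entry)
    return failed
```

This assumes that a rejection is caused by the data itself. If the backend is temporarily unreachable, every single-entry submission would fail too, so a real implementation would need to tell the two cases apart before reporting errors to users.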