ShawHahnLab / umbra

Python package and executable for Linux for managing Illumina sequencing runs
GNU Affero General Public License v3.0

Failed alignments should be handled #91

Open ressy opened 4 years ago

ressy commented 4 years ago

Currently, an alignment directory with incomplete output from a processing failure on the sequencer triggers a repeating "Alignment not recognized" error for the run, but only because the wrong ValueError is inadvertently caught. For these failures the Checkpoint.txt file contains both an integer and a keyword, and the keyword gets included when trying to cast the contents as int in umbra.illumina.util.load_checkpoint.

In these cases Checkpoint.txt looks like:

0
Demultiplexing

instead of:

3

(So the keyword is an empty string in the usual case, and presumably there are other integers and keywords for the intermediate steps, but I haven't seen them.)

Instead, load_checkpoint should be updated to get both the integer and the keyword, and any error entries in CompletedJobInfo.xml should be noted.
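
A minimal sketch of what that parsing could look like, assuming the step number is always on the first line of Checkpoint.txt and the keyword, if any, is on the second. The name parse_checkpoint and the (step, keyword) tuple are hypothetical and not the actual umbra API; reading error entries from CompletedJobInfo.xml is left out.

```python
def parse_checkpoint(text):
    """Split the content of Checkpoint.txt into (step, keyword).

    Hypothetical helper. Assumes the first line holds the integer step
    and the second line, if present, holds the keyword. The keyword is
    returned as "" when the file has no second line or it is blank.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("checkpoint text is empty")
    # Cast only the first line, so the keyword never reaches int().
    step = int(lines[0].strip())
    keyword = lines[1].strip() if len(lines) > 1 else ""
    return step, keyword
```

With the two file contents shown above, parse_checkpoint("0\nDemultiplexing\n") gives (0, "Demultiplexing") and parse_checkpoint("3\n") gives (3, ""), whereas calling int() on the whole failed-case content raises ValueError.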

ressy commented 4 years ago

The examples I can find are:

0
Demultiplexing

1
Generating FASTQ Files

3

Whatever step 2 is, it must happen very reliably.

ressy commented 4 years ago

The bug aspect of this was fixed in #97, and the error parsing was implemented in #104. The last step is to add a property summarizing the current state (e.g. incomplete, errored, complete) and include it in the processor reporting.
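
A rough sketch of such a property, assuming that checkpoint 3 is the final step (it is the only value seen for finished alignments in this thread) and that any parsed error marks the alignment as errored regardless of checkpoint. The class AlignmentSketch and its attributes are hypothetical stand-ins, not the real umbra classes.

```python
class AlignmentSketch:
    """Hypothetical stand-in for an alignment object, for illustration only."""

    # Assumption: 3 is the last checkpoint written by a successful run.
    FINAL_STEP = 3

    def __init__(self, checkpoint, error=None):
        self.checkpoint = checkpoint  # integer step, or None if not yet written
        self.error = error            # parsed error info, or None

    @property
    def status(self):
        """Summarize the current state as one of three keywords."""
        if self.error:
            return "errored"
        if self.checkpoint == self.FINAL_STEP:
            return "complete"
        return "incomplete"
```

Checking the error first is a design choice in this sketch: a failed alignment stays at an early checkpoint, so without that ordering it would be reported as incomplete indefinitely.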