Closed matrulda closed 6 years ago
@MatildaAslin I suggest that we keep the exclusive check against the lower bound, but change the interval so that it corresponds to the number of cycles instead of the read length. This means that we compare directly against the value specified in runInfo.xml, without any corrections or calculations. Using the runInfo.xml value unmodified might also help other users understand our application better.
I think @monikaBrandt's suggestion is a good one. Does either of you want to have a look at fixing this? And perhaps also clarify in the docs that it is the number of cycles that we use to define the criteria?
I also think it would be great to catch the KeyError that turns up here and turn it into a more specific and descriptive exception, to help the user understand what is going on.
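A rough sketch of what wrapping the KeyError could look like. The names `ConfigEntryMissing` and `handler_config_for`, and the shape of the config dict, are made up for illustration and are not taken from the checkQC code base:

```python
class ConfigEntryMissing(Exception):
    """Hypothetical exception for a run type/interval missing from the config."""


def handler_config_for(config, instrument_and_reagent_type, interval_key):
    """Look up handler settings, replacing a bare KeyError with a clearer error.

    `config` is assumed to be a nested dict keyed first by instrument and
    reagent type, then by an interval string such as "50-70".
    """
    try:
        return config[instrument_and_reagent_type][interval_key]
    except KeyError as e:
        # Chain the original KeyError so the traceback keeps the root cause.
        raise ConfigEntryMissing(
            f"No config entry found for instrument/reagent type "
            f"'{instrument_and_reagent_type}' and interval '{interval_key}'. "
            f"Please check that the config file defines a matching section."
        ) from e


# Hypothetical config content, only for demonstration.
example_config = {"hiseq2500_rapidhighoutput_v4": {"50-70": ["example_handler"]}}
```

With this, a missing section surfaces as a message that names the instrument type and interval, rather than a bare `KeyError`.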
@monikaBrandt @johandahlberg Great suggestions! Stay tuned for a PR.
I found a bug when processing a hiseq2500_rapidhighoutput_v4 with read length 50.
ERROR:
Config:
I think the problem is that we are defining the read length as number of cycles - 1: https://github.com/Molmed/checkQC/blob/8911c2844b817bd85a9dd4c7d788b8d835ff8dd3/checkQC/run_type_recognizer.py#L221
and then doing an exclusive check against the lower bound: https://github.com/Molmed/checkQC/blob/8911c2844b817bd85a9dd4c7d788b8d835ff8dd3/checkQC/config.py#L94
So in my case, 51 cycles were run for read1, which is adjusted to a read length of 50. Then we check whether 50 < 50, which is false, so no interval matches.
This could be fixed either by not defining read length as number of cycles minus one, or by doing an inclusive check against the lower bound. Which do you think is best, @johandahlberg @monikaBrandt?
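To make the off-by-one concrete, here is a minimal sketch of the comparison. The function names, the inclusive upper bound, and the 50 to 70 interval are assumptions for illustration; they are not copied from run_type_recognizer.py or config.py:

```python
def matches_exclusive_lower(value, lower, upper):
    """Current behaviour as described above: the lower bound is excluded."""
    return lower < value <= upper


def matches_inclusive_lower(value, lower, upper):
    """Alternative fix: include the lower bound in the interval."""
    return lower <= value <= upper


cycles = 51               # number of cycles run for read1
read_length = cycles - 1  # adjusted to read length 50

# Hypothetical interval from the config.
lower, upper = 50, 70

# Reported bug: 50 < 50 is false, so nothing matches.
bug = matches_exclusive_lower(read_length, lower, upper)

# Fix 1: compare the number of cycles directly, keeping the exclusive check.
fix_cycles = matches_exclusive_lower(cycles, lower, upper)

# Fix 2: keep read length = cycles - 1, but make the lower bound inclusive.
fix_inclusive = matches_inclusive_lower(read_length, lower, upper)

print(bug, fix_cycles, fix_inclusive)  # False True True
```

Both fixes make this run match. They differ in whether the config intervals are read as cycle counts or as read lengths.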