Open nickp60 opened 1 year ago
I like the idea of an entropy filter or something along those lines! I also wonder: could it be worth codifying some of these QC issues into a filter in Snakemake where, conditional on various metrics, the workflow either finishes by raising a "problematic sequencing file, send for re-sequencing" flag or continues with the rest of the pre-processing? I think we might be able to do this with conditional execution in Snakemake, though this isn't something I have done before!
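To make the "flag or continue" idea concrete, here is a rough sketch of the decision step in plain Python. The function name `qc_decision`, the metric names, and the limits are all hypothetical placeholders, not anything the pipeline defines today. Snakemake's checkpoints might be one way to branch the workflow on the result, but that wiring is not shown here.

```python
def qc_decision(metrics, limits):
    """Decide whether a sample continues or is flagged for re-sequencing.

    metrics: dict of metric name -> observed value (hypothetical names).
    limits:  dict of metric name -> maximum tolerated value (placeholders).
    Returns (decision, failures), where decision is "continue" or "resequence".
    """
    # A metric fails when its observed value exceeds the configured limit.
    # A metric listed in limits but missing from metrics raises KeyError,
    # so an incomplete QC report cannot pass silently.
    failures = [name for name, limit in limits.items() if metrics[name] > limit]
    decision = "resequence" if failures else "continue"
    return decision, failures


# Placeholder limits; real values would need to be chosen from our own data.
example_limits = {"low_complexity_fraction": 0.10}
decision, failures = qc_decision({"low_complexity_fraction": 0.30}, example_limits)
print(decision, failures)
```

Returning the list of failed metrics alongside the decision would let the flag say why a file was rejected, rather than only that it was.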
Not sure what threshold to use, but we are currently seeing a small proportion of very low-complexity reads (AGAGAGAGAGAGAGA, runs of Ts, etc.).
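For reference, one simple way to score reads like these is per-base Shannon entropy. The sketch below is only an illustration: the function names are made up, and the `min_entropy=1.5` cutoff is an arbitrary placeholder, since the right threshold is exactly the open question.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Per-base Shannon entropy in bits.

    0.0 for a homopolymer; 2.0 when A, C, G and T are equally frequent.
    """
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    n = len(seq)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def is_low_complexity(seq, min_entropy=1.5):
    """True if the read's entropy is below min_entropy (placeholder cutoff)."""
    return shannon_entropy(seq) < min_entropy


for read in ["TTTTTTTTTTTT", "AGAGAGAGAGAGAGA", "ACGTACGTTGCA"]:
    print(read, round(shannon_entropy(read), 3), is_low_complexity(read))
```

Note that single-base entropy only looks at base composition, so a dinucleotide repeat such as AGAGAG scores about 1 bit, the same as any read made of two bases in equal proportion. A k-mer based score might separate those cases better.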