FastQC or other FASTQ checks?

At the moment, we don't run any QC checks on FASTQ data, whether downloaded from SRA or provided locally; the only preprocessing we do is adapter trimming with fastp.

Conceivably we could add FASTQ QC checks to the preprocessing steps, but it is not clear how necessary or useful they would be. Most data issues (data from the wrong species, other contamination, poor sequence quality) should be readily detectable as mapping problems, and it may be simpler and more robust to leave QC to that stage.

On the other hand, a FastQC report and a quick k-mer coverage plot are easy to generate and could be useful diagnostics when something does go wrong.
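To make the k-mer idea concrete, here is a minimal sketch of the data behind such a plot: a k-mer abundance spectrum computed from the reads. It is only an illustration, not a proposal for the actual implementation. The function names (`read_fastq_sequences`, `kmer_spectrum`) are hypothetical, it assumes plain 4-line FASTQ records, it counts k-mers in memory (so it would only be practical on a subsample of reads), and it does not collapse reverse complements into canonical k-mers.

```python
from collections import Counter


def read_fastq_sequences(handle):
    """Yield the sequence line of each record, assuming strict 4-line FASTQ."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of every 4-line record is the sequence
            yield line.strip().upper()


def kmer_spectrum(sequences, k=21):
    """Return a Counter mapping abundance -> number of distinct k-mers seen that often.

    k-mers containing N are skipped. Reverse complements are NOT merged,
    which is a simplification made for this sketch.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    # Histogram of the counts: how many distinct k-mers occur once, twice, ...
    return Counter(counts.values())
```

Plotting abundance against number of distinct k-mers gives the usual coverage profile: a peak near the expected sequencing depth, with a spike at abundance 1 from sequencing errors. For full datasets a dedicated k-mer counter would presumably be the better choice; the sketch is just meant to show how little is involved.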

Thoughts?
