Read Quality Downstream Analysis

Story

Read quality has become much more consistent over time. It is now standard for reads to have very high quality scores along their full length, whereas several years ago quality commonly decreased toward the end of a read. As part of the mapping pipeline I quality trim with -q 20, so these differences should not affect overall results. However, it would still be useful to look at overall read quality and see how it differs among samples.

Questions and Tasks

[ ] Plot distribution of read quality.
- FastQC calculates the distribution of per-base quality scores. Average the scores within each decile of read position, then plot the distribution of each decile across samples.
[ ] Do runs within an SRX have similar quality scores?
[ ] Is there a clear cutoff (fraction of bases) that should be implemented?
[ ] What is the worst SRR?
[ ] What is the worst SRX?
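
A minimal sketch of the decile averaging described above, assuming the input is the text of a FastQC `fastqc_data.txt` file and that "deciles" means ten equal bins of read position. The function names `parse_per_base_quality` and `decile_means` are hypothetical, not part of the pipeline. FastQC may group later positions into ranges such as `10-14`, so each range is expanded to individual positions before binning.

```python
from statistics import mean


def parse_per_base_quality(fastqc_text):
    """Return [(start, end, mean_quality), ...] from fastqc_data.txt content."""
    rows = []
    in_module = False
    for line in fastqc_text.splitlines():
        if line.startswith(">>Per base sequence quality"):
            in_module = True
        elif line.startswith(">>END_MODULE"):
            if in_module:
                break
        elif in_module and line and not line.startswith("#"):
            fields = line.split("\t")
            # The first column is a single position ("7") or a range ("10-14").
            start, _, end = fields[0].partition("-")
            rows.append((int(start), int(end or start), float(fields[1])))
    return rows


def decile_means(rows):
    """Average the per-base mean quality within each decile of read position.

    Returns a list of 10 values; a decile with no positions (reads shorter
    than 10 bases) is reported as None.
    """
    read_length = max(end for _, end, _ in rows)
    bins = [[] for _ in range(10)]
    for start, end, quality in rows:
        for pos in range(start, end + 1):
            # Positions are 1-based; map them onto decile indices 0..9.
            bins[(pos - 1) * 10 // read_length].append(quality)
    return [mean(b) if b else None for b in bins]
```

Running `decile_means(parse_per_base_quality(text))` once per SRR gives ten numbers per run, which can then be plotted as one distribution per decile across samples.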

Definition of done

[ ] Distribution plot of quality scores.
[ ] Cutoff criteria if any.
[ ] Table with any flags.
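
One possible shape for the flag table, as a rough sketch only. It assumes each SRR has already been reduced to a single summary quality value and is paired with its SRX. The names `flag_table` and `srx_summary`, the `cutoff` argument, and the accessions in the usage note are all hypothetical; no cutoff has been decided yet.

```python
from collections import defaultdict
from statistics import mean


def flag_table(runs, cutoff):
    """Build one row per SRR, sorted worst first.

    runs: {srr: (srx, mean_quality)}
    cutoff: hypothetical threshold; runs below it are flagged.
    """
    rows = [
        {"srx": srx, "srr": srr, "mean_quality": quality, "flag": quality < cutoff}
        for srr, (srx, quality) in runs.items()
    ]
    return sorted(rows, key=lambda row: row["mean_quality"])


def srx_summary(runs):
    """Per-SRX mean quality and spread (max - min) across its runs."""
    by_srx = defaultdict(list)
    for srx, quality in runs.values():
        by_srx[srx].append(quality)
    return {
        srx: {"mean": mean(values), "spread": max(values) - min(values)}
        for srx, values in by_srx.items()
    }
```

With placeholder input such as `{"SRR1": ("SRX1", 35.0), "SRR2": ("SRX1", 25.0)}`, the first row of `flag_table` is the worst SRR, the SRX with the lowest `mean` in `srx_summary` is the worst SRX, and a large `spread` marks an SRX whose runs differ in quality.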

jfear / ncbi_remap

Read Quality Downstream Analysis #46
