Creating a QC step for Version 8

jackhump commented 8 years ago

Hi guys, adapters in RNA-seq data are a real pain, so I think the pipeline should have a step that detects and removes them.

trim_galore is pretty fast at detecting adapters and removing them. One idea we had was to run each file through fastqc, parse the output to find whether adapters are present (Warning or Fail in the fastqc adapter report), and then trim only the offending files. I decided that this was too fiddly and came up with an alternative strategy.
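For reference, the fastqc-parsing idea could be sketched roughly like this. It assumes the tab-separated `summary.txt` that fastqc writes into each report folder (status, module name, file name) and its "Adapter Content" module. The function name `needs_trimming` is made up for illustration and is not part of the pipeline.

```python
def needs_trimming(summary_text):
    """Return True if fastqc flagged adapter content as WARN or FAIL.

    summary_text is the content of a fastqc summary.txt file, assumed to be
    tab-separated lines of: status, module name, file name.
    """
    for line in summary_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1].strip() == "Adapter Content":
            return fields[0].strip() in ("WARN", "FAIL")
    # No adapter module in the report: nothing to act on.
    return False
```

Only files for which this returns True would then be passed on to trim_galore.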

I've included a flag in the support file called "QC". If QC=yes then all the fastqs are run through trim_galore.

single-end data are trimmed individually
paired-end data are trimmed in pairs

What I've currently written incorporates the trimming step into the alignment step (Step1a), so that trimming happens sequentially before each fastq (or pair of fastqs) gets aligned. This is not necessarily the best approach, as we could trim beforehand in parallel through multiple jobs.
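A minimal sketch of how the QC flag could drive the trimming, assuming each sample is described by one fastq (single-end) or two (paired-end). The function names and the `samples` structure are hypothetical and are not taken from the pipeline; the trim_galore options used are `--paired`, `--quality` and `--output_dir`.

```python
def trim_galore_command(fastqs, out_dir, quality=20):
    """Build a trim_galore command for one single-end file or one pair."""
    if len(fastqs) not in (1, 2):
        raise ValueError("expected one fastq (single-end) or two (paired-end)")
    cmd = ["trim_galore", "--quality", str(quality), "--output_dir", out_dir]
    if len(fastqs) == 2:
        # Paired-end reads must be trimmed together so the mates stay in sync.
        cmd.append("--paired")
    return cmd + list(fastqs)


def qc_commands(samples, qc_flag, out_dir):
    """Return one trimming command per sample, or none if QC is switched off.

    samples is a hypothetical list of tuples: (r1,) or (r1, r2).
    """
    if qc_flag.lower() != "yes":
        return []
    return [trim_galore_command(s, out_dir) for s in samples]
```

Because each command is independent, the returned list could either be run sequentially inside the alignment step or submitted as separate jobs.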

But that would need a way for the alignment step to wait until all the trimmed files have been made. Any ideas, Vincent?
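One possible way to handle the waiting, assuming the cluster runs a Grid Engine style scheduler (an assumption, the thread does not say which scheduler is used): submit the alignment job with `qsub -hold_jid` so that it only starts once the named trimming jobs have finished. The function and job names below are hypothetical.

```python
def alignment_qsub(align_script, trim_job_names, job_name="align"):
    """Build a qsub call that is held until every trimming job has finished."""
    cmd = ["qsub", "-N", job_name]
    if trim_job_names:
        # -hold_jid takes a comma-separated list of job ids or job names.
        cmd += ["-hold_jid", ",".join(trim_job_names)]
    return cmd + [align_script]
```

With this approach the trimming jobs can run in parallel, and the scheduler rather than the pipeline script takes care of the ordering.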

jackhump commented 8 years ago

The other question is whether we would want to quality-trim as well. Trim_galore can remove bases below a particular quality score (default is 20) if told to.

pontikos commented 8 years ago

I think that's a good idea. I'm in favour of a "fire and forget" pipeline, so we could have fastqc run first, which provides useful stats anyway, and then do trimming if necessary. What was your commit number, Jack?

jackhump commented 8 years ago

I haven't committed my changes yet. I think having the fastqc readouts would be useful.

On 26 Nov 2015 20:30, Nikolas Pontikos notifications@github.com wrote:

I think that's a good idea, I'm in favour of a "fire and forget" pipeline so we could have the fastqc run first, which provides useful stats anyway, then do trimming if necessary? What was your commit number Jack?

