plagnollab / RNASeq_pipeline

Set of scripts for RNA-Seq data processing

Creating a QC step for Version 8 #8

Closed jackhump closed 7 years ago

jackhump commented 8 years ago

Hi guys, adapters in RNA-seq data are a real pain, so I think the pipeline should have a step that detects and removes them.

trim_galore is pretty fast at detecting and removing adapters. One idea we had was to run each file through FastQC, parse the output to check whether adapters are present (a Warning or Fail in the FastQC adapter report), and then trim only the offending files. I decided that this was too fiddly and came up with an alternative strategy:

I've added a flag called "QC" to the support file. If QC=yes, all the fastqs are run through trim_galore.

But that would need a way for the alignment step to wait until all the trimmed files have been made. Any ideas, Vincent?
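One way the "trim everything, then align" ordering could be sketched in Python is below. This is only an illustration, not the pipeline's actual code: the function names, the `qc` argument standing in for the support-file flag, and the idea of running the jobs from a single Python process are all assumptions. On a cluster scheduler the same effect would more likely come from job dependencies (for example a hold on the trimming job IDs), which is not shown here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def trim_command(fastq, out_dir):
    """Build a trim_galore call for one single-end FASTQ file (hypothetical helper)."""
    return ["trim_galore", "--output_dir", out_dir, fastq]


def run_command(cmd):
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)
    return cmd


def trim_all(fastqs, out_dir, qc="yes", runner=run_command, workers=4):
    """Trim every FASTQ when qc == "yes"; return only once all jobs are done."""
    if qc != "yes":
        return []
    commands = [trim_command(fq, out_dir) for fq in fastqs]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() consumes the map and re-raises any failure; leaving the
        # "with" block also waits for every worker to finish.
        return list(pool.map(runner, commands))
```

Because `trim_all` only returns after every trimming job has finished (or one has failed), the alignment step can simply be called on the next line.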

jackhump commented 8 years ago

The other question is whether we would want to quality trim as well. Trim_galore can remove bases below a particular quality score (default is 20) if told to.
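As a rough sketch, the quality cutoff could be passed through when the trim_galore call is built. The helper name and its arguments are made up for illustration; `--quality` and `--paired` are trim_galore options, and the value 20 just mirrors the default mentioned above.

```python
def trim_galore_args(fastq, quality=20, paired_mate=None):
    """Build a trim_galore argument list with an explicit quality cutoff.

    Hypothetical helper: `quality` is the Phred score cutoff, and
    `paired_mate` is the second FASTQ of a pair, if there is one.
    """
    args = ["trim_galore", "--quality", str(quality)]
    if paired_mate is not None:
        # Paired-end mode takes both mate files together.
        args += ["--paired", fastq, paired_mate]
    else:
        args.append(fastq)
    return args
```

The cutoff could then come from the support file alongside the QC flag, if we decide to expose it.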

pontikos commented 8 years ago

I think that's a good idea. I'm in favour of a "fire and forget" pipeline, so we could run FastQC first, which provides useful stats anyway, and then trim if necessary. What was your commit number, Jack?

jackhump commented 8 years ago

I haven't committed my changes yet. I think having the FastQC readouts would be useful.
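For the FastQC-first idea discussed in this thread, the adapter check could be a small parse of FastQC's `summary.txt`. The sketch below assumes that file holds one tab-separated line per module (status, module name, file name) and that the adapter module is called "Adapter Content"; both function names are hypothetical.

```python
def adapter_status(summary_text):
    """Return the status (PASS, WARN or FAIL) of the adapter module, or None.

    Assumes each line of summary.txt is: status <TAB> module <TAB> filename.
    """
    for line in summary_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1] == "Adapter Content":
            return fields[0]
    return None


def needs_trimming(summary_text):
    """True when the adapter module reports a warning or a failure."""
    return adapter_status(summary_text) in {"WARN", "FAIL"}
```

A file whose report has no adapter line at all is treated as not needing trimming here; that choice is arbitrary and could just as well be flipped to trim by default.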

On 26 Nov 2015 20:30, Nikolas Pontikos notifications@github.com wrote:

I think that's a good idea, I'm in favour of a "fire and forget" pipeline so we could have the fastqc run first, which provides useful stats anyway, then do trimming if necessary? What was your commit number Jack?