FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

TrimGalore with 10X data #95

Closed: golharam closed this issue 4 years ago

golharam commented 4 years ago

Hi - I'm curious about the best way to run Trim Galore on 10X scRNA-Seq data. We have sequenced degraded FFPE samples and are seeing the TSO adapter at the start of Read 2 and, in some reads, poly-A tails at the end of Read 2. I'd like to trim both the TSO adapter and the poly-A tail from these reads. I tried using R2 as the only input, but this discarded some of the reads while their paired R1 reads are still there. I then tried supplying both reads using --paired, but that only trims R1, which I don't want to do. Is there a way to apply Trim Galore to just one read while still supplying both, in case any reads need to be discarded?

FelixKrueger commented 4 years ago

I'm afraid there is currently no way to supply both reads but only trim one of them. The next best way, I guess, would be to use only Read 2 for trimming, with e.g. -a A{20} together with --length 0. This would trim poly-A from the 3' end of the reads, but not remove sequences if they get too short. I am not sure how CellRanger deals with very short (or 0-length?) reads, so you might have to use a little script that takes in R1 and R2 at the same time and applies some length filtering, similar to the 'validate paired-ends' step that Trim Galore carries out internally. Does that help?
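
For future readers, the "little script" idea above could look roughly like the following. This is a minimal sketch, not Trim Galore's own 'validate paired-ends' code: the function names, the `min_len_r2` threshold and its default of 20 are illustrative assumptions, and it assumes both files list the same reads in the same order with standard 4-line FastQ records.

```python
import gzip
from itertools import islice, zip_longest


def open_fastq(path, mode="rt"):
    """Open a plain or gzip-compressed FastQ file in text mode."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def read_records(handle):
    """Yield FastQ records as lists of 4 lines (header, sequence, '+', qualities)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FastQ record")
        yield record


def filter_pairs(r1_in, r2_in, r1_out, r2_out, min_len_r2=20):
    """Keep a pair only if the (already trimmed) Read 2 has >= min_len_r2 bases.

    min_len_r2 is a hypothetical threshold; pick whatever the downstream tool needs.
    Returns (pairs_kept, pairs_dropped).
    """
    kept = dropped = 0
    with open_fastq(r1_in) as f1, open_fastq(r2_in) as f2, \
            open_fastq(r1_out, "wt") as o1, open_fastq(r2_out, "wt") as o2:
        for rec1, rec2 in zip_longest(read_records(f1), read_records(f2)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 contain different numbers of records")
            # Line 2 of each record is the sequence; ignore the trailing newline.
            if len(rec2[1].rstrip("\r\n")) >= min_len_r2:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

The intended use would be to trim Read 2 alone with --length 0 so that no records are removed, then run `filter_pairs` on the untouched R1 and the trimmed R2 so that both output files stay in sync.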

golharam commented 4 years ago

It does. I'd like to keep this issue open. I may contribute back a PR to enable Trim Galore to work on 10X data.

golharam commented 4 years ago

Looking deeper into this, cutadapt has parameters to handle this, namely -p, -A and -G.
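
No exact command was posted in this thread, so the following is only a sketch of how those Cutadapt options might be combined, under stated assumptions. In Cutadapt, -G gives a 5' adapter for Read 2, -A gives a 3' adapter for Read 2, and -p names the Read 2 output; the helper name `build_cutadapt_cmd`, the poly-A length of 20 and the minimum length of 20 are illustrative choices, and none of this has been validated on 10X data.

```python
def build_cutadapt_cmd(r1_in, r2_in, r1_out, r2_out, tso,
                       polya_len=20, min_len=20):
    """Build a Cutadapt argument list that trims only Read 2 of a pair.

    Hypothetical helper: tso is the 5' adapter sequence expected at the start
    of Read 2; polya_len and min_len are assumed defaults, not recommendations.
    """
    return [
        "cutadapt",
        "-G", tso,                       # 5' adapter on Read 2
        "-A", f"A{{{polya_len}}}",       # 3' poly-A on Read 2, e.g. A{20}
        "-n", "2",                       # allow both adapters to be removed from a read
        "--minimum-length", str(min_len),  # pairs failing this are dropped together
        "-o", r1_out,                    # Read 1 output (no R1 adapters given)
        "-p", r2_out,                    # Read 2 output
        r1_in, r2_in,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Because no -a or -g option is given, Read 1 is left untrimmed, while any pair that fails the length filter is removed from both outputs.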

lcolladotor commented 3 years ago

Hi,

We are seeing some of the TSO adapter on Read 2 as well. According to https://teichlab.github.io/scg_lib_structs/methods_html/10xChromium3.html, that sequence is AAGCAGTGGTATCAACGCAGAGTACAT. I'm just curious if you could post the full command you used for trimming the TSO adapter from Read 2.

Thank you!

Best, Leo

FelixKrueger commented 3 years ago

I suppose this is probably one for @golharam to answer? Maybe you could even post it here for future reference? Best, Felix

lcolladotor commented 3 years ago

[Screenshot: FastQC report for Read 2, "Overrepresented sequences" section]

For reference, here you can see that sequence in the FastQC report for Read 2, in the "Overrepresented sequences" section.


Yes Felix, I was asking @golharam =) Thanks for the fast reply!

golharam commented 3 years ago

I don't remember the parameters I used. I ended up using Cutadapt directly, but I don't think that solved the issue either. In the end, a newer version of CellRanger trimmed adapters better than anything I tried, so we went with the newer version of CellRanger.