usadellab / Trimmomatic

Trimmomatic seems not trimming adapters for short reads (illumina GAIIx dataset) #10

Closed Hughes-Shin closed 3 years ago

Hughes-Shin commented 3 years ago

Hi, I'm trying to remove adapter sequences from FASTQ files generated on the Illumina GAIIx platform (small RNA-Seq dataset).

I used the adapter sequence "CTGTAGGCACCATCAATCGTATGCCGTCTTCTGCTTG" and a shorter version, "CTGTAGGCACCATCAA". Below is my command line for Trimmomatic (version 0.39):

java -jar trimmomatic-0.39.jar SE -threads 20 -phred33 ./{input}.fastq.gz ./{input}_trim.fastq.gz ILLUMINACLIP:./user_specified_adapter.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 AVGQUAL:30

After that, I do size selection with cutadapt (18~26-nt reads).

Results from the Trimmomatic step are below.

With the longer adapter sequence:
Input Reads: 22605850 Surviving: 19716093 (87.22%) Dropped: 2889757 (12.78%)

With the shorter adapter sequence:
Input Reads: 22605850 Surviving: 21460433 (94.93%) Dropped: 1145417 (5.07%)

Most of the surviving reads were 36-nt reads that still contained partial or full adapter sequences, and these were then removed by the cutadapt maximum-length option (-M 26). Data processed with the shorter adapter showed similar results (not shown). Cutadapt size selection on the Trimmomatic output (longer adapter):

Total reads processed: 19,716,093
Reads with adapters: 0 (0.0%)
Reads that were too short: 1,374,994 (7.0%)
Reads that were too long: 17,387,916 (88.2%) -> mostly ~36-nt reads containing adapters
Reads written (passing filters): 953,183 (4.8%)

When using cutadapt for both adapter trimming and size selection (18~26 nt):

Total reads processed: 22,605,850
Reads with adapters: 22,187,182 (98.1%)
Reads that were too short: 3,215,136 (14.2%)
Reads that were too long: 908,954 (4.0%)
Reads written (passing filters): 18,481,760 (81.8%)

When I used Trimmomatic for illumina 51-cycle single-end reads, there were no big differences between results from Trimmomatic and Cutadapt...

Input: 11,135,969
Trimmomatic (adapter trimming + AVGQUAL:30) + Cutadapt (size selection only): 8,456,739
Cutadapt (adapter trimming + size selection) + Trimmomatic (AVGQUAL:30 only): 8,605,374

How can I optimize Trimmomatic's options for adapter trimming of 36-cycle Illumina single-end reads?

TonyBolger commented 3 years ago

Can you attach the user_specified_adapter.fa file and a short FASTQ file with some reads which should get trimmed, but don't?

Hughes-Shin commented 3 years ago

The content of "user_specified_adapter.fa" is below :

"user_specified_adapter.fa"

user-specified-adapter_CTGTAGGCACCATCAATCGTATGCCGTCTTCTGCTTG
CTGTAGGCACCATCAATCGTATGCCGTCTTCTGCTTG
(end)

For the shorter version, I replaced the adapter sequence in this file with the shorter one.

Instead of a short FASTQ file, below is a summary parsed from Trimmomatic's trimlog (adapter clipping only, without the LEADING/TRAILING/SLIDINGWINDOW/AVGQUAL options). Maybe I should change

trimmed_bases  readcount
0              20283264
17             560759
18             339863
19             208494
20             174121
21             125735
22             114784
23             123354
24             136893
25             145448
26             151521
27             119849
28             64033
29             29629
30             12104
31             4495
32             2182
33             2007
34             1814
35             5501
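
For anyone who wants to reproduce a summary like the one above, here is a minimal sketch of how such a histogram could be built from a file written with Trimmomatic's `-trimlog` option. It assumes the documented five-field layout (read name, surviving length, first surviving base, last surviving base, bases trimmed from the end) separated by spaces. The function name and the three example lines are hypothetical and are not taken from this dataset.

```python
from collections import Counter


def trimmed_end_histogram(lines):
    """Count reads by the number of bases trimmed from the 3' end."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Read names may contain spaces, so peel the four numeric
        # fields off the right-hand side instead of splitting from the left.
        _name, _surviving_len, _first, _last, trimmed_end = line.rsplit(" ", 4)
        counts[int(trimmed_end)] += 1
    return counts


# Hypothetical trimlog lines for 36-nt reads (assumed format, made-up names).
example = [
    "read1 1:N:0 36 0 36 0",
    "read2 1:N:0 19 0 19 17",
    "read3 1:N:0 19 0 19 17",
]

hist = trimmed_end_histogram(example)
for n_trimmed in sorted(hist):
    print(n_trimmed, hist[n_trimmed])
```

For a real run, the list could be replaced by an open file handle, since the function only iterates over lines.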

GCAT01 commented 2 years ago

Not to resurrect a dead issue but I am having the same problem. I am trying to sequence short (50bp) RNASeq reads from a published paper. However, when I run Trimmomatic, it doesn't remove adapter sequences. It FINDS them, but doesn't remove them.

I've looked all over the internet for answers but most of the threads immediately suggest to use a different trimmer.

Hughes-Shin commented 2 years ago

Thanks for sharing your experience.

However, I have successfully trimmed adapters from 51-cycle RNA-Seq data containing small RNAs (10~30 nt). The case I described above was 36 cycles, and I eventually managed to trim it with Trimmomatic as well.

It has been a long time since I asked, so all I remember about the cause is that some Trimmomatic options need to be changed according to read length. You could try changing the numeric parameters of the "ILLUMINACLIP" option or of "SLIDINGWINDOW".
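
One possible reading of why read length matters, offered as a rough estimate rather than a confirmed diagnosis: the Trimmomatic manual describes each matching base as adding just over 0.6 (log10(4)) to the alignment score, so the third ILLUMINACLIP number (the simple clip threshold, 10 in the command above) implies a minimum number of adapter bases that must be present in the read. The sketch below works through that arithmetic assuming perfect matches only. The function names are hypothetical, and the thresholds other than 10 are shown purely for comparison.

```python
import math

# Score gained per matching base, as described in the Trimmomatic manual.
MATCH_SCORE = math.log10(4)  # about 0.602


def min_adapter_bases(simple_clip_threshold):
    """Fewest perfectly matching adapter bases needed to reach the threshold."""
    return math.ceil(simple_clip_threshold / MATCH_SCORE)


def longest_detectable_insert(read_length, simple_clip_threshold):
    """Longest insert that still leaves enough adapter bases in the read."""
    return read_length - min_adapter_bases(simple_clip_threshold)


for threshold in (10, 7, 5):
    print(
        "threshold", threshold,
        "min adapter bases", min_adapter_bases(threshold),
        "max insert at 36 cycles", longest_detectable_insert(36, threshold),
        "max insert at 51 cycles", longest_detectable_insert(51, threshold),
    )
```

Under these assumptions, a threshold of 10 needs 17 adapter bases, which matches the smallest non-zero value (17) in the trimlog summary earlier in this thread. In a 36-cycle read that leaves room for inserts of at most 19 nt, so most 18~26-nt small RNAs would keep their adapter, while a 51-cycle read still has enough adapter for inserts up to 34 nt. Lowering the threshold lets shorter adapter fragments be clipped, at the cost of more chance matches.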