Trimgalore seems to detect only a very low % of reads in the files that had adapters to trim (<40% of the reads)

dthirupp commented 5 months ago

Hi.

I have been using Trimgalore to trim adapters from my Nextera-XT libraries, which are sequenced in PE 150 mode. I normally use the default settings and trust the results, especially since the FastQC reports for many of the files look good. This time, however, I noticed that in the summary section of the Trimgalore output, "Reads with adapters" is around 36-40% of the total reads in the fastq file. I don't know if this is supposed to be the case, but I find it a little strange: I don't see how more than 50% of the reads in the file could have had no adapter in the first place (and if they didn't, how would they have been sequenced so many times?). I'm not sure if I'm misunderstanding some critical aspect of how the samples are sequenced, so apologies if this is to be expected. I have attached a screengrab of this summary for one of the samples:

command run: trim_galore -q 20 -fastqc -j 8

[Screenshot from 2024-03-07 17-55-18]

dthirupp commented 5 months ago

Okay, so a quick update, and I don't know why I didn't think of this first: I realized I have been using an old version of cutadapt (2.8) and of Trimgalore (0.6.4_dev). (I have a loop that runs this and have been using it ever since I made it a while ago!) Is it possible this could have anything to do with this issue? Thanks again!

FelixKrueger commented 5 months ago

I am not exactly sure I understand what you are worried about. Would you like your files to contain more adapter contamination than they actually have?

If you run FastQC on the data to start with and look at the adapter content plot, I would expect to see a fairly low amount of Nextera adapter. Can you confirm that?
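
A rough back-of-the-envelope check may help reconcile a 36-40% "Reads with adapters" figure with a low FastQC adapter curve. It assumes Trim Galore's default `--stringency` of 1 (a single base of overlap with the adapter counts as a hit), the 12 bp Nextera sequence `CTGTCTCTTATA`, uniformly random read ends, and exact matching only. The function name is made up for illustration, and the sum slightly over-counts, so treat it as an approximation rather than what cutadapt actually computes:

```python
# Sketch under stated assumptions: random 3' ends, exact matches only,
# adapter prefix of 12 bp (CTGTCTCTTATA), overlaps from min_overlap upward.
ADAPTER_LEN = 12  # length of the assumed Nextera adapter prefix


def chance_hit_fraction(min_overlap: int = 1, adapter_len: int = ADAPTER_LEN) -> float:
    """Approximate fraction of adapter-free reads whose last k bases happen
    to equal the first k adapter bases, for some k >= min_overlap.

    Each overlap length k matches with probability (1/4)**k; summing over k
    is a union bound, so the result is a slight over-estimate.
    """
    return sum(0.25 ** k for k in range(min_overlap, adapter_len + 1))


if __name__ == "__main__":
    for overlap in (1, 3, 5):
        pct = 100 * chance_hit_fraction(overlap)
        print(f"min overlap {overlap}: ~{pct:.1f}% of reads flagged by chance")
```

Under these assumptions, roughly a third of reads would be counted as "with adapter" purely by chance at an overlap of 1, and most of those hits remove only one or two bases. A stricter overlap shrinks the number quickly.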

dthirupp commented 5 months ago

Hi Felix.

I think I've just realized that I've been misunderstanding this process all along. All this while, I believed the sequencing would always sequence a portion of my adapter (the index as well as part of the region that is used to adhere the read to the flow cell), meaning that all successfully sequenced reads would have some adapter in them that needs to be trimmed (or at least, it helps to trim). I think now I realize that the sequencing at the 5' end starts after the adapter portion and, unless I have a short fragment, I should get little to no adapters from the 3' end in the sequenced reads.

Sorry I jumped the gun on this, but if it helps, maybe you can confirm if this understanding is correct. And yes you're right, my fastqc reports show a low level of adapter content (so it passes).

Thanks for your response.

Best, Deepan

FelixKrueger commented 5 months ago

That is exactly correct. So we can conclude this here; wishing you well for whatever comes next!
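
For anyone landing here later, the understanding confirmed in this thread can be sketched in a few lines: the read starts at the first base of the insert, and adapter sequence only shows up at the 3' end when the insert is shorter than the read length. This is a toy model, not how any tool is implemented; the function names are hypothetical and the adapter is assumed to be the 12 bp Nextera sequence `CTGTCTCTTATA`:

```python
# Toy model of adapter read-through (illustrative names, assumed adapter).
NEXTERA = "CTGTCTCTTATA"  # assumption: 12 bp Nextera adapter prefix


def adapter_bases_in_read(insert_len: int, read_len: int = 150,
                          adapter_len: int = len(NEXTERA)) -> int:
    """Number of adapter bases at the 3' end of a read.

    Zero when the insert is at least as long as the read; otherwise the
    read runs past the insert and into the adapter.
    """
    return min(adapter_len, max(0, read_len - insert_len))


def sequenced_read(insert: str, read_len: int, adapter: str = NEXTERA) -> str:
    """Read as reported: starts at the insert, continues into the adapter.

    Anything beyond the adapter is ignored in this sketch.
    """
    return (insert + adapter)[:read_len]


if __name__ == "__main__":
    print(adapter_bases_in_read(insert_len=300))  # long insert, no adapter
    print(adapter_bases_in_read(insert_len=145))  # short insert, read-through
    print(sequenced_read("ACGTACGT", read_len=10))
```

With PE 150 reads, only fragments shorter than 150 bp produce genuine read-through, which is why a low adapter content in FastQC is expected for a library with mostly longer inserts.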

FelixKrueger / TrimGalore

Trimgalore seems to detect only a very low % of reads in the files that had adapters to trim (<40% of the reads) #187