Open lucygarner opened 2 years ago
Is this correct? Do I want to add any further sequences?
PrefixPE/1 ACACTCTTTCCCTACACGACGCTCTTCCGATCT
PrefixPE/2 TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
The sequences NEB provides are essentially the same, swapped and in the opposite orientation. Trimmomatic does not detect adapter read-through by searching the reads for these specific sequences. Instead, it compares a 'prefixed' version of the forward read against a 'prefixed' version of the reverse read (with reverse complement applied). What you need to provide to Trimmomatic are these 'prefix sequences'.
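The "swapped and in the opposite orientation" relationship can be checked directly. Below is a minimal Python sketch using only the sequences quoted in this thread; the helper name `reverse_complement` is illustrative and not part of Trimmomatic.

```python
# Sequences copied from this thread.
PREFIX_PE1 = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
PREFIX_PE2 = "TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
NEB_READ1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"
NEB_READ2 = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# "Swapped": the NEB Read1 adaptor pairs with PrefixPE/2, and Read2 with PrefixPE/1.
print(reverse_complement(PREFIX_PE2) == NEB_READ1)
print(reverse_complement(PREFIX_PE1) == NEB_READ2)
```

For the sequences exactly as quoted here, both comparisons come out `True`, which is why the NEB sequences can be used in their prefix orientation without further modification.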
I'm not sure why the NEB sequences are one base shorter; perhaps their library prep creates a different base than normal at that position. If so, it might be marginally beneficial to shorten the prefix, as you suggest, but I would not expect a dramatic difference.
The additional sequences in the PE-2 file are only needed if some blunt ligation happened during library prep, which can occur if the library kit is degraded, e.g. lacking the Y structure or the A overhang. Clean libraries prepped with modern kits rarely have this issue, but sometimes people need to work with old data, so the sequences are still included.
Thank you for the thorough answer. Is there any harm in including the extra sequences in the adapter FASTA with new data?
My current adapter file is as follows:
PrefixPE/1 ACACTCTTTCCCTACACGACGCTCTTCCGATCT
PrefixPE/2 TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
PCR_Primer1 ACACTCTTTCCCTACACGACGCTCTTCCGATCT
PCR_Primer1_rc AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
PCR_Primer2 TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
PCR_Primer2_rc AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Does this seem reasonable? As a reminder, the sequences provided by NEB are as follows:
Adaptor Read1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Adaptor Read2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
@lucygarner Hello, I have exactly the same issue. Could you share how you solved it, or how you went ahead with the adapter trimming?
I used this for my adapter file in the end:
PrefixPE/1 ACACTCTTTCCCTACACGACGCTCTTCCGATCT
PrefixPE/2 TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
PCR_Primer1 ACACTCTTTCCCTACACGACGCTCTTCCGATCT
PCR_Primer1_rc AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
PCR_Primer2 TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
PCR_Primer2_rc AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
I hope it's correct!
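For anyone reproducing this, here is a sketch of writing that list out as a FASTA file that ILLUMINACLIP can read. The file name `NEBNext-PE.fa` is a hypothetical choice, and the thresholds in the trailing comment are the example values from the Trimmomatic manual, not values tuned for this data.

```python
from pathlib import Path

# Names and sequences exactly as listed in this thread.
ADAPTERS = {
    "PrefixPE/1": "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "PrefixPE/2": "TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT",
    "PCR_Primer1": "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "PCR_Primer1_rc": "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
    "PCR_Primer2": "TGACTGGAGTTCAGACGTGTGCTCTTCCGATCT",
    "PCR_Primer2_rc": "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
}


def to_fasta(records: dict) -> str:
    """Format name/sequence pairs as FASTA: a '>name' line, then the sequence."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


if __name__ == "__main__":
    # Hypothetical output file name; pick whatever suits your project.
    Path("NEBNext-PE.fa").write_text(to_fasta(ADAPTERS))
    # The file would then be passed to Trimmomatic's PE mode as, for example:
    #   ILLUMINACLIP:NEBNext-PE.fa:2:30:10
    # (seed mismatches : palindrome clip threshold : simple clip threshold)
```

The `PrefixPE/1` and `PrefixPE/2` names matter, since entries named that way are the ones used as the pair of prefix sequences described earlier in the thread; the remaining entries are matched as ordinary adapter sequences.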
@lucygarner Hi, thanks. I was only able to remove the adapters from the reverse reads, but not from the forward reads (maybe only 10% removed). Did you have similar issues?
I didn't actually have much adapter contamination, so it was hard to test thoroughly. I would be interested to get @TonyBolger's thoughts.
@Dahn-YoungDong, what adapter sequence did you have in the forward read? Is it the expected "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"?
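One way to answer that is to count how many forward reads contain the start of each candidate adapter. This is a rough sketch, assuming a FASTQ file with four lines per record. The function names, the 20-base probe length and the example file path are illustrative choices, and exact substring matching will miss adapters that contain sequencing errors. A probe longer than 13 bases is used because the two NEB sequences share their first 13 bases.

```python
import gzip
from itertools import islice

# Candidate adapters as quoted from NEB in this thread.
CANDIDATES = {
    "NEB Adaptor Read1": "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
    "NEB Adaptor Read2": "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
}


def count_adapter_reads(fastq_lines, adapters, probe_len=20):
    """Count reads containing the first `probe_len` bases of each adapter.

    `fastq_lines` is any iterable of FASTQ lines (four lines per record).
    Returns (total_reads, {adapter_name: reads_with_exact_match}).
    """
    counts = {name: 0 for name in adapters}
    total = 0
    # The sequence is the 2nd line of every 4-line FASTQ record.
    for line in islice(fastq_lines, 1, None, 4):
        total += 1
        seq = line.strip()
        for name, adapter in adapters.items():
            if adapter[:probe_len] in seq:
                counts[name] += 1
    return total, counts


def open_fastq(path):
    """Open a plain or gzipped FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


# Example usage (hypothetical file name):
# with open_fastq("sample_R1.fastq.gz") as handle:
#     total, counts = count_adapter_reads(handle, CANDIDATES)
#     print(total, counts)
```

Running this on the forward reads before and after trimming gives a quick, if approximate, picture of which sequence is actually present and how much of it was removed.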
Hi,
My issue is similar to that in #14, but I am still a bit confused about this.
According to NEB, the sequences that I need to trim off are:
Adaptor Read1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Adaptor Read2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
https://international.neb.com/faqs/2021/01/15/what-sequences-need-to-be-trimmed-for-nebnext-libraries-that-are-sequenced-on-an-illumina-instrument
However, based on the discussion in #14, it looks like these would not correspond to PrefixPE/1 and PrefixPE/2 as I thought. From my understanding, the sequences provided by NEB are those that are likely to contaminate Read 1 and Read 2, respectively, due to read through. How and why do these sequences need to be modified for use with Trimmomatic? Please could you supply the correct sequences to use.
Also, I see that in the TruSeq3-PE-2.fa file, you supply some additional sequences e.g. PE1 and PE1_rc - why are these added?
Many thanks, Lucy