DMU-lilab / pTrimmer

Used to trim off the primer sequence from multiplex amplicon sequencing
GNU General Public License v3.0

pTrimmer seems to miss primers badly in some cases. #23

Closed yangjw1996 closed 5 months ago

yangjw1996 commented 1 year ago

Hi, xiaolong. I have been using pTrimmer in my current amplicon sequencing workflow. It is very fast compared with other tools and fits amplicon sequencing well.

After a period of use, I have found that pTrimmer sometimes misses primers badly. But if I change the 'Insert Size' or extend the primer a little, pTrimmer then works correctly.

Here is an example to explain the problem more clearly (the data are pasted at the end). In this example, pTrimmer cannot recognize the reverse primer even though I changed the parameter 'K' several times. But if I change the 'Insert Size' from 130 to 132, pTrimmer successfully recognizes the reverse primer.

I have tried pTrimmer on hundreds of primers. The problem above occurs from time to time, in both PE and SE data. Although I can fix it by changing the 'Insert Size' or extending the primer, I cannot understand the reason or see the pattern.

Could you share your ideas on this issue? Thanks a lot.


pTrimmer version <1.3.4> should be the newest

fastq-read1 <
@A00199:698:H5C5CDSX5:1:1341:8440:19179 1:N:0:CCGCGGTT+CTAGCGCT
CACTTTCCACACAGACATCATAACAAAAAATTTCCACCAAACCCCCCCCCCCCCCCCCCGCGGGGCCCACCCCCATAAAACAGAGCAGCCCGACCCCCAAAAAAAAAAAAACCAAAAACACCCCAACCCACCATCAAATAGTGTTGTTTT
+
FFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF,F:,,:FFFFF,,:F:FF:F,,,,,,,,,,,,:,F,,,,FF,,,,,,,,,,,,,,,,:,:,,FF,,:F,:,,,,,,,,,,,,,::,,,,,,,,,,,,,F,,,,,,,,,:,,F
@A00199:698:H5C5CDSX5:1:1342:13783:25645 1:N:0:CCGCGGTT+CTAGCGCT
CACTTTCCACACAGACATCATAACAAAAAATTTCCACCAAACCCCCCCCCCCCCCCCCCACGGGGCCCCACCCAAAAAAAAAAAAAAAACAAAAACCCAAAAAAAAAAAAACAAAAAAAAAACAAAAAAAAAAAAAAAAAATATAATTTT
+
FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFF:F:F,:F,FF,F::FFFF,,F:FFF:F,,,,,,,F,:,,,,,,,,F,:,F::,:,,,,,,,,,,,,,,,FF,F:F,FFFF,,,:F,:,F,,,,,:,,,,:,:,,:F:F,:,F,,,,,,F
@A00199:698:H5C5CDSX5:1:1342:29866:9627 1:N:0:CCGCGGTT+CTAGCGCT
CACTTTCCACACAGACATCATAACAAAAAATTTCCACCAAACGCCCCCCGGCCCCCCCCGGGGGGGCCACCCCCGGAACAAAGAACAGGCCCAACCCCAAAAAAAAGAAAACCAAAAACCCCCACAACACAGGGAGACAGTTTGAGTTTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF,:,,,,:FFF:,,F:F:FF:,:,,,:,,,,,,,,F,,,,,,,,,,,,,,,,,,,,,,,F:,,F:,:,F,,,,,F,,,,,,::,::,,,,,,,,,,,,,:,,,,:,,,,,,,,
@A00199:698:H5C5CDSX5:1:1342:3531:16783 1:N:0:CCGCGGTT+CTAGCGCT
CACTTTCCACACAGACATCATAACAAAAAATTTCCACCAAACCCCCCCCGCCCCCCCCCGCGGGCCCCCCCCCACAAAAAACACAAAAGCACAAACCACAAAAAAAAAAACCCAACAACCCCCACACCCCCAATTAAAAAATAAAAATTT
+
FFFFFFFFFFFFFF:FFFFFFF:FFFF,FFF:FFF:FFF,,:,F,:F:F,,:FFFF,,,,,,,,,,::,,,,,,,:,:,,,,:,,,,,,,:,:,::F,,,,,:,,F:F:F:,,,F,F:F,F,:,,FF,,:F:F,,,:F,,,,,,,,,,,,
@A00199:698:H5C5CDSX5:1:1342:8015:27618 1:N:0:CCGCGGTT+CGAGCGCT
CACTTTCCACACAGACATCATAACAAAAAATTTCCACCAAACCCCCCCCCCCCCCCCCCGGGGGGCACCGCCCCAGAACCAAAAAAAGGACAAAACCCCAAAAAAAAAAACCCCAAAACCCACCGAAACCAGATAAAAATTTATGTTTTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF,FF:F,,,::FFFF,,,FFFF:F,,,,,:,,,,,,,,:,,,,,,,,:,,:,,,,,,,,,,:F:,,,,,F:F,,:,,,,,:F:,,,,,F,,,:,,,,,,,::,F:,,,,F,,,,,:

fastq-read2 <
@A00199:698:H5C5CDSX5:1:1341:8440:19179 2:N:0:CCGCGGTT+CTAGCGCT
GGGGTGACTGTTAAAAGTGCATACCGCCAAAAGATAAAATTTGAAATCTGGTTAGGCTGGTGTTAGGGTTCTTTGTTTTTGGGGTTTGGCAGAGATGTGTTTAAGTGCTGTGGCCAGAAGCGGGGGGAGGGGGGGGTTTGGTGGAAATTT
+
FFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF:,F:FFFFFFFFFFFFFF:FFFFFF:FFFF,FFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFF::F,::F:FFFFFFFF
@A00199:698:H5C5CDSX5:1:1342:13783:25645 2:N:0:CCGCGGTT+CTAGCGCT
GGGGTGACTGTTAAAAGTGCATACCGCCAAAAGATAAAATTTGAAATCTGGTTAGGCTGGTGTTAGGGTTCTTTGTTTTTGGGGTTTGGCAGAGATGTGTTTAAGTGCTGTGGCCAGAAGCGGGGGGAGGGGGGGGTTTGGTGGAAATTT
+
FFF:FFF,FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF::FFFFF,:FFFFF,FF:F,FF:F:,FFFFFFFFF:F:FFF:FF:FFFFFFFFFFF:F:FFFFFFFFFFFFFF:F,F,FFFFFFFF:
@A00199:698:H5C5CDSX5:1:1342:29866:9627 2:N:0:CCGCGGTT+CTAGCGCT
AGGGTGACTGTTAAAAGTGCATACCGCCAAAAGATAAAATTTGAAATCTGGTTAGGCTGGTGTTAGGGTTCTTTGTTTTTGGGGTTTGGCAGAGATGTGTTTAAGTGCTGTGGCCAGAAGCGGGGGGAGGGGGGGGTTTGGTGGAAATTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFF:FFFFFFF,FFFF
@A00199:698:H5C5CDSX5:1:1342:3531:16783 2:N:0:CCGCGGTT+CTAGCGCT
GGGGTGACTGTTAAAAGTGCATACCGCCAAAAGATAAAATTTGAAATCTGGTTAGGCTGGTGTTAGGGTTCTTTGTTTTTGGGGTTTGGCAGAGATGTGTTTAAGTGCTGTGGCCAGAAGCGGGGGGAGGGGGGGGTTTGGTGGAAATTT
+
FF::FFFFFFFFFFFFFF:FFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFFFFFFF:
@A00199:698:H5C5CDSX5:1:1342:8015:27618 2:N:0:CCGCGGTT+CGAGCGCT
GGGGTGACTGTTAAAAGTGCATACCGCCAAAAGATAAAATTTGAAATCTGGTTAGGCTGGTGTTAGGGTTCTTTGTTTTTGGGGTTTGGCAGAGATGTGTTTAAGTGCTGTGGCCAGAAGCGGGGGGAGGGGGGGGTTTGGTGGAAATTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFF:FFFFFF:FFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFF:F,FFFF

primers to trim < CACTTTCCACACAGACATCATAAC GGGGTGACTGTTAAAAGTGCA 130 H1-8

XLZH commented 1 year ago

@yangjw1996 There is a similar answer that you can refer to (https://github.com/DMU-lilab/pTrimmer/issues/22#issuecomment-1375673529).

I checked the reads you posted here and guess that your amplicon is in the 'normal condition' (see the README). Therefore, if you increase the 'Insert-size', pTrimmer will remove the forward primer correctly and skip checking for the reverse complement of the reverse primer (because the read is not in the 'read-through' condition).

In short, I suggest using the Insert-size given by the primer design software. Of course, if you are confident that your reads are not read-through, you can set it to a large value (e.g. 512, which is greater than the read length of 150). Conversely, if you are confident that your reads are read-through, you can set it to a very small value (e.g. 5).
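
As a rough way to judge which condition a read is in, one could search read 1 for the reverse complement of the reverse primer: if it is present, the read has run through the whole insert. The sketch below is only a stand-alone diagnostic under that assumption and is not pTrimmer's internal logic; the function names and the mismatch threshold are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def find_with_mismatches(read: str, pattern: str, max_mismatches: int = 0) -> int:
    """Return the first 0-based offset where `pattern` aligns to `read`
    with at most `max_mismatches` substitutions, or -1 if there is none.
    Insertions and deletions are not considered."""
    for start in range(len(read) - len(pattern) + 1):
        window = read[start:start + len(pattern)]
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= max_mismatches:
            return start
    return -1


def looks_read_through(read1: str, reverse_primer: str, max_mismatches: int = 2) -> bool:
    """Heuristic (an assumption, not pTrimmer's rule): read 1 is treated as
    read-through if it contains the reverse complement of the reverse primer."""
    target = reverse_complement(reverse_primer)
    return find_with_mismatches(read1, target, max_mismatches) != -1


if __name__ == "__main__":
    # Reverse primer taken from the primer line posted in this issue.
    reverse_primer = "GGGGTGACTGTTAAAAGTGCA"
    print(reverse_complement(reverse_primer))
```

Running `looks_read_through` over a sample of read 1 sequences gives a quick impression of whether most reads reach the reverse primer, which can help when deciding between a realistic Insert-size and a deliberately large or small one.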

yangjw1996 commented 1 year ago

Thank you for explaining clearly!