Closed donutbrew closed 4 years ago
Hi @donutbrew
I am sorry for the trouble caused by pTrimmer!
Could you please provide some fastq reads (and the corresponding primer file) to reproduce the problem? Thanks ~
Best wishes, Xiaolong
Primer file: https://gist.github.com/donutbrew/41d1acce6876cf6cf15382f959d7f1ad
Truncated reads (I get the same result with the full-length files): https://gist.github.com/donutbrew/311f25df80fcf059146cf3b02610b591
I built it with gcc/9.2.0 if that turns out to be important...
I found that the reads in your fastq file are 301 bp long, which is what causes the core dump.
The easiest fix is to change the FQLINE value in fastq.h from 256 to 512 and recompile the code.
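To choose a safe value before recompiling, it can help to measure the longest read in the input. The following is a minimal sketch, not part of pTrimmer: the function names are hypothetical, it assumes an uncompressed FASTQ with four lines per record, and it assumes (the thread does not confirm this) that FQLINE is a per-line buffer size that must also hold a newline and a terminating NUL byte.

```python
import io


def max_read_length(fastq_handle):
    """Return the longest sequence line in a 4-line-per-record FASTQ stream."""
    longest = 0
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of each record is the sequence
            longest = max(longest, len(line.rstrip("\r\n")))
    return longest


def fits_in_buffer(read_length, buffer_size):
    """Assumption: the buffer holds the read plus a newline and a NUL byte."""
    return read_length + 2 <= buffer_size


# Tiny in-memory FASTQ used only for illustration.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTACG\n+\nIIIIIII\n")
print(max_read_length(demo))
print(fits_in_buffer(301, 256), fits_in_buffer(301, 512))
```

Under these assumptions a 301-bp read does not fit in 256 bytes but does fit in 512. For a real file, pass an open file handle instead of the in-memory demo.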
However, when I tested your fastq and primer files, pTrimmer reported a very low read-trimming ratio (6.5%):
$ ./pTrimmer-1.3.3 -t pair -a CDC_SC2_200710.txt -f sample1_R1.fastq -d Trim_R1.fq -r sample1_R2.fastq -e Trim_R2.fq -q 25
[*] Processing the [thread: 1] ...
[*] Processing the [thread: 2] ...
Total time consume: 0.0(s)
----------------- Summary ------------------------
Total reads processed: 200
Reads have bad primer: 180
Reads have bad quality: 7
Reads successfully trimed and have good quality: 6.50 %
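The 6.50 % figure is consistent with the counts in the summary if the two "bad" categories do not overlap. That disjointness is an assumption; the thread does not say how pTrimmer counts a read that fails both checks. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def trimmed_ratio(total_reads, bad_primer, bad_quality):
    """Return (reads kept, percent kept), assuming the bad categories are disjoint."""
    kept = total_reads - bad_primer - bad_quality
    return kept, 100.0 * kept / total_reads


kept, percent = trimmed_ratio(200, 180, 7)
print(f"{kept} reads kept, {percent:.2f} %")
```

With the numbers above, 13 of 200 reads remain, which matches the reported ratio.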
I then compared your fastq reads with the corresponding primers and found that most of your reads do not start with the primer sequence, for example:
@M04500:102:000000000-J6F5V:1:1101:19169:2004 1:N:0:1
TACTACCACACAACTGATCCTAGTTTTCTGGGTAGGTACATGTCAGCATTAAATCACACTAAAAAGTGGAAATACCCA ... (your read)
TGGCTACTACCGAAGAGCTACC (primer sequence)
@M04500:102:000000000-J6F5V:1:1101:10091:1954 1:N:0:1
GTAGTGGAAAATCCTACCATACAGAAAGACGTTCTTGAGTGTAACTGTCTCTTATACACATCTCCGAGCCCACGAG ... (your read)
CTGAAGAAGTAGTGGAAAATCCTACCA (primer sequence)
As we know, reads from target/amplicon sequencing normally start at the first base of the primer sequence, as in these examples:
@M03970:332:000000000-J2CK5:1:1101:16151:2887 1:N:0:15
GTCCAGCTTTGTGCCAGGAGCCTCGCAGGGGTTGATGGGATTGGGGTTTTCCCCTCCCATGTGCTCAAGACTGGCGCTAAAAGTTTTGAGCTTCTCAAAAGTCTAGA ... (read)
GTCCAGCTTTGTGCCAGGAG (primer sequence)
@M03970:332:000000000-J2CK5:1:1101:21721:3033 2:N:0:15
AGCCCGAACGCAAAGTGTCCCCGGAGCCCAGCAGCTACCTGCTCCCTGGACGGTGGCTCTAGACTTTTGAGAAGCTCAAAACTTTTAGCGCCAGTCTTGAGCACATG ... (read)
AGCCCGAACGCAAAGTGT (primer sequence)
Therefore, I see two possible explanations for your reads:
(1) your sequencing strategy is not target/amplicon sequencing
(2) part of the primer sequence was wrongly trimmed from the beginning of your reads
I suggest you check your reads against the corresponding primers to find out why the reads start differently from the primer sequences.
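One way to run that check is sketched below. It is not how pTrimmer matches primers; the function name and the `min_overlap` parameter are hypothetical, and it uses exact string matching only, so sequencing errors and mismatches are not tolerated. The sequences are copied from the truncated examples quoted earlier in this thread.

```python
def missing_primer_bases(read, primer, min_overlap=10):
    """Return k if the read starts with primer[k:], otherwise None.

    k == 0 means the read starts at the first primer base. k > 0 means the
    first k primer bases are absent from the start of the read. At least
    min_overlap primer bases must match exactly.
    """
    for k in range(len(primer) - min_overlap + 1):
        if read.startswith(primer[k:]):
            return k
    return None


# Amplicon read that starts with the full primer.
print(missing_primer_bases("GTCCAGCTTTGTGCCAGGAGCCTCGCAGGGG",
                           "GTCCAGCTTTGTGCCAGGAG"))
# Read that starts partway into its primer.
print(missing_primer_bases("GTAGTGGAAAATCCTACCATACAGAAAGACG",
                           "CTGAAGAAGTAGTGGAAAATCCTACCA"))
# Read with no exact primer overlap at its start.
print(missing_primer_bases("TACTACCACACAACTGATCCTAGTTTTCTGG",
                           "TGGCTACTACCGAAGAGCTACC"))
```

For these three pairs the sketch reports 0, 8, and None: the amplicon read begins at the first primer base, the second read begins 8 bases into its primer, and the third shows no exact overlap of 10 or more bases.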
@donutbrew If you still have questions about the use of pTrimmer, please feel free to contact me.
@XLZH Thanks for the solution. I appreciate the good comments in the source--too bad I didn't read them!
And yeah, I passed you reads from amplicons that had been fragmented prior to library prep, so what you saw makes sense.
Hi - after building pTrimmer, I can't seem to get it to work--it drops with a Segmentation Fault right away.
Here's my command, using fastq files generated by a MiSeq:
I've also tried with non-gzipped fastqs, but I get the same result. I'm sorry I can't provide too many details, but this is all I have. Let me know how I can help.