Closed by schmeing 8 years ago
I'll download that data set and take a look at it.
Thanks for letting me know.
I think I've fixed the bug. Can you pull the new version and give it a try?
Thanks.
Works fine for me now. Awesome how fast you fixed it. Thanks
Hi,
I tried to correct the publicly available SRR001665 dataset with lighter using the following command:
nice -10 lighter -r ../SRR001665_1.fastq.gz -r ../SRR001665_2.fastq.gz -k 13 4600000 0.04 -t 64 -od k13/ 2>&1 | tee k13/lighter.log
The correction runs through without problems, but the resulting fastq files contain 25 and 42 unintentionally trimmed sequences, respectively, like this one:
@SRR001665.72513 071112_SLXA-EAS1_s_4:1:6:808:233 length=36 cor badprefix=7 ak
GCGTGCCGAAGTTAGTGGGCCTGGAGAATC
+
IIIIIIIIIIIIIIIIII3?I/%.IIII_IIC4I'
All 36 quality scores are still present, but the last bases of the sequence (6 in this case) have been trimmed.
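Records like this can be found by comparing the length of the sequence line with the length of the quality line in every record. The following is a minimal sketch, assuming standard four-line FASTQ records (no wrapped sequences); the function names `find_length_mismatches` and `scan_fastq` are made up for illustration and are not part of lighter.

```python
import gzip
from itertools import islice


def find_length_mismatches(lines):
    """Yield (header, seq_len, qual_len) for every 4-line FASTQ record
    whose sequence and quality strings differ in length."""
    it = iter(lines)
    while True:
        # Read the next record: header, sequence, separator, qualities.
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if len(record) < 4:
            break  # end of input (or an incomplete trailing record)
        header, seq, _separator, qual = record
        if len(seq) != len(qual):
            yield header, len(seq), len(qual)


def scan_fastq(path):
    """Return all mismatching records of a plain or gzip-compressed FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return list(find_length_mismatches(handle))
```

Calling `scan_fastq` on each corrected file in the output directory (the exact file names depend on the run) lists the headers of the affected reads together with both lengths, which makes it easy to confirm the counts before and after a fix.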
The output is:
[2016-03-22 16:44:22] =============Start====================
[2016-03-22 16:44:24] Bad quality threshold is "&"
[2016-03-22 16:45:33] Finish sampling kmers
[2016-03-22 16:45:33] Bloom filter A's false positive rate: 0.001899
[2016-03-22 16:47:13] Finish storing trusted kmers
[2016-03-22 16:52:13] Finish error correction
Processed 20816448 reads:
18328409 are error-free
Corrected 3617298 bases(1.453875 corrections for reads with errors)
Trimmed 0 reads with average trimmed bases 0.000000
Discard 0 reads