Open tseemann opened 8 years ago
I tested again on my data sets and could not trigger the bug you hit. Is there a way for me to access the data set you used? If not, can you show me the correction summary that Lighter prints to the screen? Thanks.
I found the issue. If you compile with the default -O2 option it works. In Linuxbrew, I used the system CXXFLAGS, which sets -Os (size optimization), which causes the bug!
CC: @sjackman
See the output messages below:
Files R1.fq.gz
Reads 3747457
This is g++ -O2 (which works):
$ ./lighter-1.1.1-O2 -od 1.1.1-O2 -r R1.fq.gz -r R2.fq.gz -K 32 4000000 -t 72 -maxcor 2
[2016-08-17 00:11:57] =============Start====================
[2016-08-17 00:11:57] Scanning the input files to infer alpha(sampling rate)
[2016-08-17 00:12:04] Average coverage is 141.346 and alpha is 0.050
[2016-08-17 00:12:05] Bad quality threshold is "B"
[2016-08-17 00:12:15] Finish sampling kmers
[2016-08-17 00:12:15] Bloom filter A's false positive rate: 0.006326
[2016-08-17 00:12:24] Finish storing trusted kmers
[2016-08-17 00:12:56] Finish error correction
Processed 7494914 reads:
7042749 are error-free
Corrected 579197 bases(1.280942 corrections for reads with errors)
Trimmed 0 reads with average trimmed bases 0.000000
Discard 0 reads
This is g++ -Os, with missing reads!
$ ./lighter-1.1.1-Os -od 1.1.1-Os -r R1.fq.gz -r R2.fq.gz -K 32 4000000 -t 72 -maxcor 2
[2016-08-17 00:13:38] =============Start====================
[2016-08-17 00:13:38] Scanning the input files to infer alpha(sampling rate)
[2016-08-17 00:13:46] Average coverage is 141.346 and alpha is 0.050
[2016-08-17 00:13:47] Bad quality threshold is "B"
[2016-08-17 00:13:57] Finish sampling kmers
[2016-08-17 00:13:57] Bloom filter A's false positive rate: 0.006326
[2016-08-17 00:14:06] Finish storing trusted kmers
[2016-08-17 00:14:32] Finish error correction
Processed 5022995 reads:
4719925 are error-free
Corrected 388478 bases(1.281809 corrections for reads with errors)
Trimmed 0 reads with average trimmed bases 0.000000
Discard 0 reads
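To make the gap between the two builds explicit, here is a minimal Python sketch that pulls the "Processed N reads" figure out of each summary and compares it with the expected total. It assumes R2.fq.gz holds the same number of reads as R1.fq.gz (3747457); the helper name `processed_reads` is made up for illustration and is not part of Lighter.

```python
import re

def processed_reads(log_text):
    """Return the read count from Lighter's 'Processed N reads:' summary line."""
    match = re.search(r"Processed (\d+) reads", log_text)
    if match is None:
        raise ValueError("no 'Processed N reads' line found")
    return int(match.group(1))

reads_per_file = 3747457       # R1.fq.gz count reported in this thread
expected = 2 * reads_per_file  # assumption: R2.fq.gz has the same number of reads

# Summary lines copied from the two runs above.
o2_log = "Processed 7494914 reads:\n7042749 are error-free\n"
os_log = "Processed 5022995 reads:\n4719925 are error-free\n"

for label, log in (("-O2", o2_log), ("-Os", os_log)):
    n = processed_reads(log)
    print(f"{label}: processed {n}, missing {expected - n}")
```

Under that assumption the -O2 build accounts for every input read, while the -Os build drops 2471919 of them.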
Ping @mourisl - any ideas?
As a workaround you can use ENV.O2 in the formula to use -O2 rather than the default -Os.
Today I upgraded from lighter 1.0.7 to 1.1.1. I first noticed a problem when 1.1.1 wrote a different number of reads to each of the two output files, and then noticed that it was also passing far fewer reads overall.
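For anyone who wants to check their own output for the same symptom, a rough Python sketch for counting reads in a pair of gzipped FASTQ files follows. It assumes plain four-line FASTQ records. The file names and the `write_demo_fastq` helper are hypothetical and only generate tiny synthetic data so the example runs on its own; point `count_fastq_reads` at your real output files instead.

```python
import gzip
import os
import tempfile

def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        lines = sum(1 for _ in handle)
    if lines % 4:
        raise ValueError(f"{path}: {lines} lines is not a multiple of 4")
    return lines // 4

def write_demo_fastq(path, n_reads):
    """Write a tiny synthetic FASTQ file (demo data only)."""
    with gzip.open(path, "wt") as handle:
        for i in range(n_reads):
            handle.write(f"@read{i}\nACGT\n+\nIIII\n")

with tempfile.TemporaryDirectory() as tmp:
    r1 = os.path.join(tmp, "demo_R1.fq.gz")  # hypothetical file names
    r2 = os.path.join(tmp, "demo_R2.fq.gz")
    write_demo_fastq(r1, 5)
    write_demo_fastq(r2, 3)  # deliberately mismatched to mimic the symptom
    n1, n2 = count_fastq_reads(r1), count_fastq_reads(r2)
    print(f"R1={n1} R2={n2} paired={'yes' if n1 == n2 else 'NO'}")
```

A mismatch between the two counts, or a total well below the input count, is the behaviour described in this report.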
This is the command line:
This is the difference in read counts:
Any ideas?