mourisl / Lighter

Fast and memory-efficient sequencing error corrector
GNU General Public License v3.0

invalid fastq file not being detected #10

Open vcepeda opened 9 years ago

vcepeda commented 9 years ago

It seems like Lighter just checks if the first character of the file is '@' to consider it valid. For example, this input file (test.fq) was considered correct:

@IRIS:7:1:17:394#0/1
GTCAGGACAAGAAAGACAANTCCAATTNACATTATG
IRIS:7:1:17:394#0/1
aaabaa]baaaaa_aab]D^^baYDW]abaa^
IRIS:7:1:17:800#0/1
GGAAACACTACTTAGGCTTATAAGATCNGGTTGCGG
IRIS:7:1:17:800#0/1
ababbaaabaaaaa]ba]aaaaYD_aXT
IRIS:7:1:17:1757#0/1
TTTTCTCGACGATTTCCACTCCTGGTCNACGAATCC
IRIS:7:1:17:1757#0/1
aaaaaaaaa`aaaa_^a```]][Z[DY^XYV^_Y
ksdsada
dada
e
e
5

output:

./lighter -r test.fq -k 17 5000000 0.1 -t 10
[2014-12-08 11:35:35] =============Start====================
[2014-12-08 11:35:35] Bad quality threshold is "T"
[2014-12-08 11:35:36] Finish sampling kmers
[2014-12-08 11:35:36] Bloom filter A's error rate: 0.000000
[2014-12-08 11:35:36] Finish storing trusted kmers
[2014-12-08 11:35:36] Finish error correction
Processed 5 reads:
0 are error-free
Corrected 0 bases(0.000000 corrections for reads with errors)
Trimmed 0 reads with average trimmed bases 0.000000
Discard 0 reads
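
To illustrate the kind of per-record check that would have rejected this file, here is a minimal sketch in Python. It is not Lighter's code: the function name `fastq_errors` is hypothetical, and it assumes the common four-line FASTQ layout (header starting with '@', sequence, separator starting with '+', quality string of the same length as the sequence), with no line-wrapped sequences.

```python
from typing import Iterable, List


def fastq_errors(lines: Iterable[str]) -> List[str]:
    """Return format problems found in four-line FASTQ records (hypothetical helper)."""
    rows = [ln.rstrip("\r\n") for ln in lines]
    # Ignore blank lines at the very end of the file only.
    while rows and rows[-1] == "":
        rows.pop()

    errors = []
    if len(rows) % 4 != 0:
        errors.append(f"line count {len(rows)} is not a multiple of 4")

    # Check every complete four-line record, not just the first character of the file.
    complete = len(rows) - len(rows) % 4
    for start in range(0, complete, 4):
        header, seq, sep, qual = rows[start:start + 4]
        n = start + 1  # 1-based line number of the record's header line
        if not header.startswith("@"):
            errors.append(f"line {n}: header does not start with '@'")
        if not sep.startswith("+"):
            errors.append(f"line {n + 2}: separator does not start with '+'")
        if len(seq) != len(qual):
            errors.append(
                f"line {n + 3}: quality length {len(qual)} "
                f"differs from sequence length {len(seq)}"
            )
    return errors
```

Under these assumptions, something like `fastq_errors(open("test.fq"))` would return a non-empty list for the file above, since its later records lack the '@' and '+' markers and the quality strings are shorter than the sequences.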