Open klmr opened 7 years ago
This is due to the way the auto-detection algorithm works - it only recognizes FASTQ format once it has read the first four lines. Until then it will be operating in text mode, which means it will color any string of DNA that it finds.
I could fix this by reading the first four lines into a buffer and only outputting them once I know what the format is. However, that might lead to very high memory usage if the lines are very long. It would also mean that the first three lines would only be written to the screen once a fourth line has been read, which might cause weird behavior in some edge cases.
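To make the trade-off concrete, here is a minimal sketch of the buffering idea. It is not dnacol's actual code: the names `looks_like_fastq` and `detect_then_emit` and the detection heuristic are assumptions made for illustration only.

```python
import itertools
import re

# Assumed heuristic: a sequence line consists only of DNA letters.
DNA_RE = re.compile(r"^[ACGTNacgtn]+$")


def looks_like_fastq(lines):
    """Guess whether the first four lines form one FASTQ record."""
    if len(lines) < 4:
        return False
    header, seq, sep, qual = (line.rstrip("\n") for line in lines[:4])
    return (
        header.startswith("@")
        and bool(DNA_RE.match(seq))
        and sep.startswith("+")
        and len(qual) == len(seq)
    )


def detect_then_emit(stream):
    """Hold back the first four lines, decide the format, then yield
    (format, line) pairs for every line, including the buffered ones."""
    it = iter(stream)
    head = list(itertools.islice(it, 4))  # whole lines are held in memory here
    fmt = "fastq" if looks_like_fastq(head) else "text"
    for line in itertools.chain(head, it):
        yield fmt, line
```

Nothing is yielded until four lines have arrived (or the input ends), which is the delayed-output behavior described above, and the buffer grows with the length of those lines.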
So I think for the moment it might be best to leave this as it is, but I'll keep the issue open for now. You can always avoid this by specifying --format=fastq.
In the following screenshot, produced by gunzip -c file.fastq.gz | dnacol, the first ID line contains highlighted fragments (the N just before the end, as well as the barcode). The subsequent ID lines, by contrast, aren’t highlighted. Why is that? (dnacol 0.3.2)