Closed by nikostr 5 months ago
Hi, thanks! I was able to reproduce this (which is good). I took a look at individual_100, and I think the reason is that the input fastq files are broken. KMC detects some issues with fastq files, but definitely not all of them.
For example, the individual_100_R1.fastq file is broken:
So there is no read for the header @individual_100.426381 426381 length=151 at line 899 (it is also weird that the same header occurs multiple times, but that shouldn't matter). Let me know if this helps.
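A quick way to catch this kind of breakage before running KMC is to walk the file four lines at a time and stop at the first record that does not fit. The sketch below is not KMC's own check. It assumes the common four-line fastq layout (header, sequence, "+", quality), and the function name first_fastq_problem and the sample reads are made up for illustration.

```python
import io
from typing import Optional, TextIO, Tuple


def first_fastq_problem(handle: TextIO) -> Optional[Tuple[int, str]]:
    """Return (line_number, message) for the first malformed record, or None.

    Assumes four-line records; multi-line fastq is not handled here.
    """
    line_no = 0
    while True:
        # readline() returns "" only at end of file, so "" marks a missing line.
        record = [handle.readline() for _ in range(4)]
        if record[0] == "":
            return None  # clean end of file
        header, seq, plus, qual = (line.rstrip("\r\n") for line in record)
        start = line_no + 1  # 1-based line number of this record's header
        line_no += 4
        if not header.startswith("@"):
            return start, "expected a header starting with '@'"
        if seq.startswith("@"):
            # A sequence cannot start with '@', so this is a second header.
            return start, "header is not followed by a read"
        if "" in record[1:]:
            return start, "record is truncated"
        if not plus.startswith("+"):
            return start, "missing '+' separator line"
        if len(seq) != len(qual):
            return start, "sequence and quality lengths differ"


# Hypothetical sample: the header on line 5 has no read after it.
BROKEN = "@read1\nACGT\n+\nIIII\n@read2\n@read3\nGGCC\n+\nIIII\n"

print(first_fastq_problem(io.StringIO(BROKEN)))
# prints (5, 'header is not followed by a read')
```

For a real file, pass an open handle instead, e.g. `with open("individual_100_R1.fastq") as fh: print(first_fastq_problem(fh))`. The check goes by position rather than scanning for '@', because quality lines may legitimately start with '@'.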
Thank you for looking closer at this, and sorry for not verifying the fastq-files prior to submitting this issue! This helped a ton!
That's no problem; I'm just glad it's not a KMC bug; that's a relief. Thanks again for using KMC :)
Providing the -t parameter leads to not all reads being processed. I'm running the following script:
There is one input_files.txt per individual, each pointing to a pair of reads available here: https://github.com/akcorut/kGWASflow/tree/main/.test/data/test_reads
When I specify -t2, the total number of reads reported in the log file is sometimes lower than when I do not specify a value for -t. Is this expected behavior?