DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Number of reads in .kraken2.output.txt, _1.fq and _unclassfied_1.fq inconsistent with reads in input fastq file #834

Closed MAlinkous closed 1 month ago

MAlinkous commented 1 month ago

When I classify the same input against different databases (human databases built with k-mer lengths 20, 25, 30 and 35), the number of reads in .kraken2.output.txt, _1.fq and _unclassfied_1.fq is inconsistent with the number of reads in the input FASTQ file. There are 281194596 reads in my original FASTQ file, but the read counts in the results from the k-mer 20, 25 and 35 databases are much lower (less than 100,000,000).

Moreover, when I change the output folder, the file sizes and read counts change for no apparent reason, even though the parameters are exactly the same.

I wonder whether this is a bug or something else. I hope to get an answer. Thanks.

tibitoy commented 1 month ago

Is the kraken output supposed to give you the number of reads mapped? I thought it gave the number of fragments.

MAlinkous commented 1 month ago

> Is the kraken output supposed to give you the number of reads mapped? I thought it gave the number of fragments.

As far as I know, there are several ways to count the number of reads mapped:

  1. Read the number reported directly in .kraken2.report.txt.
  2. Count the reads in _1.fq and _unclassfied_1.fq.
  3. Count the lines in .kraken2.output.txt.
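
Counting methods 2 and 3 can be cross-checked with a short script. The following is only a minimal sketch, not part of Kraken 2: the helper names (`open_text`, `count_lines`, `count_fastq_records`, `compare_counts`) are hypothetical, and it assumes standard 4-line FASTQ records and one line per sequence (or per pair, for paired input) in the Kraken 2 output file. A FASTQ file whose line count is not a multiple of 4 is reported as an error, since that usually means the file was cut off while being written.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_lines(path):
    """Count lines without loading the whole file into memory."""
    with open_text(path) as handle:
        return sum(1 for _ in handle)


def count_fastq_records(path):
    """Count reads, assuming unwrapped 4-line FASTQ records."""
    n_lines = count_lines(path)
    if n_lines % 4 != 0:
        # A partial last record suggests a truncated file.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def compare_counts(input_fq, classified_fq, unclassified_fq, kraken_output):
    """Compare read counts across the input and the result files.

    Assumption: every input read (or pair) ends up in exactly one of the
    classified/unclassified files and on exactly one output line.
    """
    counts = {
        "input": count_fastq_records(input_fq),
        "classified": count_fastq_records(classified_fq),
        "unclassified": count_fastq_records(unclassified_fq),
        "output_lines": count_lines(kraken_output),
    }
    split_total = counts["classified"] + counts["unclassified"]
    counts["consistent"] = (
        counts["input"] == split_total == counts["output_lines"]
    )
    return counts
```

Calling `compare_counts` with the first-mate input file, the two `_1.fq` result files and the `.kraken2.output.txt` file returns the four counts plus a `consistent` flag, so a mismatch like the one described above shows up immediately.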
MAlinkous commented 1 month ago

The cause turned out to be that the storage space on the HPC was full.
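
A quick check of free space in the output directory before a run can catch this kind of problem early. This is a minimal sketch using Python's standard `shutil.disk_usage`. The function name `has_free_space` and the idea of passing a required size are assumptions for illustration, and on an HPC with per-user quotas the file system's free space may not reflect what a single user is allowed to write.

```python
import shutil


def has_free_space(directory, required_bytes):
    """Return True if the file system holding `directory` reports at
    least `required_bytes` free. Does not account for user quotas."""
    usage = shutil.disk_usage(directory)
    return usage.free >= required_bytes
```

For example, `has_free_space("results", 500 * 1024**3)` would check for roughly 500 GiB on the file system holding a hypothetical `results` directory. The threshold depends on the input size and is not something Kraken 2 reports.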