Open paulzierep opened 9 months ago
The Bracken length should be read length, not the kmer length. When we originally wrote Bracken, we did a few different tests with just the kmer length but found that using a read length was more accurate.
I won't go into too much detail, but based on the Bayesian formula, the probability of kmers classified at a taxon is not the same as the probability of reads classified at a taxon.
When building the Bracken database, you should specify both the kmer length (of the built Kraken database) and the read length.
So to be sure, the kmer length should be identical to the kmer length of the corresponding Kraken DB?
The kmer length specified when building the database is the kmer length of the krakenDB (krakenUniq default = 31, kraken2 default = 35)
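To make the distinction concrete, here is a minimal sketch of how the two lengths could be passed when building the Bracken database. It assumes the `bracken-build` options `-d` (Kraken database directory), `-t` (threads), `-k` (kmer length) and `-l` (read length); the helper name `bracken_build_args` and the database path are hypothetical, and other options may be needed depending on the Bracken version and classifier.

```python
import shlex

# Default kmer lengths as quoted in this thread (assumed unchanged
# for the database actually in use).
DEFAULT_KMER_LEN = {"kraken2": 35, "krakenuniq": 31}


def bracken_build_args(db_dir, read_len, classifier="kraken2",
                       kmer_len=None, threads=1):
    """Return an argument list for bracken-build (hypothetical helper).

    kmer_len should match the kmer length the Kraken database was built
    with; read_len is the length of the sequencing reads, not the kmer.
    """
    if kmer_len is None:
        # Fall back to the classifier default when the DB used defaults.
        kmer_len = DEFAULT_KMER_LEN[classifier.lower()]
    return [
        "bracken-build",
        "-d", db_dir,
        "-t", str(threads),
        "-k", str(kmer_len),   # kmer length of the Kraken DB
        "-l", str(read_len),   # read length of the samples
    ]


if __name__ == "__main__":
    # Example: a Kraken2 DB built with defaults, 150 bp reads.
    print(shlex.join(bracken_build_args("my_kraken2_db", read_len=150)))
```

The point of the sketch is only that `-k` follows the database while `-l` follows the reads, so the two values are chosen independently.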
Should the kmer length given to Bracken always match that of the Kraken2 DB being used? When does it make sense to use a different length? We would like to give Bracken users in Galaxy better guidance; maybe you can help, @jenniferlu717? https://github.com/galaxyproject/tools-iuc/issues/5745