jenniferlu717 / Bracken

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
http://ccb.jhu.edu/software/bracken/index.shtml
GNU General Public License v3.0

Classification issues with similar taxonomic ids #209

Open jkimsis opened 1 year ago

jkimsis commented 1 year ago

Hello,

I've found in several different runs that, for some samples, Bracken outputs the same number of reads for Homo sapiens (id 9606) and Bacteriovorax stolpii (id 960), and in rarer cases Buchnera aphidicola (id 9). The similarity of the taxonomic ids leads me to believe that there is an issue with id matching in the code. The duplicated counts don't appear in the Kraken .txt reports, nor in the *_bracken_species.txt files, i.e., those files contain only the human reads.

I can send the Kraken report .txt and Bracken tsv files if that helps, but I don't want to post them on a public forum.

jkimsis commented 1 year ago

Found the same issue with Delftia tsuruhatensis (id 180282) and Mycolicibacterium farcinogenes (id 1802).
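
To illustrate the suspected failure mode, here is a minimal Python sketch of how prefix matching on taxonomic ids could produce this pattern. It is not taken from Bracken's source. The row layout, the read counts, and the function names `reads_by_prefix` and `reads_by_exact` are all hypothetical. Only the ids and species names come from the reports above.

```python
# Hypothetical report rows: (name, taxonomy_id, reads).
# The read counts are invented for illustration only.
ROWS = [
    ("Homo sapiens", "9606", 1000),
    ("Delftia tsuruhatensis", "180282", 250),
]


def reads_by_prefix(query_id, rows):
    """Buggy lookup (assumed, not Bracken's code): any row whose id
    merely starts with query_id is treated as a match."""
    return sum(reads for _, taxid, reads in rows if taxid.startswith(query_id))


def reads_by_exact(query_id, rows):
    """Safer lookup: the whole id field must be equal."""
    return sum(reads for _, taxid, reads in rows if taxid == query_id)


if __name__ == "__main__":
    # 960 and 9 are prefixes of 9606; 1802 is a prefix of 180282.
    for query in ["9606", "960", "9", "180282", "1802"]:
        print(query, reads_by_prefix(query, ROWS), reads_by_exact(query, ROWS))
```

With prefix matching, ids 960 and 9 inherit the reads of 9606, and 1802 inherits the reads of 180282. With exact matching on the whole field, they receive none. Whether anything like this actually happens inside Bracken is an assumption that would need to be checked against the real input and output files.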

jenniferlu717 commented 1 year ago

That is strange. Can you email me at jennifer.lu717@gmail.com?