smdabdoub / kraken-biom

Create BIOM-format tables (http://biom-format.org) from Kraken output (http://ccb.jhu.edu/software/kraken/, https://github.com/DerrickWood/kraken).
MIT License

Question Regarding Read Assignments in Biom Files #25

Open Tonny-zhou opened 1 year ago

Tonny-zhou commented 1 year ago

Dear author,

Firstly, I would like to express my appreciation for your excellent work on the kraken-biom software. It has been incredibly useful for my research.

However, I have come across a small issue that I'm hoping you can clarify. When I open a biom file (created by kraken-biom) in R, I notice that the count for each genus is the "Number of reads assigned directly to this taxon", whereas the count for each species is the "Number of reads covered by the clade rooted at this taxon".

In the output file of the kreport2mpa.py script, by contrast, the count for every taxon is the "Number of reads covered by the clade rooted at this taxon". I'm wondering why the two tools differ.
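
To make the two quantities concrete, here is a minimal sketch of how both counts can be read from a kraken-report file. It assumes the standard six-column, tab-separated report layout (percentage, clade reads, direct reads, rank code, NCBI taxid, indented name). The sample rows and their counts are invented for illustration, and `parse_report` is a hypothetical helper, not part of kraken-biom or kreport2mpa.py.

```python
# Hypothetical kraken-report excerpt; the read counts are made up for illustration.
SAMPLE_REPORT = (
    " 50.00\t100\t10\tG\t561\t          Escherichia\n"
    " 45.00\t90\t90\tS\t562\t            Escherichia coli\n"
)


def parse_report(text):
    """Return one dict per report line, keeping both read-count columns."""
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        _pct, clade, direct, rank, taxid, name = fields[:6]
        rows.append({
            "name": name.strip(),         # names are indented by taxonomic depth
            "rank": rank.strip(),
            "taxid": taxid.strip(),
            "clade_reads": int(clade),    # reads covered by the clade rooted at this taxon
            "direct_reads": int(direct),  # reads assigned directly to this taxon
        })
    return rows


rows = parse_report(SAMPLE_REPORT)
for row in rows:
    print(row["rank"], row["name"], row["clade_reads"], row["direct_reads"])
```

In this made-up example the genus has 100 clade reads but only 10 direct reads, because 90 of its reads were assigned to the species below it. That gap is the difference I see between the genus counts in the biom file and the counts from kreport2mpa.py.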

I would be grateful if you could help me understand the reason behind this difference.

Thank you very much in advance for your help.

Best regards, Tonnyz