Closed by humbleflowers 2 months ago
This is an issue we have seen with all tools in our benchmarking of taxonomic abundance estimation. When you use a database that contains both bacteria and viruses, every tool assigns a number of bacterial reads to phages that infect the respective bacterial species. The indexed database has a much bigger impact on the results than the tool used. So in your case, it would make sense to use a bacteria-only database. If your nanopore reads are of high quality, I would also try reducing the accepted error rate to 0.05, which could also resolve the issue.
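As a stopgap until a bacteria-only database is available, the viral rows can be filtered out of an existing profile and the remaining abundances rescaled. The sketch below is only an illustration, not part of taxor: the function name `drop_viruses_and_renormalize` is hypothetical, and it assumes the report follows the CAMI profiling layout (tab-separated TAXID, RANK, TAXPATH, TAXPATHSN, PERCENTAGE, with header lines starting with `@`) and that viral lineages start with `Viruses` in TAXPATHSN.

```python
from collections import defaultdict


def drop_viruses_and_renormalize(profile_text):
    """Remove viral rows from a CAMI-style profile and rescale each rank to 100%.

    Assumption: data rows are tab-separated as
    TAXID, RANK, TAXPATH, TAXPATHSN, PERCENTAGE, and the first
    TAXPATHSN element of a viral lineage is the string "Viruses".
    """
    parsed = []                       # header strings or parsed data rows
    rank_totals = defaultdict(float)  # summed percentage of kept rows per rank

    for line in profile_text.splitlines():
        # Header, comment, and blank lines pass through unchanged.
        if not line.strip() or line.startswith(("@", "#")):
            parsed.append(line)
            continue
        taxid, rank, taxpath, taxpathsn, pct = line.split("\t")[:5]
        if taxpathsn.split("|")[0] == "Viruses":
            continue                  # skip every row in a viral lineage
        rank_totals[rank] += float(pct)
        parsed.append([taxid, rank, taxpath, taxpathsn, float(pct)])

    out = []
    for item in parsed:
        if isinstance(item, str):
            out.append(item)
            continue
        total = rank_totals[item[1]]
        scaled = item[4] * 100.0 / total if total > 0 else 0.0
        out.append("\t".join(item[:4] + [f"{scaled:.6f}"]))
    return "\n".join(out)
```

Note that this only hides the symptom: reads assigned to phages are discarded, not reassigned to their bacterial hosts, so the rescaled proportions can still be biased.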
Thank you @JensUweUlrich, that makes sense. I am using new R10.4 library data, so I will try reducing the error rate.
Hello developers,
Thank you for the tool. I am benchmarking taxor (version 0.1.3, SeqAn version 3.4.0-rc.1) on a ZYMO sample sequenced on ONT, using the prebuilt database containing Archaea, Bacteria, Fungi, and Viruses.
I am surprised to see that taxor predicted 38.59% Viruses in the sample. The CAMI report file shows
Is there any way I can fix this? According to the ZYMO website, these are the expected proportions:
Listeria monocytogenes - 89.1%, Pseudomonas aeruginosa - 8.9%, Bacillus subtilis - 0.89%, Saccharomyces cerevisiae - 0.89%, Escherichia coli - 0.089%, Salmonella enterica - 0.089%, Lactobacillus fermentum - 0.0089%, Enterococcus faecalis - 0.00089%, Cryptococcus neoformans - 0.00089%, and Staphylococcus aureus - 0.000089%.
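To quantify how far a profile is from these expected values, one option is to compute the per-species difference and the summed absolute error. The following is a minimal sketch under assumptions: `EXPECTED` copies the percentages listed above, while `compare_to_expected` and the `observed` dictionary (species name to percentage) are hypothetical and would have to be filled from the species-level rows of the actual report. Species names in a real profile may differ in spelling or taxonomy and may need mapping first.

```python
# Expected proportions (percent) as quoted in the post above.
EXPECTED = {
    "Listeria monocytogenes": 89.1,
    "Pseudomonas aeruginosa": 8.9,
    "Bacillus subtilis": 0.89,
    "Saccharomyces cerevisiae": 0.89,
    "Escherichia coli": 0.089,
    "Salmonella enterica": 0.089,
    "Lactobacillus fermentum": 0.0089,
    "Enterococcus faecalis": 0.00089,
    "Cryptococcus neoformans": 0.00089,
    "Staphylococcus aureus": 0.000089,
}


def compare_to_expected(observed):
    """Return per-species rows and the summed absolute error.

    `observed` maps species name -> percentage (hypothetical input);
    species missing from it are treated as 0%. Species that appear only
    in `observed` (false positives) are not counted by this sketch.
    """
    rows = []
    for species, expected in EXPECTED.items():
        found = observed.get(species, 0.0)
        rows.append((species, expected, found, found - expected))
    l1_error = sum(abs(diff) for _, _, _, diff in rows)
    return rows, l1_error
```

Because the expected abundances span six orders of magnitude, the summed absolute error is dominated by the two most abundant species; comparing on a log scale may be more informative for the rare ones.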