OpenOmics / metamorph

Metagenomics and Metatranscriptomics pipeline
https://openomics.github.io/metamorph/
MIT License

CAT (Contig Annotation Tool) not classifying correctly #21

Closed. rroutsong closed this issue 1 month ago

rroutsong commented 2 months ago

An issue currently prevents CAT from classifying bins taxonomically. The output looks like this:

# bin   classification  reason  lineage lineage scores (f: 0.30)        superkingdom    phylum  class   order   family  genus   species
bin.1.fa        no taxid assigned       hits not found in taxonomy files
bin.10.fa       no taxid assigned       hits not found in taxonomy files
bin.11.fa       no taxid assigned       hits not found in taxonomy files
bin.12.fa       no taxid assigned       hits not found in taxonomy files
bin.14.fa       no taxid assigned       hits not found in taxonomy files
bin.15.fa       no taxid assigned       hits not found in taxonomy files
bin.16.fa       no taxid assigned       hits not found in taxonomy files
bin.17.fa       no taxid assigned       hits not found in taxonomy files
bin.19.fa       no taxid assigned       hits not found in taxonomy files

Using the NCBI taxonomy databases produces the results above. Using databases built for CAT versions earlier than 6.0.1 throws an error instead (similar to https://github.com/MGXlab/CAT_pack/issues/115). A new v6.0.1 database is currently being created and will be tested on the NIDDK-5 dataset.
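To check whether every bin fails for the same reason, the classification table can be tallied by its `classification` and `reason` columns. The following is a minimal sketch, not part of the pipeline. It assumes the table is tab-separated with a `#`-prefixed header line, as in the output shown above. The function name `tally_reasons` and the inline sample are illustrative only; the sample rows are copied from that output and re-joined with tabs.

```python
import csv
import io
from collections import Counter


def tally_reasons(handle):
    """Count bins per (classification, reason) pair in a tab-separated table."""
    counts = Counter()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the '#'-prefixed header line.
        if not row or row[0].startswith("#"):
            continue
        # Unclassified rows may carry fewer columns than the header lists.
        classification = row[1] if len(row) > 1 else ""
        reason = row[2] if len(row) > 2 else ""
        counts[(classification, reason)] += 1
    return counts


# Assumption: three rows from the output above, with columns joined by tabs.
sample = (
    "# bin\tclassification\treason\tlineage\tlineage scores (f: 0.30)\n"
    "bin.1.fa\tno taxid assigned\thits not found in taxonomy files\n"
    "bin.10.fa\tno taxid assigned\thits not found in taxonomy files\n"
    "bin.11.fa\tno taxid assigned\thits not found in taxonomy files\n"
)

counts = tally_reasons(io.StringIO(sample))
for (classification, reason), n in counts.most_common():
    print(n, classification, reason, sep="\t")
```

For a real run, pass an open file handle for the classification table in place of the `io.StringIO` sample. A single `(classification, reason)` pair covering every bin would point to a database-wide problem rather than to individual bins.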

rroutsong commented 1 month ago

Addressed in #26.