nf-core / taxprofiler

Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
https://nf-co.re/taxprofiler
MIT License

Add CAT/BAT/RAT for long reads #504

Open LilyAnderssonLee opened 2 weeks ago

LilyAnderssonLee commented 2 weeks ago

Description of feature

Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome-assembled genomes (MAGs / bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against a protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. CAT and BAT can be run from intermediate steps if files are formatted appropriately.
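To make the voting step concrete, here is a minimal Python sketch of ORF-based voting for a single contig. It is a simplified illustration, not CAT's actual implementation: the function name `classify_contig`, the `min_fraction` threshold, and the input layout (one lineage and one alignment score per ORF) are all assumptions made for this example.

```python
from collections import defaultdict


def classify_contig(orf_hits, min_fraction=0.5):
    """Assign a contig to the deepest lineage backed by most of the ORF score.

    orf_hits: list of (lineage, score) pairs, one per classified ORF, where
    lineage is a tuple ordered from root to most specific taxon.
    min_fraction: hypothetical threshold; keep it >= 0.5 so that at most one
    lineage per rank can pass and the result is unambiguous.
    """
    support = defaultdict(float)
    total = 0.0
    for lineage, score in orf_hits:
        total += score
        # Each ORF votes for every ancestor on its lineage, not only the leaf.
        for depth in range(1, len(lineage) + 1):
            support[lineage[:depth]] += score

    if total == 0:
        return ()  # no classified ORFs, so the contig stays unclassified

    best = ()
    for prefix, score in support.items():
        # Keep the deepest lineage prefix whose share of the vote is a majority.
        if score / total > min_fraction and len(prefix) > len(best):
            best = prefix
    return best


hits = [
    (("Bacteria", "Proteobacteria", "Escherichia"), 100.0),
    (("Bacteria", "Proteobacteria", "Salmonella"), 80.0),
    (("Bacteria", "Firmicutes", "Bacillus"), 20.0),
]
print(classify_contig(hits))
```

In this example no single genus holds more than half of the total score, so the contig is placed at the phylum level rather than forced onto one genus. That conservative behaviour is the reason for voting over all ORFs instead of taking the single best hit.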

Read Annotation Tool (RAT) estimates the taxonomic composition of metagenomes using CAT and BAT output.

Suggested in the Slack channel: https://nfcore.slack.com/archives/C031QH57DSS/p1719128116043699

Publication of this tool: https://www.nature.com/articles/s41467-024-47155-1

GitHub: https://github.com/MGXlab/CAT_pack