nf-core / taxprofiler

Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
https://nf-co.re/taxprofiler
MIT License

Classifier: metaxa2 #139

Open vmikk opened 1 year ago

vmikk commented 1 year ago

Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data

https://microbiology.se/software/metaxa2/ doi:10.1111/1755-0998.12399

jfy133 commented 1 year ago

What do you think @Midnighter @sofstam?

It does take shotgun reads, but it is pulling out 16S-related hits. I guess it's sort of similar to mOTUs or MetaPhlAn, which use specific genes, as pointed out by @vmikk.

sofstam commented 1 year ago

Shall this be added as an issue to taxpasta as well?

jfy133 commented 1 year ago

Depends if we think it's worth including in the pipeline.

The only thing with mOTUs and MetaPhlAn3 is that they go for multiple genes (AFAIK), i.e., spanning more of the genome.

I don't know where we want to set the limit. Personally, I would only really be interested in classifiers that work on whole genomes.

sofstam commented 1 year ago

We have the same scope, interested in classifiers on whole genomes.

Midnighter commented 1 year ago

I've only had a very brief look, so please correct me or add info @vmikk.

Metaxa2 is similar to other tools in this pipeline because:

Metaxa2 is dissimilar because:

The big question for me (which I couldn't gather from the paper abstract) is: does it use a reference database to assign taxonomy, or some other concept like ASVs? If it uses a reference database, maybe it fits?

vmikk commented 1 year ago

@Midnighter That's correct, Metaxa2 uses rRNA only. It identifies rRNA in metagenomic shotgun data using HMM profiles (these could be considered a database, but they should be more sensitive, as they could potentially identify novel/previously unobserved sequences matching the profile). There are multiple profiles targeting Bacteria, Archaea, Eukaryota, mitochondria, and chloroplast sequences.

I'm also in doubt whether the program fits the taxprofiler pipeline. But, in some sense, it also profiles the community.
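
To illustrate the point about profiles being more sensitive than exact database lookups, here is a toy Python sketch. It is not how Metaxa2 works internally: it swaps in a simple position-specific scoring matrix instead of a real profile HMM, and the sequences, function names, and pseudocount are all made up for illustration.

```python
import math

# Toy aligned "reference" sequences (hypothetical, not real rRNA data).
ALIGNMENT = [
    "ACGTAC",
    "ACGTTC",
    "ACGAAC",
    "ACGTAC",
]
ALPHABET = "ACGT"


def build_profile(alignment, pseudocount=1.0):
    """Build per-column log-odds scores against a uniform background."""
    background = 1.0 / len(ALPHABET)
    profile = []
    for col in range(len(alignment[0])):
        # Start from pseudocounts so unseen bases get a finite (negative) score.
        counts = {base: pseudocount for base in ALPHABET}
        for seq in alignment:
            counts[seq[col]] += 1
        total = sum(counts.values())
        profile.append(
            {base: math.log2((counts[base] / total) / background) for base in ALPHABET}
        )
    return profile


def profile_score(profile, seq):
    """Sum the column scores for a sequence of the same length as the profile."""
    return sum(column[base] for column, base in zip(profile, seq))


def exact_match(alignment, seq):
    """Database-style lookup: the sequence must already be in the references."""
    return seq in alignment


profile = build_profile(ALIGNMENT)
novel = "ACGATC"      # mixes column states seen in the alignment, but is not in it
unrelated = "TTTGGA"  # shares no column state with the alignment

for name, seq in [("novel", novel), ("unrelated", unrelated)]:
    print(name, "exact match:", exact_match(ALIGNMENT, seq),
          "profile score:", round(profile_score(profile, seq), 2))
```

The novel sequence fails the exact lookup but still gets a positive profile score, while the unrelated one scores negative. A real profile HMM additionally models insertions and deletions, which this sketch ignores.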

Midnighter commented 1 year ago

What does metaxa2 build the HMM profiles from at the moment and can that be customized?

vmikk commented 1 year ago

As far as I know, the authors build HMMs from curated sequence alignments (SSU and LSU, independently for different taxonomic groups). It is very similar to the ITSx tool (used in ampliseq). I'm not sure the alignments are currently available online, but creating your own HMMs is certainly possible. The authors are very responsive and could update the profiles if a profile is not able to detect some sequences.

jfy133 commented 1 year ago

We can talk about this in the mini session tomorrow morning or the longer session on Thursday.

@vmikk if you want to join our development meetings (to help you reserve time, e.g. for finishing the sourmash things, and get more rapid feedback etc.), we currently meet:

On the nf-core gather.town office (not the hackathon one!): https://nf-co.re/join#gather-town

jfy133 commented 1 year ago

@vmikk I think that since we include MetaPhlAn/mOTUs, which also align (only) to genes, it would be fine to include metaxa2 in taxprofiler, as it uses shotgun input.

However, I would say it's probably low-priority for the current main devs, but if you would like it in, you're welcome to contribute it to the pipeline.