donovan-h-parks / PhyloRank

Assign taxonomic ranks based on evolutionary divergence.
GNU General Public License v3.0

Less taxa than expected #18

Open agavriilidou opened 3 years ago

agavriilidou commented 3 years ago

Hi,

I am trying to assign taxonomic ranks to novel MAGs that, based on tree topology, seem to represent several new species and one new family within the same phylum. To decorate the tree, I included genomes from 3 additional phyla. When I run PhyloRank, it identifies only 3 phyla (not the one I am interested in), and the output contains fewer taxa than expected. For example, I have 462 species, but the decorated.html file lists only 6. Any ideas why? Thanks in advance, MG

donovan-h-parks commented 3 years ago

Hi. Would GTDB-Tk be more suited to your purposes? It is designed to classify genomes within the context of the GTDB taxonomic framework.

https://github.com/Ecogenomics/GTDBTk

agavriilidou commented 3 years ago

I used GTDB-Tk's classify workflow, and the GTDB classification showed that these MAGs belong to a new family (consistent with the tree topology). However, I also calculated AAI% and 16S ID%, and they conflicted with that result. For this reason, I thought RED values would better support my findings. I read that PhyloRank is mostly used for the manual curation of GTDB, but I have seen a couple of research articles where PhyloRank was used to resolve the phylogeny.

donovan-h-parks commented 3 years ago

RED values are only loosely correlated with AAI% and 16S ID%. RED aims to account for organisms evolving at different rates, which AAI% and 16S ID% do not. GTDB-Tk classifications are determined based on the placement of your genomes in the GTDB reference tree and RED.

That said, GTDB-Tk can't establish the relationships among your MAGs (i.e., do they all belong to one species or to several, to one genus or to several?). PhyloRank is a reasonable approach for trying to resolve such questions.

I think you'd need to email me your data for me to understand what is happening with the PhyloRank output.
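
To illustrate the point above about RED accounting for different evolutionary rates, here is a minimal sketch on a toy tree. It assumes the commonly described interpolation rule, RED(n) = p + (d / u) * (1 - p), where p is the parent's RED, d is the branch length to the parent, and u is the mean distance from the parent to the leaves below n, with the root fixed at 0 and leaves at 1. This is not PhyloRank's code; the `Node` class and the function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical minimal tree node; names are assumed to be unique."""
    name: str
    length: float = 0.0  # branch length to the parent
    children: List["Node"] = field(default_factory=list)


def leaf_distances(node: Node) -> List[float]:
    """Return the distance from `node` to every leaf below it."""
    if not node.children:
        return [0.0]
    dists: List[float] = []
    for child in node.children:
        dists.extend(child.length + d for d in leaf_distances(child))
    return dists


def red_values(root: Node) -> Dict[str, float]:
    """Assign RED to every node: root = 0, leaves = 1, internal nodes interpolated."""
    red = {root.name: 0.0}
    stack = [root]
    while stack:
        parent = stack.pop()
        p = red[parent.name]
        for child in parent.children:
            if not child.children:
                red[child.name] = 1.0
            else:
                below = leaf_distances(child)
                # u: mean distance from the parent to the leaves under `child`
                u = child.length + sum(below) / len(below)
                # Guard against a clade whose branch lengths are all zero
                red[child.name] = p if u == 0 else p + (child.length / u) * (1.0 - p)
            stack.append(child)
    return red


# Toy tree: one clade evolves three times faster than the other.
slow = Node("clade_slow", 0.1, [Node("a1", 0.1), Node("a2", 0.1)])
fast = Node("clade_fast", 0.3, [Node("b1", 0.3), Node("b2", 0.3)])
tree = Node("root", 0.0, [slow, fast])

for name, value in red_values(tree).items():
    print(f"{name}: {value:.2f}")
```

In this toy tree both clades end up with the same RED (0.5) even though their raw branch lengths differ threefold, which is why a rate-sensitive measure such as AAI% or 16S ID% can disagree with a RED-based rank assignment.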

agavriilidou commented 3 years ago

I sent you my data. Thanks a lot!