Closed ucassee closed 5 years ago
We discuss RED in the GTDB manuscript: https://www.ncbi.nlm.nih.gov/pubmed/30148503
I find that the RED values of some bins are approximately 0.35, and those bins are only classified to the phylum level (d__Bacteria;p__Marinisomatota;c__;o__;f__;g__;s__) in GTDB. I want to know whether such a low RED value indicates that these bins belong to a new phylum.
Following GTDB-Tk rules, those bins are in the phylum Marinisomatota because their RED values (0.35) bring the RED value of p__Marinisomatota (0.385) closer to the median phylum-level RED value (0.345).
Your bins will become the most basal members of Marinisomatota.
To verify whether your bins are part of a new phylum, you would need to generate a de novo bootstrapped tree and look at the support of the decorated nodes for the Marinisomatota branch.
@pchaumeil Thanks for your reply.
So do you mean that if the RED values are below the median phylum-level RED value (0.345), these bins are likely to be a new phylum?
I am not sure how I should generate a de novo bootstrapped tree. Would it be like this?
gtdbtk de_novo_wf --genome_dir Marinisomatota_dir --bac120_ms --outgroup_taxon p__Chloroflexota --taxa_filter p__Marinisomatota --out_dir de_novo_output
"So do you mean if the RED values below the median phylum-level RED value (0.345), these bins are likely to be a new phylum?"
It depends on the branch the bins are placed on. If your bins have a RED value of 0.33 and are placed on the parent branch of p__Marinisomatota (0.385), they will still be considered p__Marinisomatota because they bring the RED value of p__Marinisomatota closer to the median phylum-level RED value (0.345). But if your bins have a RED value of 0.33 and are placed on the parent branch of p__Patescibacteria (0.341), they will be considered a new phylum because they would otherwise move the RED value of p__Patescibacteria farther from the median phylum-level RED value (0.345).
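The two cases above can be written as a small Python sketch. It is only a simplified reading of the rule as explained in this thread, not GTDB-Tk's actual code. The function name is made up for illustration, and it assumes that once the bins become the most basal members of a phylum, the phylum's RED is roughly the bins' own RED.

```python
MEDIAN_PHYLUM_RED = 0.345  # median phylum-level RED quoted in this thread


def joins_existing_phylum(bin_red, phylum_red, median_red=MEDIAN_PHYLUM_RED):
    """Return True if absorbing the bins moves the phylum RED toward the median.

    Assumption (not taken from GTDB-Tk's source): after the bins are added
    on the parent branch, the phylum's RED is about equal to the bins' RED.
    """
    return abs(bin_red - median_red) < abs(phylum_red - median_red)


# Case 1: bins at 0.33 on the parent branch of p__Marinisomatota (0.385)
print(joins_existing_phylum(0.33, 0.385))  # True: still p__Marinisomatota

# Case 2: bins at 0.33 on the parent branch of p__Patescibacteria (0.341)
print(joins_existing_phylum(0.33, 0.341))  # False: candidate new phylum
```

Even when this check suggests a new phylum, the placement still needs to be confirmed with a bootstrapped tree.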
GTDB-Tk doesn't generate bootstrapped trees, so you will have to take the MSA generated by GTDB-Tk and build the bootstrapped tree with your preferred phylogenetic tree construction software.
Hi @pchaumeil .
The gtdbtk.bac120.msa.fasta file contains 23470 sequences. Should I use all of them to generate the bootstrapped tree? I guess it may take a long time.
The following is the GTDB-Tk classification result:
bin.25 d__Bacteria;p__Marinisomatota;c__;o__;f__;g__;s__ N/A N/A N/A N/A N/A N/A N/A N/A N/A d__Bacteria;p__;c__;o__;f__;g__;s__ Placement taxonomic novelty determined using RED N/A 72.94 11 0.341112511073 N/A
bin.90 d__Bacteria;p__Marinisomatota;c__;o__;f__;g__;s__ N/A N/A N/A N/A N/A N/A N/A N/A N/A d__Bacteria;p__;c__;o__;f__;g__;s__ Placement taxonomic novelty determined using RED N/A 94.4 11 0.339904346943 N/A
bin.1 d__Bacteria;p__Marinisomatota;c__;o__;f__;g__;s__ N/A N/A N/A N/A N/A N/A N/A N/A N/A d__Bacteria;p__;c__;o__;f__;g__;s__ Placement taxonomic novelty determined using RED N/A 94.74 11 0.338366749158 N/A
bin.3 d__Bacteria;p__Marinisomatota;c__;o__;f__;g__;s__ N/A N/A N/A N/A N/A N/A N/A N/A N/A d__Bacteria;p__;c__;o__;f__;g__;s__ Placement taxonomic novelty determined using RED N/A 83.27 11 0.336853438411 N/A
Do you mean that the key point in determining whether they belong to a new phylum is whether they cluster with Marinisomatota on the reference tree? If they are closer to a phylum other than the reference Marinisomatota on the tree, they are likely to be a new phylum.
For a potential new phylum, we would recommend using the full MSA. As a pre-screening, you could pick one representative per order (or family) to create a bootstrapped tree.
To make sure genomes are part of a new phylum, we would create bootstrapped trees using different models and different sets of markers. You also need to take into consideration other characteristics of your bins, such as completeness, contamination, and quality.
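The pre-screening step (one representative per order) could be sketched in Python roughly as follows. This is an illustration, not part of GTDB-Tk. It assumes you have a mapping from reference genome ID to GTDB taxonomy string (for example loaded from a GTDB taxonomy table), and all function names and the toy data are hypothetical. Your own bins are always kept.

```python
def parse_fasta(text):
    """Parse FASTA text into {sequence_id: sequence}, keeping input order."""
    records, seq_id = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            seq_id = line[1:].split()[0]
            records[seq_id] = []
        elif seq_id is not None:
            records[seq_id].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def rank_of(taxonomy, prefix="o__"):
    """Return the rank with the given prefix, e.g. 'o__O1', or None."""
    for rank in taxonomy.split(";"):
        if rank.startswith(prefix):
            return rank
    return None


def subsample_msa(msa, taxonomy, keep_ids=(), prefix="o__"):
    """Keep the first sequence seen per order, plus every ID in keep_ids.

    Reference sequences with no taxonomy entry are dropped.
    Use prefix="f__" to pick one representative per family instead.
    """
    seen, kept = set(), {}
    for seq_id, seq in msa.items():
        if seq_id in keep_ids:
            kept[seq_id] = seq
            continue
        group = rank_of(taxonomy.get(seq_id, ""), prefix)
        if group is not None and group not in seen:
            seen.add(group)
            kept[seq_id] = seq
    return kept


# Toy data standing in for gtdbtk.bac120.msa.fasta and a taxonomy table.
msa = parse_fasta(">G1\nAAA\n>G2\nAAC\n>G3\nAGA\n>bin.25\nACA\n")
taxonomy = {
    "G1": "d__Bacteria;p__P1;c__C1;o__O1;f__F1;g__A;s__A a",
    "G2": "d__Bacteria;p__P1;c__C1;o__O1;f__F2;g__B;s__B b",
    "G3": "d__Bacteria;p__P1;c__C1;o__O2;f__F3;g__C;s__C c",
}
subset = subsample_msa(msa, taxonomy, keep_ids={"bin.25"})
print(sorted(subset))  # ['G1', 'G3', 'bin.25']
```

Taking the first sequence per order is arbitrary. In practice you may prefer to choose the highest-quality genome in each order as the representative.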
Hi @pchaumeil.
I got it! Thanks for your patience!
No worries, good luck with your research!
Hi developer, I want to learn more about relative evolutionary divergence (RED), including how it is calculated and what it is based on. Could you please give me more information? Thanks in advance.