Ecogenomics / GTDBTk

GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
https://ecogenomics.github.io/GTDBTk/
GNU General Public License v3.0

Trying to interpret the output of GTDBTk #215

Closed shimingtan closed 4 years ago

shimingtan commented 4 years ago

Dear all,

I have managed to successfully run GTDBTk on my "high-quality" genomic bins as defined by CheckM, but am having some issues interpreting the results. I have attached the output file.

I was reading the original manuscript, but found it hard to understand the concept of RED. Given the values of RED in Column R, how can we draw interpretations from that column? How can we tell whether the genomic bin is "novel"? Is there a certain threshold to compare it with?

Thank you for reading this.

gtdbtk.bac120.summary.xlsx

donovan-h-parks commented 4 years ago

Hi. The RED values are used internally by GTDB-Tk to determine appropriate classifications. Generally speaking, users do not need to interpret this value. The taxonomic classification assigned by GTDB-Tk is given in column B ("classification"). This is the most resolved classification supported by the GTDB criteria. For example, bin.782 represents a novel order in the class Krumholzibacteria according to GTDB-Tk. Depending on the goals of your study, additional analyses may be required to confirm this assignment, namely the inference of a de novo tree showing that the placement of this genome is well supported.
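As a rough illustration of reading the "classification" column, the sketch below finds the most resolved rank that carries a name and lists the ranks left empty. It assumes the usual semicolon-separated GTDB string with rank prefixes (d__, p__, c__, o__, f__, g__, s__), where an unresolved rank appears as a bare prefix. The helper name `lowest_assigned_rank` and the example string are made up for illustration and are not copied from the attached file.

```python
# Minimal sketch: interpret a GTDB-style classification string.
# Assumption: ranks are ordered domain..species and separated by ";",
# and an unresolved rank is written as a bare prefix such as "o__".

RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def lowest_assigned_rank(classification):
    """Return ((rank, name) of the most resolved named rank, [unassigned ranks])."""
    taxa = [t.strip() for t in classification.split(";")]
    assigned = None
    unassigned = []
    for rank, taxon in zip(RANKS, taxa):
        # Split "c__Krumholzibacteria" into prefix "c" and name "Krumholzibacteria".
        _prefix, _sep, name = taxon.partition("__")
        if name:
            assigned = (rank, name)
        else:
            unassigned.append(rank)
    return assigned, unassigned


# Hypothetical example string, for illustration only.
example = "d__Bacteria;p__Krumholzibacteriota;c__Krumholzibacteria;o__;f__;g__;s__"
assigned, unassigned = lowest_assigned_rank(example)
print("Most resolved rank:", assigned)
print("Unassigned (potentially novel) ranks:", unassigned)
```

Here the first unassigned rank (order) is the level at which the genome would be considered novel under the GTDB criteria. Strings that do not follow the prefix format are not handled by this sketch.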

shimingtan commented 4 years ago

Dear @dparks1134,

Thank you for the reply. Sorry for asking such a noob question, but how do I go about inferring a de novo tree? What software would you recommend?

donovan-h-parks commented 4 years ago

IQ-TREE is popular these days: http://www.iqtree.org/.
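A minimal sketch of what a typical run could look like, written as a small Python helper that only assembles the command line. It assumes IQ-TREE 2, where `-s` gives the alignment, `-m MFP` asks ModelFinder to choose a substitution model, `-B` sets the number of ultrafast bootstrap replicates, `-T AUTO` picks the thread count, and `--prefix` names the output files (IQ-TREE 1.x spells some of these differently, e.g. `-bb`, `-nt`, `-pre`). The helper name, the binary name `iqtree2`, and the file names are placeholders. In practice the alignment would contain your genomes together with suitable reference genomes.

```python
# Minimal sketch: assemble an IQ-TREE 2 command line without running it.
# Assumptions: the executable is called "iqtree2" and the input is a
# multiple sequence alignment in FASTA format (file names are hypothetical).
import shlex


def build_iqtree_command(alignment, prefix, bootstraps=1000, binary="iqtree2"):
    """Return the argument list for a model-selected tree with ultrafast bootstrap."""
    return [
        binary,
        "-s", alignment,         # input alignment
        "-m", "MFP",             # ModelFinder Plus: select the best-fit model
        "-B", str(bootstraps),   # ultrafast bootstrap replicates
        "-T", "AUTO",            # let IQ-TREE choose the number of threads
        "--prefix", prefix,      # prefix for all output files
    ]


cmd = build_iqtree_command("my_alignment.fasta", "denovo_tree")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once IQ-TREE is installed. Bootstrap support on the branch leading to your genome is what indicates whether its placement is well supported.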

pchaumeil commented 4 years ago

Ticket closed.