Ecogenomics / GTDBNCBI

The GTDB provides the software infrastructure for working with a large collection of genomic resources. The major goal of this initiative is to provide a phylogenetically consistent taxonomy for archaea and bacteria.
https://gtdb.ecogenomic.org/
GNU General Public License v3.0

Automatically annotate new User genomes as Bacteria or Archaea #32

Closed donovan-h-parks closed 8 years ago

donovan-h-parks commented 8 years ago

It is often convenient to infer trees for just bacterial or archaeal genomes. This can be done with the taxa filter option of tree create. However, to ensure all genomes in a domain are considered, genomes need to be automatically annotated as Bacteria or Archaea. This is possible by counting the genes identified in each domain-specific canonical alignment and assigning each genome to the domain with the highest percentage of identified genes. Informal testing indicates that high-quality genomes have <20% of genes in the incorrect domain marker set.
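A minimal sketch of the assignment rule described above, for illustration only. It is not the code from the GTDB repository: the function name `assign_domain`, its parameters, the marker-set sizes in the usage example, and the tie-breaking rule are all assumptions made for this example.

```python
def assign_domain(bac_found, bac_total, ar_found, ar_total):
    """Assign a genome to the domain whose marker set is best represented.

    bac_found / ar_found: number of marker genes identified in the genome
    for the bacterial / archaeal marker set (hypothetical inputs).
    bac_total / ar_total: size of each marker set.
    Returns (domain, bac_pct, ar_pct).
    """
    if bac_total <= 0 or ar_total <= 0:
        raise ValueError("marker set sizes must be positive")

    # Percentage of each domain-specific marker set identified in the genome.
    bac_pct = 100.0 * bac_found / bac_total
    ar_pct = 100.0 * ar_found / ar_total

    # Assumption: ties go to Bacteria. The issue does not say how ties
    # should be handled.
    domain = "Bacteria" if bac_pct >= ar_pct else "Archaea"
    return domain, bac_pct, ar_pct


# Illustrative counts only, not real data.
domain, bac_pct, ar_pct = assign_domain(
    bac_found=110, bac_total=120, ar_found=15, ar_total=122
)
print(domain, round(bac_pct, 1), round(ar_pct, 1))
```

Comparing percentages rather than raw counts keeps the rule fair when the two marker sets differ in size. The <20% observation from informal testing could also serve as a sanity check on the losing domain's percentage, though the issue does not say whether it is used that way.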

pchaumeil commented 8 years ago

This feature has been added.

Check commit: 8082380ac2fe4071925f6c9cb1c37817fe342df5