biotengwk closed this issue 1 week ago
I think I have found it.
When running Prodigal on GCF_002158865.1, -g 11 should be used. However, GTDB-Tk used -g 4.
Hi Wenkai. Sorry for the very slow reply here. Yes, you are correct that the wrong translation table is being used. GTDB-Tk makes a "best guess" at the translation table if it is not explicitly provided by the user. We are actively looking into this with the hopes of making the "best guess" more robust.
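In the meantime, one possible workaround is to provide the translation table explicitly. As far as I know, the `--batchfile` option accepts an optional third tab-separated column for the translation table (FASTA path, genome ID, table). Below is a minimal sketch that writes such a file. The FASTA path and the `write_batchfile` helper are hypothetical and only for illustration; the genome ID and table 11 are taken from this thread.

```python
import csv
from pathlib import Path


def write_batchfile(rows, out_path):
    """Write a tab-separated batchfile: FASTA path, genome ID, translation table.

    The three-column layout is an assumption based on the GTDB-Tk
    --batchfile description; check the help text of your installed version.
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for fasta_path, genome_id, table in rows:
            writer.writerow([fasta_path, genome_id, table])
    return Path(out_path)


# Hypothetical local path; adjust it to wherever the genome FASTA is stored.
rows = [("genomes/GCF_002158865.1.fna", "GCF_002158865.1", 11)]
batch = write_batchfile(rows, "batchfile.tsv")
print(batch.read_text(), end="")
```

The resulting `batchfile.tsv` could then be passed to `gtdbtk classify_wf` via `--batchfile`, together with the other options your run normally needs. Whether this fully resolves the low MSA coverage for this genome has not been confirmed in this thread.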
Dear authors, @pchaumeil
Hi! My name is Wenkai Teng, and I am a postdoctoral researcher from China. I recently tried to classify a group of genomes using GTDB-Tk (v2.4.0) with the latest reference database (GTDB release 220). However, one genome that is itself part of GTDB r220, RS_GCF_002158865.1 (classified there as Comamonas_E serinivorans), could not be classified. The result was:
'GCF_002158865.1 Unclassified Bacteria ... Insufficient number of amino acids in MSA (3.1%)'
A colleague tried this with a different version of GTDB-Tk and got a similar result. Could you please help me work out why this happens?
Thanks in advance,
Wenkai Teng