andressamv closed this issue 1 year ago
"Candidatus Eremiobacterota" seems to be a legit NCBI taxonomic classification with an NCBI taxid: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=1154676&lvl=3&lin=f&keep=1&srchmode=1&unlock
What is the problem with using "Candidatus Eremiobacterota"?
Thank you for your response! NCBI requires the use of a taxonomic name at the lowest rank that is reliable. I don't think it is a serious problem, so I am just trying to understand what to choose here.
I see: g_Aquilonibacter
mapped to Candidatus Eremiobacterota
You can alter the mapping threshold to make the assignment more permissive; you should then get a more fine-grained NCBI taxonomic classification.
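To illustrate why a stricter threshold can stop at the phylum, here is a rough sketch of a threshold-based consensus mapping. It is not the actual code in ncbi-gtdb_map.py: the function name `deepest_consistent_rank`, the `fraction` parameter, and the ten NCBI lineages are all made up for the example. Check the script's help output for the real option that controls the threshold.

```python
from collections import Counter

# Ranks walked from root to tip (truncated to three for the example).
RANKS = ["domain", "phylum", "class"]


def deepest_consistent_rank(lineages, fraction):
    """Return the deepest (rank, name) supported by >= `fraction` of all lineages.

    Hypothetical helper: an empty string means the genome has no NCBI
    name assigned at that rank.
    """
    total = len(lineages)
    current = list(lineages)
    best = None
    for i, rank in enumerate(RANKS):
        # Count the names present at this rank, ignoring unassigned entries.
        counts = Counter(lin[i] for lin in current if lin[i])
        if not counts:
            break
        name, count = counts.most_common(1)[0]
        # Stop descending once the majority name falls below the threshold.
        if count / total < fraction:
            break
        best = (rank, name)
        # Only follow lineages that agree with the chosen name.
        current = [lin for lin in current if lin[i] == name]
    return best


# Made-up NCBI lineages for ten genomes of one GTDB genus:
# eight carry a class-level name, two are unassigned below the phylum.
ncbi_lineages = (
    [("Bacteria", "Candidatus Eremiobacterota", "Candidatus Eremiobacteria")] * 8
    + [("Bacteria", "Candidatus Eremiobacterota", "")] * 2
)

strict = deepest_consistent_rank(ncbi_lineages, 0.9)
permissive = deepest_consistent_rank(ncbi_lineages, 0.7)
print(strict)
print(permissive)
```

With this toy data, only 80% of the genomes agree at the class level, so a 0.9 threshold stays at the phylum while a 0.7 threshold reaches the class. The trade-off is that a more permissive threshold accepts a name that some of the underlying genomes do not support.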
Hi! Thank you for this amazing tool! I am using ncbi-gtdb_map.py for the first time, and everything worked perfectly. However, I have a conceptual question. In some cases, the script returns an NCBI taxonomy that I didn't expect. For example:
GTDB: d_Bacteria; p_Eremiobacterota; c_Eremiobacteria; o_Baltobacterales; f_Baltobacteraceae; g_Aquilonibacter
I expected the NCBI taxonomy to be Candidatus Eremiobacteria, since this class exists in NCBI. Instead, the script returns Candidatus Eremiobacterota (phylum). I understand that this is related to the provided GTDB metadata. But how should I proceed when submitting my genomes to NCBI? What would be the problem with using Candidatus Eremiobacteria, for example?
I am having a hard time comparing the GTDB and NCBI taxonomies, so I appreciate any feedback on this.