Open ericcrandall opened 3 years ago
Or if not replace, at least add these to it.
Or, thinking further, we won't be pushing taxonomy back to NCBI at all, but we should still include their phyla in our default controlled vocabulary.
Attached is a comparison of phyla between GEOME and NCBI. There is not as much agreement between the two as I would like to see! Several options:
Chris probably knows more than I do, but it seems like Catalog of Life is trying to reconcile ITIS and GBIF and may be the best authority. What did Biocode use as the source of phyla originally?
But if we are pushing data to NCBI, then we really should include their taxonomy. For example, in the datathon, since we were adding metadata retrospectively to SRA projects, we queried the NCBI taxonomy. So I would favor option 1. In theory, the phyla could be reconciled later, right?
Eric
Yikes, next time I'll reply on GitHub
As we are pushing metadata to NCBI now, it makes sense for the default controlled vocabulary to contain the phyla from the NCBI taxonomy. These can be retrieved with the R taxize package using the command below, and the resulting list is attached.
library(taxize)
ncbi_phyla <- downstream(sci_id = "cellular organisms", db = "ncbi", downto = "phylum", intermediate = FALSE)
NCBI_Phyla.csv
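
For anyone who wants to redo the GEOME vs. NCBI comparison, here is a rough sketch in base R. The two vectors are small illustrative placeholders, not the real GEOME vocabulary or the contents of NCBI_Phyla.csv, and the names `geome_phyla`, `ncbi_phyla_names` and `norm` are made up for this example.

```r
# Illustrative placeholder lists only; substitute the real vocabularies.
geome_phyla      <- c("Chordata", "Mollusca", "Arthropoda", "Cnidaria")
ncbi_phyla_names <- c("Chordata", "Mollusca", "Arthropoda", "Streptophyta")

# Normalise case and whitespace so trivial differences do not count as mismatches.
norm <- function(x) unique(trimws(tolower(x)))

geome_n <- norm(geome_phyla)
ncbi_n  <- norm(ncbi_phyla_names)

shared     <- intersect(geome_n, ncbi_n)  # phyla present in both lists
only_geome <- setdiff(geome_n, ncbi_n)    # would be lost if NCBI replaced the list
only_ncbi  <- setdiff(ncbi_n, geome_n)    # would be added by including NCBI phyla

# "At least add these to it": a merged vocabulary containing both sets.
merged <- sort(union(geome_n, ncbi_n))

print(list(shared = shared, only_geome = only_geome, only_ncbi = only_ncbi))
```

To run this on real data, the NCBI names could be read from the attached CSV or pulled out of the `downstream()` result. I have not checked the column names in either, so inspect them with `str()` first.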