fhcrc / taxtastic

Create and maintain phylogenetic "reference packages" of biological sequences.
GNU General Public License v3.0

adding new names #74

Closed aubreymoore closed 6 years ago

aubreymoore commented 8 years ago

I am using taxtastic with the NCBI taxonomy to help assemble a biodiversity checklist. I need to add new nodes (taxa) and new names (scientific names, synonyms, common names, misspellings, etc.) to my local database. Adding to the nodes table is straightforward because the primary key of this table is a VARCHAR. This allows adding local records as described in the docs: "For instance, if we want to add a new species Lactobacillus borisii, to which we have assigned the arbitrary tax_id AA1 (the NCBI taxonomy doesn’t use letters in its tax_ids, so a safe way to avoid collisions is to choose a couple letters as a prefix for your own additions)"

However, adding records to the names table is problematic because the PK for this table is an integer. Any chance this could be changed to a VARCHAR to avoid collisions when the local database is updated with new NCBI data?

nhoffman commented 8 years ago

Hi @aubreymoore - sorry for the delayed response. The problem with this request is that names.id is auto-incremented as the names table is filled in. Unlike nodes.tax_id, names.id is an internal index - it's never referred to outside of the database, and shouldn't need to be provided when new names are created. Also unlike tax_ids, there's no constraint that taxonomic names are unique either within or between tax_ids, so preventing collisions of names between tax_ids isn't really an objective in the names table. Hope this helps.
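To illustrate the point about names.id, here is a minimal sketch using Python's sqlite3 module. The two-table schema is a simplified assumption for illustration, not taxtastic's actual schema, and the helper add_names is hypothetical. The local node gets a letter-prefixed tax_id as the docs suggest, while the names rows are inserted without an id so the database assigns one.

```python
import sqlite3


def add_names(con, tax_id, names):
    """Insert (tax_name, name_class) pairs for tax_id and return the
    ids the database generated. Hypothetical helper, not part of taxtastic."""
    ids = []
    for tax_name, name_class in names:
        # No value is supplied for names.id; the database assigns it.
        cur = con.execute(
            "INSERT INTO names (tax_id, tax_name, name_class) VALUES (?, ?, ?)",
            (tax_id, tax_name, name_class),
        )
        ids.append(cur.lastrowid)
    return ids


con = sqlite3.connect(":memory:")

# Assumed, simplified schema: nodes.tax_id is text, names.id is an
# auto-incremented integer that nothing outside the database refers to.
con.executescript("""
CREATE TABLE nodes (
    tax_id    TEXT PRIMARY KEY,
    parent_id TEXT,
    rank      TEXT
);
CREATE TABLE names (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    tax_id     TEXT REFERENCES nodes(tax_id),
    tax_name   TEXT,
    name_class TEXT
);
""")

# Local node with a letter-prefixed tax_id; the parent id is a placeholder.
con.execute("INSERT INTO nodes VALUES ('AA1', 'PARENT', 'species')")

new_ids = add_names(
    con,
    "AA1",
    [
        ("Lactobacillus borisii", "scientific name"),
        ("Lactobacillus borisi", "misspelling"),
    ],
)
con.commit()
```

In this sketch, local names are identified by their tax_id and tax_name rather than by names.id, so the integer key never has to be chosen by hand.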

nhoffman commented 6 years ago

Closing this. Please reopen if you want to discuss further.