CSCfi / metadata-submitter

Metadata Submission Interface for SDA
https://metadata-submitter.rtfd.io
MIT License

create autocomplete API for NCBI taxonomy browser #281

Open blankdots opened 2 years ago

blankdots commented 2 years ago

Description

Issue /CSCfi/metadata-submitter-frontend/issues/418 requires an API to call. However, NCBI does not provide an easy way to query their API, so we need to build our own and create a mechanism to keep it up to date.

The NCBI API does not seem to offer an XML or JSON response, so we might need to build our own dataset from the FTP dump: https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/new_taxdump/

(information here: https://ncbiinsights.ncbi.nlm.nih.gov/2018/02/22/new-taxonomy-files-available-with-lineage-type-and-host-information/ )
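A rough sketch of how the dump could be turned into a lookup table, assuming the `names.dmp` layout described in the taxdump readme (fields separated by `\t|\t`, rows ending in `\t|`, columns `tax_id`, `name_txt`, `unique name`, `name class`). The function name `parse_names_dmp` is hypothetical and the layout should be checked against the readme shipped with the dump before relying on it.

```python
from typing import Dict, Iterable


def parse_names_dmp(lines: Iterable[str]) -> Dict[int, str]:
    """Map each taxonomy ID to its scientific name (hypothetical helper)."""
    names: Dict[int, str] = {}
    for line in lines:
        # Assumed layout: rows end with "\t|" and separate fields with "\t|\t".
        row = line.rstrip("\n")
        if row.endswith("\t|"):
            row = row[:-2]
        fields = row.split("\t|\t")
        if len(fields) < 4:
            continue  # skip rows that do not match the assumed layout
        tax_id, name_txt, _unique_name, name_class = fields[:4]
        # One taxon has many names (synonyms, common names); keep one per ID.
        if name_class == "scientific name":
            names[int(tax_id)] = name_txt
    return names


# Inline sample rows standing in for the real names.dmp file.
sample = [
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n",
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n",
    "10090\t|\tMus musculus\t|\t\t|\tscientific name\t|\n",
]
taxa = parse_names_dmp(sample)
```

With the real file this would be called as `parse_names_dmp(open("names.dmp", encoding="utf-8"))`. Keeping only scientific names is a simplification; an autocomplete that should also match common names would need to keep the other name classes.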

DoD (Definition of Done)

Autocomplete API to search by ID or taxon name is available for the front-end
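One possible shape for the search itself, as a minimal sketch: an in-memory index that does prefix matching on either the taxonomy ID or the taxon name. The class `TaxonIndex` and its methods are hypothetical, and the sample data is hand-written here rather than loaded from the dump. A real endpoint would wrap `search` in a request handler.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple


class TaxonIndex:
    """Prefix search over taxonomy IDs and names (hypothetical sketch)."""

    def __init__(self, taxa: Dict[int, str]) -> None:
        # Two sorted views so that a prefix lookup is a binary search.
        self._by_name = sorted((name.lower(), tid, name) for tid, name in taxa.items())
        self._by_id = sorted((str(tid), tid, name) for tid, name in taxa.items())

    @staticmethod
    def _prefix_scan(entries: list, prefix: str, limit: int) -> List[Tuple[int, str]]:
        # (prefix,) sorts before every entry whose key starts with prefix.
        start = bisect_left(entries, (prefix,))
        hits: List[Tuple[int, str]] = []
        for i in range(start, len(entries)):
            key, tax_id, name = entries[i]
            if not key.startswith(prefix) or len(hits) >= limit:
                break
            hits.append((tax_id, name))
        return hits

    def search(self, query: str, limit: int = 10) -> List[Tuple[int, str]]:
        q = query.strip().lower()
        if not q:
            return []
        # Purely numeric input is treated as an ID prefix, anything else as a name.
        entries = self._by_id if q.isdigit() else self._by_name
        return self._prefix_scan(entries, q, limit)


index = TaxonIndex({
    9606: "Homo sapiens",
    9605: "Homo",
    10090: "Mus musculus",
    9598: "Pan troglodytes",
})
by_name = index.search("homo")
by_id = index.search("96")
```

Prefix matching keeps the lookup cheap, but it will not find "sapiens" on its own. If the front-end needs substring or fuzzy matching, a different index would be required.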

Testing

Unit and integration testing

blankdots commented 2 years ago

examples on how to build:

csc-felipe commented 2 years ago

Samples have a taxonomy ID, but the ID is not human-readable, so this would give scientists the human-readable name for the taxonomy ID.

csc-felipe commented 2 years ago

For EGA, only human taxonomy items are necessary, and they could be included with the code, but others need to be searchable.
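A small sketch of that split, assuming the human entry is shipped as a constant and everything else goes through a searchable backend. `BUNDLED_TAXA`, `resolve_name` and the `fallback` callable are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Optional

# Entries shipped with the code; only the human taxon is listed here.
BUNDLED_TAXA: Dict[int, str] = {9606: "Homo sapiens"}


def resolve_name(
    tax_id: int,
    fallback: Optional[Callable[[int], Optional[str]]] = None,
) -> Optional[str]:
    """Return the bundled name, else ask the fallback lookup if one is given."""
    name = BUNDLED_TAXA.get(tax_id)
    if name is None and fallback is not None:
        name = fallback(tax_id)
    return name


# Stand-in for a lookup backed by the taxonomy dump.
other_taxa = {10090: "Mus musculus"}
human = resolve_name(9606)
mouse = resolve_name(10090, fallback=other_taxa.get)
unknown = resolve_name(12345)
```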

blankdots commented 2 years ago

For CSV submissions we currently hard-code the taxonomy ID to 9606 (Homo sapiens): https://github.com/CSCfi/metadata-submitter/blob/develop/metadata_backend/helpers/parser.py#L466-L470