ropensci / taxa

taxonomic classes for R
https://docs.ropensci.org/taxa

ENTREZ API key for NCBI database in extract_tax_data #135

Closed: VGalata closed this issue 6 years ago

VGalata commented 6 years ago

I use taxa::extract_tax_data to retrieve the taxonomy of BLAST search hits from the NCBI database. It worked well until last week, but now it fails with a message saying that I need to provide an ENTREZ API key. Unfortunately, I could not find a parameter for an API key in taxa::extract_tax_data.

Code:

id_str <- 'gi|444304248|ref|NR_074674.1|'
regex_key = list(
    regex='^[a-z]+\\|([0-9]+)\\|[a-z]+\\|(.*)\\|',
    key=c('info', 'seq_id')
)
taxa::extract_tax_data(id_str, regex=regex_key$regex, key=regex_key$key, database='ncbi')

Output:

No ENTREZ API key provided
See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
Error: Bad Request (HTTP 400)

Version used: taxa_0.2.0.9113

Thank you in advance!

sckott commented 6 years ago

Thanks for your question, @VGalata.

Get an NCBI API key at xxx

Then store the key in an environment variable named ENTREZ_KEY. You can do that for your current session with Sys.setenv(ENTREZ_KEY = "yourkey"), or permanently across all R sessions (the preferred way) by putting an entry like ENTREZ_KEY=yourkey in your .Renviron file or a similar file such as .bash_profile or .zshrc (not sure what the file is called on Windows; see ?Startup for more info).
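
For reference, here is a minimal sketch of the session-level approach, with a quick check that the variable is actually visible to R. The value "yourkey" is a placeholder, not a real key, and the check only confirms that the variable is set, not that NCBI accepts it.

```r
# Placeholder value; substitute the key issued for your NCBI account.
# To make this permanent, put the line ENTREZ_KEY=yourkey in ~/.Renviron
# instead and restart R.
Sys.setenv(ENTREZ_KEY = "yourkey")

# Confirm the variable is visible to the current R session.
key <- Sys.getenv("ENTREZ_KEY")
if (nzchar(key)) {
  message("ENTREZ_KEY is set (", nchar(key), " characters)")
} else {
  message("ENTREZ_KEY is not set")
}
```

If the message reports that the variable is not set after editing .Renviron, the most likely cause is that R has not been restarted since the file was changed.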

zachary-foster commented 6 years ago

Thanks for the report. It should work without a key as well, but I just checked and it does not, for some reason. We might be making two back-to-back requests to different services without the 0.3 sec pause somewhere, perhaps in the code I recently changed in taxize. I will look into it.
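
To illustrate the kind of pause described here, below is a rough sketch of a limiter shared by two request functions, so that back-to-back calls to different services still end up at least 0.3 seconds apart. This is not how taxize or taxa implement it; make_limiter, wait_turn, service_a and service_b are hypothetical names, and the "services" only record the time at which they were called.

```r
# Hypothetical helper: returns a function that blocks until at least
# `min_gap` seconds have passed since the previous call.
make_limiter <- function(min_gap = 0.3) {
  last_call <- -Inf
  function() {
    wait <- min_gap - (as.numeric(Sys.time()) - last_call)
    if (wait > 0) Sys.sleep(wait)
    last_call <<- as.numeric(Sys.time())
    invisible(NULL)
  }
}

# One limiter shared by every function that talks to the same API.
wait_turn <- make_limiter(0.3)

# Stand-ins for two different services; a real version would send
# an HTTP request after waiting its turn.
service_a <- function() { wait_turn(); as.numeric(Sys.time()) }
service_b <- function() { wait_turn(); as.numeric(Sys.time()) }

# Call the two services back to back and inspect the spacing.
stamps <- c(service_a(), service_b(), service_a())
gaps <- diff(stamps)
print(round(gaps, 2))
```

The point of sharing one limiter is that the pause applies across services; giving each service its own timer would still allow two requests to go out in quick succession.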

VGalata commented 6 years ago

@sckott Thank you very much for the advice! Setting the environment variable (in my case in the .bashrc file) solved the issue.

@zachary-foster Thanks for looking into it!