krassowski / easy-entrez

Retrieve PubMed articles, text-mining annotations, or molecular data from >35 Entrez databases via easy to use Python package - built on top of Entrez E-utilities API.
https://easy-entrez.readthedocs.io/en/latest/
GNU Lesser General Public License v3.0

ReadTimeout: API times out with easy_entrez #1

Closed ahmedibatta closed 3 years ago

ahmedibatta commented 3 years ago

I used easy-entrez to get gene names from SNP IDs. I have a large dataset of 7 million SNPs. I tried with just 4,000 of them (in a for loop, 1,000 at a time), and it raised an error in the last iteration:

HTTPSConnectionPool(host='eutils.ncbi.nlm.nih.gov', port=443): Read timed out. (read timeout=10)

How can I solve this problem?

krassowski commented 3 years ago

You need to increase the timeout, as the default of 10 seconds is apparently too short for this many SNPs. To do so, pass a timeout argument to EntrezAPI, like so:

from easy_entrez import EntrezAPI

entrez_api = EntrezAPI(
    'your-tool-name',
    'e@mail.com',
    timeout=10 * 60,    # 10 minutes
    # other arguments here
)

The timeout argument is described in the documentation: https://easy-entrez.readthedocs.io/en/latest/usage.html#easy_entrez.api.EntrezAPI

krassowski commented 3 years ago

I will close this one as answered, but please feel free to ask any follow-up questions as new issues. I hope you enjoy this package :)