saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

Possible missing keys in esearch response results #194

Open andrewdavidsmith opened 1 year ago

andrewdavidsmith commented 1 year ago

https://github.com/saketkc/pysradb/blob/0286ba9cf3d0f6e97b6d08baccf0bd36e11c650c/pysradb/sraweb.py#L340

I can't reproduce this, but I hit the following error once, seemingly at random:

File "/Users/aaaaaa/.venv/lib/python3.11/site-packages/pysradb/sraweb.py", line 339, in get_efetch_response
    n_records = int(esearch_response["esearchresult"]["count"])
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
KeyError: 'count'

The line numbers are off by one from the current repo, so my installed version may be slightly behind. Rerunning immediately afterwards worked fine.
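Since the failure disappeared on an immediate rerun, one possible caller-side workaround is to retry when the response lacks `count`. The following is only a minimal sketch, not pysradb's actual code: `count_with_retry` and the `fetch_esearch` callable are hypothetical names, and the only thing taken from the traceback is the `esearch_response["esearchresult"]["count"]` lookup.

```python
import time


def count_with_retry(fetch_esearch, retries=3, delay=0.0):
    """Return the record count from an esearch-style response, retrying if it is missing.

    `fetch_esearch` is a hypothetical zero-argument callable that returns the
    parsed JSON response as a dict. It stands in for whatever performs the request.
    """
    last_response = None
    for attempt in range(retries):
        last_response = fetch_esearch()
        # Use .get() so a response without "esearchresult" does not raise KeyError.
        result = last_response.get("esearchresult", {})
        if "count" in result:
            return int(result["count"])
        # Back off a little longer on each failed attempt, but not after the last one.
        if attempt < retries - 1:
            time.sleep(delay * (attempt + 1))
    raise RuntimeError(
        f"esearch response had no 'count' after {retries} attempts: {last_response!r}"
    )
```

Raising a descriptive error after the final attempt keeps the unexpected response visible, which would help with diagnosing what the server actually returned.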

dpbastedo commented 1 year ago

I have encountered a similar error when using `pysradb metadata <accession list> --detailed > SRA_info.txt` to query several accessions in one batch:
File "/home/[user]/.conda/envs/pySRAdb/lib/python3.11/site-packages/pysradb/sraweb.py", line 437, in sra_metadata
    exp_summary = exp_json["Summary"]
KeyError: 'Summary'

In this case the error is caused by including SRR8866477 in the accession list, presumably because that record is not public. It would be preferable to return NaN instead of raising an error here, so that batch queries still work even when some keys are missing for individual accessions in the batch.
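The NaN fallback suggested here could look roughly like the sketch below. It is an illustration under assumptions, not a patch against pysradb: `summarize_batch` and the shape of `records` (a dict mapping each accession to its parsed experiment JSON) are hypothetical, and only the `"Summary"` key comes from the traceback.

```python
import math


def summarize_batch(records, fields=("Summary",)):
    """Build one row per accession, using NaN for any field that is absent.

    `records` is a hypothetical mapping of accession -> parsed experiment dict.
    """
    rows = []
    for accession, exp_json in records.items():
        row = {"accession": accession}
        for field in fields:
            # dict.get() avoids the KeyError and substitutes NaN instead.
            row[field] = exp_json.get(field, math.nan)
        rows.append(row)
    return rows


# "ACC_PUBLIC" is a placeholder; the second entry mimics a record with no "Summary".
example = {
    "ACC_PUBLIC": {"Summary": {"Title": "example"}},
    "SRR8866477": {},
}
rows = summarize_batch(example)
```

With this approach, a non-public or incomplete record yields a row with NaN in it rather than aborting the whole batch, and the affected accessions can be filtered out afterwards.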