Open Maarten-vd-Sande opened 4 years ago
Thanks for creating a reproducible example. My guess is that a long list of IDs is causing SRA to time out. I would suggest processing it in batches, just as you have done, while I figure out whether it can be fixed.
I ran into the same problem and also solved it with the same approach (iterating over chunks of the accession list).
For querying, this seems to be implemented for SraSearch already:
https://github.com/saketkc/pysradb/blob/c23d4a769543d05a0f002d1b28c985da5963573f/pysradb/search.py#L757-L760
Would it make sense to do the same for SRAweb as well? It seems like all terms are simply joined so far:
https://github.com/saketkc/pysradb/blob/c23d4a769543d05a0f002d1b28c985da5963573f/pysradb/sraweb.py#L252-L253
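For anyone hitting this in the meantime, the chunking workaround described above can be sketched roughly as follows. This is only a minimal illustration, not pysradb's own code: `chunked`, `fetch_in_batches`, the `fetch` callable, and the default batch size of 100 are all hypothetical names and values chosen for the example, and `fetch` is assumed to return a list of records for each batch.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_in_batches(
    accessions: Sequence[str],
    fetch: Callable[[List[str]], List[dict]],
    batch_size: int = 100,  # arbitrary value for illustration, not a known SRA limit
) -> List[dict]:
    """Call `fetch` once per batch and concatenate the returned records.

    `fetch` is a hypothetical stand-in for whatever function performs
    the actual metadata request for a list of accessions.
    """
    records: List[dict] = []
    for batch in chunked(accessions, batch_size):
        records.extend(fetch(batch))
    return records
```

If the fetch function returns a pandas DataFrame per batch rather than a list, the per-batch results would need to be collected and combined with `pandas.concat` instead of `list.extend`.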
Describe the bug
My colleague @Rebecza is trying to download a single-cell ATAC-seq dataset and uses pysradb to get some metadata (seq2science), and managed to find a JsonDecodeError :bug: . It's a list of approximately 750 ENA samples. The strange thing is that the JsonDecodeError appears with the full list, but when the list is split up into smaller lists it seems to work...
To Reproduce
I put it on Colab; I'm not sure if the link is working: https://colab.research.google.com/drive/1bC2WiA63JJnWYZew0pk6iovk537vQzaU?usp=sharing