saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

Specific srp seems to cause metadata download to freeze #218

Open nasjr08 opened 3 months ago

nasjr08 commented 3 months ago

I ran the following code:

pysradb metadata --detailed SRP131661

I got neither an error message nor any output.

Running the following does work, however: pysradb metadata SRP131661

When I retrieve the SRX IDs with the srp-to-srx command and pass them instead, the command works fine up to and including the 28th SRX ID.

Including the 29th SRX ID leads to the same problem as above. Excluding the 29th ID but running the rest also causes a timeout. Passing IDs 1-28 plus the last SRX ID produces output with 2638 rows, the majority of which do not belong to the study of interest. I have attached the four files to demonstrate what I mean.

This is really odd behaviour. What is the explanation for this?

srp-to-srx 1-28&46.txt srp-to-srx.txt output_metadata_detailed.txt get_metadata_detailed.txt
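
A minimal sketch of how the stray rows in such an output could be counted. It assumes the saved metadata is tab-separated and has a study_accession column; both are assumptions about the file, not something confirmed here. The helper name rows_outside_study and the sample rows are hypothetical placeholders, not real records.

```python
import csv
import io


def rows_outside_study(tsv_text, study_id, column="study_accession"):
    """Return the rows whose study column differs from study_id.

    Assumes tab-separated text with a header line; the column name is
    an assumption and may need adjusting to the actual file.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row.get(column) != study_id]


# Toy input with placeholder accessions (not real data).
sample = (
    "study_accession\texperiment_accession\n"
    "SRP_WANTED\tSRX_A\n"
    "SRP_OTHER\tSRX_B\n"
    "SRP_WANTED\tSRX_C\n"
)

stray = rows_outside_study(sample, "SRP_WANTED")
print(len(stray), "row(s) do not belong to the requested study")
```

For a real file, the text could be read with open(path).read() and the study ID replaced with SRP131661.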

UPDATE: I now ran with 4 SRX IDs (SRX3607048, SRX3791765, SRX3791771, SRX3791772) and again I get over 1900 rows. All but the last 4 have an experiment_accession of SRX3607048. Again, the results appear to be very odd.

liver_samples_metadata_detailed.txt

selected_samples_metadata_detailed.txt
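
The row counts per experiment_accession described in the update can be tallied with a short script like the one below. This is a sketch that assumes the attached output is tab-separated with a header line; the function name rows_per_experiment and the sample rows are hypothetical placeholders.

```python
import csv
import io
from collections import Counter


def rows_per_experiment(tsv_text, column="experiment_accession"):
    """Count how many rows each experiment accession contributes.

    Assumes tab-separated text with a header line containing `column`.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[column] for row in reader)


# Toy input with placeholder accessions (not real data).
sample = (
    "experiment_accession\trun_accession\n"
    "SRX_A\tSRR_1\n"
    "SRX_A\tSRR_2\n"
    "SRX_A\tSRR_3\n"
    "SRX_B\tSRR_4\n"
)

counts = rows_per_experiment(sample)
for accession, n in counts.most_common():
    print(accession, n)
```

An accession that accounts for far more rows than the others would stand out at the top of this listing.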