saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

[BUG] gse_to_srp returns an error in Python API #187

Closed. ajandria closed this issue 1 year ago

ajandria commented 1 year ago

Describe the bug: gse_to_srp returns an error when called through the Python API.

File ~/opt/anaconda3/envs/fetch_study_sra_metadata/lib/python3.11/site-packages/pandas/core/frame.py:708, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    702     mgr = self._init_mgr(
    703         data, axes={"index": index, "columns": columns}, dtype=dtype, copy=copy
    704     )
    706 elif isinstance(data, dict):
    707     # GH#38939 de facto copy defaults to False only in non-dict cases
...
    658     raise ValueError(
    659         "Mixing dicts with non-Series may lead to ambiguous ordering."
    660     )

ValueError: All arrays must be of the same length
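For readers unfamiliar with this pandas error, here is a minimal sketch of how it typically arises. It assumes (this is not confirmed by the truncated traceback) that the failing call ends up passing a dict of plain lists with different lengths to the DataFrame constructor. The column names, placeholder values, and the helper `try_build` are hypothetical and are not taken from pysradb.

```python
import pandas as pd


def try_build(columns):
    """Return a DataFrame, or the ValueError message if construction fails."""
    try:
        return pd.DataFrame(columns)
    except ValueError as err:
        return str(err)


# Hypothetical ragged input: one alias but two accession placeholders.
ragged = {
    "study_alias": ["GSE_PLACEHOLDER"],
    "study_accession": ["SRP_PLACEHOLDER_A", "SRP_PLACEHOLDER_B"],
}

# Plain lists of unequal length cannot be aligned, so pandas raises ValueError.
result = try_build(ragged)
print(result)

# Wrapping each list in a Series lets pandas align on the index and
# fill the shorter column with NaN instead of raising.
padded = pd.DataFrame({key: pd.Series(values) for key, values in ragged.items()})
print(padded.shape)
```

Whether padding is the right remedy depends on what the missing values mean; the sketch only shows why the constructor rejects ragged input.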

To Reproduce: steps to reproduce the behavior:

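# Import the web-backed SRA client
from pysradb.sraweb import SRAweb

db = SRAweb()
# Map the GEO series accession to its SRA project; this call raises the ValueError above
df = db.gse_to_srp("GSE200028")
df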

Additional context: I think this has already been mentioned in https://github.com/saketkc/pysradb/issues/186. The conversion does work from the CLI, but the Python API throws an error. I'm building a Jupyter Notebook for metadata curation, so I would like to use it via the Python API.

saketkc commented 1 year ago

Thanks, commit https://github.com/saketkc/pysradb/commit/edd5bd158d78e63bc2310706e9da4847512466a9 should fix it. You can install the development version with: pip install git+https://github.com/saketkc/pysradb