saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

[BUG] The error arises from setting a deprecated value for the "display.max_colwidth" option in pandas. #189

Status: Closed (xuzhougeng closed this issue 1 year ago)

xuzhougeng commented 1 year ago

Describe the bug

pysradb download -y -t 8 --out-dir ./pysradb_downloads -p SRP000941
Checking download URLs
The following files will be downloaded: 

Traceback (most recent call last):
  File "/miniconda3/envs/rna-seq/bin/pysradb", line 10, in <module>
    sys.exit(parse_args())
  File "/miniconda3/envs/rna-seq/lib/python3.10/site-packages/pysradb/cli.py", line 1189, in parse_args
    download(
  File "//miniconda3/envs/rna-seq/lib/python3.10/site-packages/pysradb/cli.py", line 125, in download
    sradb.download(
  File "//miniconda3/envs/rna-seq/lib/python3.10/site-packages/pysradb/sradb.py", line 1538, in download
    pd.set_option("display.max_colwidth", -1)
  File "/h/miniconda3/envs/rna-seq/lib/python3.10/site-packages/pandas/_config/config.py", line 261, in __call__
    return self.__func__(*args, **kwds)
  File "/miniconda3/envs/rna-seq/lib/python3.10/site-packages/pandas/_config/config.py", line 160, in _set_option
    o.validator(v)
  File "//miniconda3/envs/rna-seq/lib/python3.10/site-packages/pandas/_config/config.py", line 882, in is_nonnegative_int
    raise ValueError(msg)
ValueError: Value must be a nonnegative integer or None

To Reproduce

Run the command above with the latest pandas installed (2.0.0 here):

conda list | grep pandas
pandas                    2.0.0           py310h9b08913_0    conda-forge

Additional context

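pandas 2.0 no longer accepts -1 for "display.max_colwidth"; None is the supported way to disable column truncation. Until pysradb itself is updated, the installed source can be patched by hand:

a. Locate the sradb.py file in your environment. In this case, it is at /miniconda3/envs/rna-seq/lib/python3.10/site-packages/pysradb/sradb.py.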

b. Open the file in a text editor, and search for the line:

pd.set_option("display.max_colwidth", -1)

c. Replace that line with:

pd.set_option("display.max_colwidth", None)

d. Save the file and try running the command again.
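
The manual patch in steps a to d can also be written as a small version-tolerant helper. This is a minimal sketch, not the change that was made in pysradb: the function name set_unlimited_colwidth is hypothetical, and the fallback assumes that pandas releases before 1.0 reject None and still expect -1. The setter is passed in as an argument so the helper can be exercised without importing pandas.

```python
from typing import Any, Callable


def set_unlimited_colwidth(set_option: Callable[[str, Any], Any]) -> Any:
    """Disable column-width truncation and return the value that was accepted.

    `set_option` is expected to behave like pandas.set_option: it takes an
    option name and a value, and raises ValueError (or TypeError) when the
    value fails validation.
    """
    option = "display.max_colwidth"
    try:
        # Supported spelling in current pandas: None means "no limit".
        set_option(option, None)
        return None
    except (ValueError, TypeError):
        # Assumption: only very old pandas releases reach this branch,
        # where -1 was the way to request unlimited width.
        set_option(option, -1)
        return -1
```

With pandas installed, the call would be set_unlimited_colwidth(pd.set_option) after import pandas as pd.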

saketkc commented 1 year ago

For downloading, I would recommend saving the metadata first (pysradb metadata SRPXXX --saveto x.tsv) and then using a tool like wget or curl to fetch the files, with the links drawn from the appropriate column of that table.
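
The workaround above could be sketched as follows, using only the Python standard library. This is an illustration under assumptions rather than part of pysradb: the helper names extract_urls and write_url_list are hypothetical, and the column that holds the download links depends on how the metadata was generated, so it is passed in explicitly instead of being hard-coded.

```python
import csv
from pathlib import Path


def extract_urls(tsv_path, column):
    """Return the non-empty values of `column` from a tab-separated file."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        fieldnames = reader.fieldnames or []
        if column not in fieldnames:
            raise KeyError(f"column {column!r} not found; available: {fieldnames}")
        urls = []
        for row in reader:
            # Short rows yield None for missing fields; treat them as empty.
            value = (row[column] or "").strip()
            if value:
                urls.append(value)
    return urls


def write_url_list(urls, out_path):
    """Write one URL per line, the format that `wget -i` reads."""
    Path(out_path).write_text("\n".join(urls) + "\n")
```

After inspecting the header of x.tsv to pick the link column, write_url_list(extract_urls("x.tsv", column), "urls.txt") produces a file that can be handed to wget -i urls.txt.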

saketkc commented 1 year ago

Thanks! This is now fixed on the develop branch.