saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License

download fails for SRP125265 #40

Closed: et-jaynes-in-a-banana-suit closed this issue 4 years ago

et-jaynes-in-a-banana-suit commented 4 years ago

Running the following example from the README (with a different SRP):

pysradb metadata SRP125265 --assay | grep 'study\|RNAseq' | pysradb download

produces this error:

Traceback (most recent call last):
  File "/home/dfeldman/.conda/envs/df-pyr/bin/pysradb", line 10, in <module>
    sys.exit(parse_args())
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pysradb/cli.py", line 1044, in parse_args
    download(args.out_dir, args.db, args.srx, args.srp, args.skip_confirmation)
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pysradb/cli.py", line 134, in download
    protocol=protocol,
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pysradb/sradb.py", line 1275, in download
    + ".sra"
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pandas/core/generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pandas/core/accessor.py", line 187, in __get__
    accessor_obj = self._accessor(obj)
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pandas/core/strings.py", line 2039, in __init__
    self._inferred_dtype = self._validate(data)
  File "/home/dfeldman/.conda/envs/df-pyr/lib/python3.7/site-packages/pandas/core/strings.py", line 2096, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!

This is possibly failing because sample_title is blank and whitespace is not being handled correctly in cli.download. The downloaded metadata loads fine with pd.read_fwf.
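For readers unfamiliar with this pandas error, here is a minimal sketch of how an entirely blank column can trigger it. This is not pysradb's code: the tab-separated text, the run accessions, and the choice of `sample_title` as the affected column are assumptions made for illustration, and the thread does not confirm which column actually failed.

```python
import io

import pandas as pd

# Hypothetical metadata table: sample_title is blank on every row.
TSV = "run_accession\tsample_title\nSRR0000001\t\nSRR0000002\t\n"

# With default type inference, an all-blank column is read as float NaN,
# so the .str accessor refuses to operate on it.
inferred = pd.read_csv(io.StringIO(TSV), sep="\t")
try:
    inferred["sample_title"].str.strip()
    error_message = None
except AttributeError as err:
    error_message = str(err)

# Reading every column as text and keeping blanks as empty strings
# avoids the problem in this toy case.
as_text = pd.read_csv(io.StringIO(TSV), sep="\t", dtype=str, keep_default_na=False)
titles = as_text["sample_title"].str.strip().tolist()
filenames = (as_text["run_accession"] + ".sra").tolist()

print(error_message)
print(titles, filenames)
```

The same reasoning applies to any column that pandas infers as numeric, so a blank title is only one of several ways to reach this error.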

saketkc commented 4 years ago

Hi @et-jaynes-in-a-banana-suit, thanks for the bug report. --detailed is required for download to work. The docs need updating, but this should work:

pysradb metadata SRP125265 --detailed | grep 'study\|RNAseq' | pysradb download

Note that --assay is redundant in the SRAweb mode.

et-jaynes-in-a-banana-suit commented 4 years ago

Hmm, I tried that exact command and I'm still getting AttributeError: Can only use .str accessor with string values!

saketkc commented 4 years ago

Are you using SRAmetadb.sqlite? Here's a notebook showing its usage: https://colab.research.google.com/drive/1CPruUgFYBE3L7MyLg9exWzON8dv_3QxQ

et-jaynes-in-a-banana-suit commented 4 years ago

No, I'm not using the local database (it's rather large).

saketkc commented 4 years ago

What version of pysradb are you using?

saketkc commented 4 years ago

I was not very clear in my previous comment, but the Colab notebook shows an example of the SRAweb mode: https://colab.research.google.com/drive/1CPruUgFYBE3L7MyLg9exWzON8dv_3QxQ

et-jaynes-in-a-banana-suit commented 4 years ago

pysradb==0.9.7, full output of pip freeze below

asn1crypto==1.3.0
attrs==19.3.0
backcall==0.1.0
better-exceptions==0.2.2
biopython==1.76
bleach==3.1.4
blosc==1.7.0
certifi==2020.4.5.1
cffi==1.14.0
chardet==3.0.4
cryptography==2.8
cycler==0.10.0
decorator==4.4.1
defusedxml==0.6.0
entrypoints==0.3
idisplay==0.1.2
idna==2.9
importlib-metadata==1.6.0
ipykernel==5.1.4
ipython==7.12.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
jedi==0.16.0
Jinja2==2.11.1
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==6.1.0
jupyter-contrib-core==0.3.3
jupyter-contrib-nbextensions==0.5.1
jupyter-core==4.6.1
jupyter-highlight-selected-word==0.2.0
jupyter-latex-envs==1.4.6
jupyter-nbextensions-configurator==0.4.1
kiwisolver==1.1.0
lxml==4.5.0
MarkupSafe==1.1.1
matplotlib==3.1.3
mistune==0.8.4
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
natsort==7.0.1
nbconvert==5.6.1
nbformat==5.0.5
notebook==6.0.3
numpy==1.18.1
pandas==1.0.1
pandocfilters==1.4.2
parso==0.6.1
pdb-tools==2.0.1
pexpect==4.8.0
pickleshare==0.7.5
prometheus-client==0.7.1
prompt-toolkit==3.0.3
ptyprocess==0.6.0
py3Dmol==0.8.0
pycparser==2.20
Pygments==2.5.2
pyOpenSSL==19.1.0
pyparsing==2.4.6
pyrosetta==2020.8+release.cb1caba
pyrsistent==0.16.0
PySocks==1.7.1
pysradb==0.9.7
pyteomics==4.2
python-dateutil==2.8.1
python-Levenshtein==0.12.0
pytz==2019.3
PyYAML==5.3.1
pyzmq==18.1.1
qtconsole==4.7.2
QtPy==1.9.0
requests==2.23.0
scipy==1.4.1
seaborn==0.10.0
Send2Trash==1.5.0
six==1.14.0
SQLAlchemy==1.3.13
terminado==0.8.3
testpath==0.4.4
tornado==6.0.3
tqdm==4.44.1
traitlets==4.3.3
urllib3==1.25.8
wcwidth==0.1.8
webencodings==0.5.1
widgetsnbextension==3.5.1
xlrd==1.2.0
xmltodict==0.12.0
zipp==3.1.0

saketkc commented 4 years ago

Can you upgrade to 0.10.3 (preferably in a clean virtualenv or conda environment)?

pip install pysradb==0.10.3
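When checking whether an installed version is older than 0.10.3, note that comparing version strings directly can mislead, since "0.9.7" sorts after "0.10.3" as text. The sketch below compares the numeric components instead. The helper names are hypothetical, it only handles plain dotted numeric versions, and 0.10.3 is simply the version that worked in this thread, not a documented minimum. On Python 3.8+ the installed version string could be obtained with `importlib.metadata.version("pysradb")`.

```python
def version_tuple(version):
    """Split a plain dotted version such as '0.10.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, minimum="0.10.3"):
    """Return True if the installed version is older than the minimum."""
    return version_tuple(installed) < version_tuple(minimum)


# Plain string comparison gets this wrong because '9' sorts after '1'.
naive_result = "0.9.7" < "0.10.3"

print(needs_upgrade("0.9.7"), naive_result)
```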

et-jaynes-in-a-banana-suit commented 4 years ago

That's working! I got the previous version from conda -c bioconda today.

saketkc commented 4 years ago

Thanks for letting me know! Reminds me I need to get https://github.com/bioconda/bioconda-recipes/pull/21135 to build.

Closing this as it seems to be fixed. Please feel free to reopen if you face any issues.