fatiando / pooch

A friend to fetch your data files
https://www.fatiando.org/pooch

Problem accessing public files on Dropbox #362

Closed tloredo closed 1 year ago

tloredo commented 1 year ago

Thank you for Pooch! I'd appreciate any pointers regarding this problem. Perhaps I need to use the Dropbox API for this purpose, but it's peculiar that wget has no problem with the URLs that Pooch is having trouble with.

This may be a feature request rather than an actual bug (re: Dropbox support).

Description of the problem:

I'm working with a student on a Python package that includes large data files. When the data files are finalized, we'll put them somewhere fairly permanent, like Zenodo or Dataverse. While we're revising things, though, I've put the files on Dropbox. When Pooch retrieves them via their public Dropbox URLs, it gets HTML files with a "Couldn't preview this file" message instead of the data (.h5 files). Yet wget, given the same URLs we're providing to Pooch, retrieves the binary .h5 files with no problem.

Full code that generated the error

import pooch

fetcher = pooch.create(
    path=pooch.os_cache("SOAP2-1Spot"),
    base_url="https://www.dropbox.com/sh/nnp8u5mwwgrpoex/AAD6dESXL_WCu44sBGDRuYC9a?dl=0",
    registry={
        "lambda-3923-4010-phases-100.h5": None,
        "lambda-3923-6664-phases-4.h5": None,
    },
    # Now specify custom URLs for some of the files in the registry.
    urls={
        "lambda-3923-4010-phases-100.h5": "https://www.dropbox.com/s/5b9m1pq5qif5obf/lambda-3923-4010-phases-100.h5?dl=0",
        "lambda-3923-6664-phases-4.h5": "https://www.dropbox.com/s/pyeapovhk4q6az0/lambda-3923-6664-phases-4.h5?dl=0",
    },
)

# These paths end up pointing to files named as indicated, but containing HTML
# corresponding to a Dropbox "can't preview" response:
full_spec_path = fetcher.fetch("lambda-3923-6664-phases-4.h5")
ca_spec_path = fetcher.fetch("lambda-3923-4010-phases-100.h5")

Full error message

There is no error message; rather, Pooch triggers an attempted preview from Dropbox instead of accessing the actual data files.
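
One reason nothing fails loudly here: the registry entries above are set to None, which tells Pooch to skip hash verification, so whatever the server returns is accepted. Below is a minimal sketch, using only the standard library, of computing a SHA256 string for a known-good local copy of a file (for example, one retrieved with wget). The helper name `sha256_registry_entry` is hypothetical and not part of Pooch.

```python
import hashlib


def sha256_registry_entry(path, chunk_size=65536):
    """Return 'sha256:<hexdigest>' for the file at `path`.

    Hypothetical helper (not part of Pooch). The file is read in
    chunks so that large data files do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return f"sha256:{digest.hexdigest()}"
```

Putting the returned string in the registry in place of None should make Pooch reject a download whose hash does not match (such as an HTML preview page) instead of silently caching it. Pooch also provides `pooch.file_hash` for computing the hash of a local file.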

System information

Output of conda list

# packages in environment at /Users/loredo/opt/miniconda3/envs/eprv10:
#
# Name                    Version                   Build  Channel
alabaster                 0.7.13                   pypi_0    pypi
anyio                     3.6.2              pyhd8ed1ab_0    conda-forge
appnope                   0.1.3              pyhd8ed1ab_0    conda-forge
argon2-cffi               21.3.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0          py310h90acd4f_3    conda-forge
arrow-cpp                 11.0.0           h694c41f_5_cpu    conda-forge
asdf                      2.14.3             pyhd8ed1ab_1    conda-forge
asdf-standard             1.0.3              pyhd8ed1ab_0    conda-forge
asdf-transform-schemas    0.3.0              pyhd8ed1ab_0    conda-forge
asdf-unit-schemas         0.1.0              pyhd8ed1ab_0    conda-forge
astroid                   2.15.2                   pypi_0    pypi
astropy                   5.2.1           py310h936d966_0    conda-forge
asttokens                 2.2.1              pyhd8ed1ab_0    conda-forge
attrs                     22.2.0             pyh71513ae_0    conda-forge
aws-c-auth                0.6.24               h8d3cba3_5    conda-forge
aws-c-cal                 0.5.20               hcd718c4_6    conda-forge
aws-c-common              0.8.11               hb7f2c08_0    conda-forge
aws-c-compression         0.2.16               hdd9fac5_3    conda-forge
aws-c-event-stream        0.2.18               hc78af93_6    conda-forge
aws-c-http                0.7.4                h555b4b3_2    conda-forge
aws-c-io                  0.13.17              ha3dd007_2    conda-forge
aws-c-mqtt                0.8.6                h3ff12dd_6    conda-forge
aws-c-s3                  0.2.4                hcc306eb_3    conda-forge
aws-c-sdkutils            0.1.7                hdd9fac5_3    conda-forge
aws-checksums             0.1.14               hdd9fac5_3    conda-forge
aws-crt-cpp               0.19.7               h197cfde_7    conda-forge
aws-sdk-cpp               1.10.57              h6d45362_4    conda-forge
babel                     2.12.1                   pypi_0    pypi
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                pyhd8ed1ab_3    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
backports.zoneinfo        0.2.1           py310h2ec42d9_7    conda-forge
bcrypt                    3.2.2           py310h90acd4f_1    conda-forge
beautifulsoup4            4.11.2             pyha770c72_0    conda-forge
bleach                    6.0.0              pyhd8ed1ab_0    conda-forge
bottleneck                1.3.6           py310h936d966_0    conda-forge
brotli                    1.0.9                hb7f2c08_8    conda-forge
brotli-bin                1.0.9                hb7f2c08_8    conda-forge
brotlipy                  0.7.0           py310h90acd4f_1005    conda-forge
bzip2                     1.0.8                h0d85af4_4    conda-forge
c-ares                    1.18.1               h0d85af4_0    conda-forge
ca-certificates           2023.5.7             h8857fd0_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
certifi                   2023.5.7           pyhd8ed1ab_0    conda-forge
cffi                      1.15.1          py310ha78151a_3    conda-forge
cfgv                      3.3.1                    pypi_0    pypi
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
click                     8.1.3           unix_pyhd8ed1ab_2    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
colorcet                  3.0.1              pyhd8ed1ab_0    conda-forge
comm                      0.1.2              pyhd8ed1ab_0    conda-forge
contourpy                 1.0.7           py310ha23aa8a_0    conda-forge
copier                    7.1.0              pyhd8ed1ab_0    conda-forge
coverage                  7.2.3                    pypi_0    pypi
cryptography              39.0.2          py310hdd0c95c_0    conda-forge
curl                      7.88.1               h6df9250_0    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
debugpy                   1.6.6           py310h7a76584_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
deprecated                1.2.13                   pypi_0    pypi
dill                      0.3.6                    pypi_0    pypi
distlib                   0.3.6                    pypi_0    pypi
docutils                  0.18.1                   pypi_0    pypi
dunamai                   1.16.0             pyhd8ed1ab_0    conda-forge
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.1.0              pyhd8ed1ab_0    conda-forge
executing                 1.2.0              pyhd8ed1ab_0    conda-forge
expat                     2.5.0                hf0c8a7f_1    conda-forge
filelock                  3.11.0                   pypi_0    pypi
flit-core                 3.8.0              pyhd8ed1ab_0    conda-forge
fonttools                 4.38.0          py310h90acd4f_1    conda-forge
freetype                  2.12.1               h3f81eb7_1    conda-forge
funcy                     2.0                pyhd8ed1ab_0    conda-forge
gettext                   0.21.1               h8a4c099_0    conda-forge
gflags                    2.2.2             hb1e8313_1004    conda-forge
git                       2.40.0          pl5321h33fe9b8_1    conda-forge
glib                      2.74.1               hbc0c0cd_1    conda-forge
glib-tools                2.74.1               hbc0c0cd_1    conda-forge
glog                      0.6.0                h8ac2a54_0    conda-forge
gsl                       2.7                  h93259b0_0    conda-forge
gst-plugins-base          1.22.0               h37e1711_0    conda-forge
gstreamer                 1.22.0               h1d18e73_0    conda-forge
h5py                      3.8.0           nompi_py310h1de854f_101    conda-forge
hdf5                      1.14.0          nompi_hbf0aa07_102    conda-forge
html5lib                  1.1                pyh9f0ad1d_0    conda-forge
hypothesis                6.68.2             pyha770c72_0    conda-forge
icu                       70.1                 h96cf925_0    conda-forge
identify                  2.5.22                   pypi_0    pypi
idna                      3.4                pyhd8ed1ab_0    conda-forge
imagesize                 1.4.1                    pypi_0    pypi
importlib-metadata        6.0.0              pyha770c72_0    conda-forge
importlib-resources       5.12.0             pyhd8ed1ab_0    conda-forge
importlib_metadata        6.0.0                hd8ed1ab_0    conda-forge
importlib_resources       5.12.0             pyhd8ed1ab_0    conda-forge
iniconfig                 2.0.0                    pypi_0    pypi
ipykernel                 6.21.2             pyh736e0ef_0    conda-forge
ipython                   8.11.0             pyhd1c38e8_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                8.0.4              pyhd8ed1ab_0    conda-forge
isort                     5.12.0                   pypi_0    pypi
jedi                      0.18.2             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
jinja2-ansible-filters    1.3.2              pyhd8ed1ab_0    conda-forge
jmespath                  1.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9e                   hb7f2c08_3    conda-forge
jplephem                  2.18               pyh78acc04_0    conda-forge
jsonschema                4.17.3             pyhd8ed1ab_0    conda-forge
jupyter                   1.0.0           py310h2ec42d9_8    conda-forge
jupyter_client            8.0.3              pyhd8ed1ab_0    conda-forge
jupyter_console           6.6.2              pyhd8ed1ab_0    conda-forge
jupyter_core              5.2.0           py310h2ec42d9_0    conda-forge
jupyter_events            0.6.3              pyhd8ed1ab_0    conda-forge
jupyter_server            2.3.0              pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.4.4              pyhd8ed1ab_1    conda-forge
jupyterlab_pygments       0.2.2              pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        3.0.5              pyhd8ed1ab_0    conda-forge
kiwisolver                1.4.4           py310ha23aa8a_1    conda-forge
krb5                      1.20.1               h049b76e_0    conda-forge
lazy-object-proxy         1.9.0                    pypi_0    pypi
lcms2                     2.14                 h29502cd_1    conda-forge
lerc                      4.0.0                hb486fe8_0    conda-forge
libabseil                 20220623.0      cxx17_h844d122_6    conda-forge
libaec                    1.0.6                hf0c8a7f_1    conda-forge
libarrow                  11.0.0           h5c283cd_5_cpu    conda-forge
libblas                   3.9.0           16_osx64_openblas    conda-forge
libbrotlicommon           1.0.9                hb7f2c08_8    conda-forge
libbrotlidec              1.0.9                hb7f2c08_8    conda-forge
libbrotlienc              1.0.9                hb7f2c08_8    conda-forge
libcblas                  3.9.0           16_osx64_openblas    conda-forge
libclang                  13.0.1          default_h255f2f3_1    conda-forge
libcrc32c                 1.1.2                he49afe7_0    conda-forge
libcurl                   7.88.1               h6df9250_0    conda-forge
libcxx                    15.0.7               h71dddab_0    conda-forge
libdeflate                1.17                 hac1461d_0    conda-forge
libedit                   3.1.20191231         h0678c8f_2    conda-forge
libev                     4.33                 haf1e3a3_1    conda-forge
libevent                  2.1.10               h7d65743_4    conda-forge
libexpat                  2.5.0                hf0c8a7f_1    conda-forge
libffi                    3.4.2                h0d85af4_5    conda-forge
libgfortran               5.0.0           11_3_0_h97931a8_29    conda-forge
libgfortran5              12.2.0              he409387_29    conda-forge
libglib                   2.74.1               h4c723e1_1    conda-forge
libgoogle-cloud           2.7.0                hb5e37a9_1    conda-forge
libgrpc                   1.51.1               h1ddfa78_1    conda-forge
libiconv                  1.17                 hac89ed1_0    conda-forge
liblapack                 3.9.0           16_osx64_openblas    conda-forge
libllvm13                 13.0.1               h64f94b2_2    conda-forge
libnghttp2                1.52.0               he2ab024_0    conda-forge
libogg                    1.3.4                h35c211d_1    conda-forge
libopenblas               0.3.21          openmp_h429af6e_3    conda-forge
libopus                   1.3.1                hc929b4f_1    conda-forge
libpng                    1.6.39               ha978bb4_0    conda-forge
libpq                     15.2                 h3640bf0_0    conda-forge
libprotobuf               3.21.12              hbc0c0cd_0    conda-forge
libsodium                 1.0.18               hbcb3906_1    conda-forge
libsqlite                 3.40.0               ha978bb4_0    conda-forge
libssh2                   1.10.0               h47af595_3    conda-forge
libthrift                 0.18.0               h16802d8_0    conda-forge
libtiff                   4.5.0                hee9004a_2    conda-forge
libutf8proc               2.8.0                hb7f2c08_0    conda-forge
libvorbis                 1.3.7                h046ec9c_0    conda-forge
libwebp-base              1.2.4                h775f41a_0    conda-forge
libxcb                    1.13              h0d85af4_1004    conda-forge
libzlib                   1.2.13               hfd90126_4    conda-forge
llvm-openmp               15.0.7               h61d9ccf_0    conda-forge
lz4-c                     1.9.4                hf0c8a7f_0    conda-forge
markupsafe                2.1.2           py310h90acd4f_0    conda-forge
matplotlib                3.7.0           py310h2ec42d9_0    conda-forge
matplotlib-base           3.7.0           py310he725631_0    conda-forge
matplotlib-inline         0.1.6              pyhd8ed1ab_0    conda-forge
mccabe                    0.7.0                    pypi_0    pypi
mistune                   2.0.5              pyhd8ed1ab_0    conda-forge
mpmath                    1.2.1              pyhd8ed1ab_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mysql-common              8.0.32               hc4b2c72_0    conda-forge
mysql-libs                8.0.32               h8658499_0    conda-forge
nbclassic                 0.5.2              pyhd8ed1ab_0    conda-forge
nbclient                  0.7.2              pyhd8ed1ab_0    conda-forge
nbconvert                 7.2.9              pyhd8ed1ab_0    conda-forge
nbconvert-core            7.2.9              pyhd8ed1ab_0    conda-forge
nbconvert-pandoc          7.2.9              pyhd8ed1ab_0    conda-forge
nbformat                  5.7.3              pyhd8ed1ab_0    conda-forge
nbsphinx                  0.9.1                    pypi_0    pypi
ncurses                   6.3                  h96cf925_1    conda-forge
nest-asyncio              1.5.6              pyhd8ed1ab_0    conda-forge
nodeenv                   1.7.0                    pypi_0    pypi
notebook                  6.5.2              pyha770c72_1    conda-forge
notebook-shim             0.2.2              pyhd8ed1ab_0    conda-forge
nspr                      4.35                 hea0b92c_0    conda-forge
nss                       3.88                 h78b00b3_0    conda-forge
numpy                     1.24.2          py310h788a5b3_0    conda-forge
openjpeg                  2.5.0                h13ac156_2    conda-forge
openssl                   3.1.0                h8a1eda9_3    conda-forge
orc                       1.8.2                ha9d861c_2    conda-forge
packaging                 23.0               pyhd8ed1ab_0    conda-forge
pandas                    1.5.3           py310hecf8f37_0    conda-forge
pandoc                    2.19.2               h694c41f_1    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
param                     1.12.3             pyh1a96a4e_0    conda-forge
paramiko                  3.1.0              pyhd8ed1ab_0    conda-forge
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.8.3              pyhd8ed1ab_0    conda-forge
pathspec                  0.11.1             pyhd8ed1ab_0    conda-forge
pcre2                     10.40                h1c4e4bc_0    conda-forge
perl                      5.32.1          2_h0d85af4_perl5    conda-forge
pexpect                   4.8.0              pyh1a96a4e_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    9.4.0           py310h306a057_1    conda-forge
pip                       23.0.1             pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_0    conda-forge
platformdirs              3.1.0              pyhd8ed1ab_0    conda-forge
pluggy                    1.0.0                    pypi_0    pypi
plumbum                   1.8.1              pyhd8ed1ab_0    conda-forge
ply                       3.11                       py_1    conda-forge
pooch                     1.7.0              pyha770c72_3    conda-forge
pre-commit                3.2.2                    pypi_0    pypi
prometheus_client         0.16.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.38             pyha770c72_0    conda-forge
prompt_toolkit            3.0.38               hd8ed1ab_0    conda-forge
psutil                    5.9.4           py310h90acd4f_0    conda-forge
pthread-stubs             0.4               hc929b4f_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
pyarrow                   11.0.0          py310h435aefc_5_cpu    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pyct                      0.4.6                      py_0    conda-forge
pyct-core                 0.4.6                      py_0    conda-forge
pydantic                  1.10.7          py310h90acd4f_0    conda-forge
pyerfa                    2.0.0.1         py310h936d966_3    conda-forge
pygments                  2.14.0             pyhd8ed1ab_0    conda-forge
pylint                    2.17.2                   pypi_0    pypi
pynacl                    1.5.0           py310h90acd4f_2    conda-forge
pyopenssl                 23.0.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pyqt                      5.15.7          py310hdd03f62_3    conda-forge
pyqt5-sip                 12.11.0         py310h415000c_3    conda-forge
pyrsistent                0.19.3          py310h90acd4f_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytest                    7.3.0                    pypi_0    pypi
pytest-cov                4.0.0                    pypi_0    pypi
python                    3.10.9          he7542f4_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.16.3             pyhd8ed1ab_0    conda-forge
python-json-logger        2.0.7              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    3_cp310    conda-forge
pytz                      2022.7.1           pyhd8ed1ab_0    conda-forge
pywin32-on-windows        0.1.0              pyh1179c8e_3    conda-forge
pyyaml                    6.0             py310h90acd4f_5    conda-forge
pyyaml-include            1.3                pyhd8ed1ab_0    conda-forge
pyzmq                     25.0.0          py310hf615a82_0    conda-forge
qt-main                   5.15.8               h1d3b3f8_6    conda-forge
qtconsole                 5.4.0              pyhd8ed1ab_0    conda-forge
qtconsole-base            5.4.0              pyha770c72_0    conda-forge
qtpy                      2.3.0              pyhd8ed1ab_0    conda-forge
questionary               1.10.0             pyhd8ed1ab_1    conda-forge
re2                       2023.02.01           hf0c8a7f_0    conda-forge
readline                  8.1.2                h3899abd_0    conda-forge
requests                  2.28.2             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rfc3986-validator         0.1.1              pyh9f0ad1d_0    conda-forge
scipy                     1.10.1          py310h240c617_0    conda-forge
semantic_version          2.10.0             pyhd8ed1ab_0    conda-forge
send2trash                1.8.0              pyhd8ed1ab_0    conda-forge
setuptools                67.4.0             pyhd8ed1ab_0    conda-forge
sip                       6.7.7           py310h7a76584_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.1.9                h225ccf5_2    conda-forge
sniffio                   1.3.0              pyhd8ed1ab_0    conda-forge
snowballstemmer           2.2.0                    pypi_0    pypi
soap-gpu-rassine          0.1.dev0                 pypi_0    pypi
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.3.2.post1        pyhd8ed1ab_0    conda-forge
sphinx                    6.1.3                    pypi_0    pypi
sphinx-autoapi            2.0.1                    pypi_0    pypi
sphinx-rtd-theme          1.2.0                    pypi_0    pypi
sphinxcontrib-applehelp   1.0.4                    pypi_0    pypi
sphinxcontrib-devhelp     1.0.2                    pypi_0    pypi
sphinxcontrib-htmlhelp    2.0.1                    pypi_0    pypi
sphinxcontrib-jquery      4.1                      pypi_0    pypi
sphinxcontrib-jsmath      1.0.1                    pypi_0    pypi
sphinxcontrib-qthelp      1.0.3                    pypi_0    pypi
sphinxcontrib-serializinghtml 1.1.5                    pypi_0    pypi
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
terminado                 0.17.1             pyhd1c38e8_0    conda-forge
tinycss2                  1.2.1              pyhd8ed1ab_0    conda-forge
tk                        8.6.12               h5dbffcc_0    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tomli                     2.0.1                    pypi_0    pypi
tomlkit                   0.11.7                   pypi_0    pypi
tornado                   6.2             py310h90acd4f_1    conda-forge
traitlets                 5.9.0              pyhd8ed1ab_0    conda-forge
typing-extensions         4.4.0                hd8ed1ab_0    conda-forge
typing_extensions         4.4.0              pyha770c72_0    conda-forge
tzdata                    2022g                h191b570_0    conda-forge
unicodedata2              15.0.0          py310h90acd4f_0    conda-forge
unidecode                 1.3.6                    pypi_0    pypi
urllib3                   1.26.14            pyhd8ed1ab_0    conda-forge
virtualenv                20.21.0                  pypi_0    pypi
wcwidth                   0.2.6              pyhd8ed1ab_0    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          1.5.1              pyhd8ed1ab_0    conda-forge
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
widgetsnbextension        4.0.5              pyhd8ed1ab_0    conda-forge
wrapt                     1.15.0                   pypi_0    pypi
xorg-libxau               1.0.9                h35c211d_0    conda-forge
xorg-libxdmcp             1.1.3                h35c211d_0    conda-forge
xz                        5.2.6                h775f41a_0    conda-forge
yaml                      0.2.5                h0d85af4_2    conda-forge
zeromq                    4.3.4                he49afe7_1    conda-forge
zipp                      3.15.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               hfd90126_4    conda-forge
zstd                      1.5.2                hbc0c0cd_6    conda-forge

santisoler commented 1 year ago

Hi @tloredo. Thanks for opening this issue.

What Pooch is doing is downloading the HTML file that Dropbox gives you when you access https://www.dropbox.com/s/5b9m1pq5qif5obf/lambda-3923-4010-phases-100.h5?dl=0. The Download button on that page is not a regular anchor with a static link, but a dynamic button that triggers the download of the desired file.

After reading Dropbox's docs, you can force it to give you the download link by replacing the trailing dl=0 with dl=1. I just tried it out and it downloaded a binary file, which I suppose is the .h5 file you want to fetch.
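
The dl=0 to dl=1 swap can also be done programmatically. Here is a small sketch using only the standard library. The function name `force_dropbox_download` is made up for illustration, and it assumes Dropbox keeps honoring the dl query parameter as described above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_dropbox_download(url):
    """Return `url` with its query string set to dl=1.

    Hypothetical helper: any existing `dl` parameter is dropped and
    dl=1 is appended; other query parameters are preserved in order.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


shared = "https://www.dropbox.com/s/5b9m1pq5qif5obf/lambda-3923-4010-phases-100.h5?dl=0"
print(force_dropbox_download(shared))
```

The resulting strings can be used as the values of the urls dictionary passed to pooch.create.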

I'm not sure how wget is able to download the file even when you pass dl=0. Maybe it parses the URL and uses the one with dl=1 instead.

BTW, if you are going to pass custom URLs for every file in the registry, you could use an empty string for your base_url, since it won't be used at any point.

This should work:

import pooch

fetcher = pooch.create(
    path=pooch.os_cache("SOAP2-1Spot"),
    base_url="",
    registry={
        "lambda-3923-4010-phases-100.h5": None,
        "lambda-3923-6664-phases-4.h5": None,
    },
    # Now specify custom URLs for some of the files in the registry.
    urls={
        "lambda-3923-4010-phases-100.h5": "https://www.dropbox.com/s/5b9m1pq5qif5obf/lambda-3923-4010-phases-100.h5?dl=1",
        "lambda-3923-6664-phases-4.h5": "https://www.dropbox.com/s/pyeapovhk4q6az0/lambda-3923-6664-phases-4.h5?dl=1",
    },
)

# With dl=1 in the URLs, these paths should now point to the actual
# binary .h5 data files rather than a Dropbox HTML preview page:
full_spec_path = fetcher.fetch("lambda-3923-6664-phases-4.h5")
ca_spec_path = fetcher.fetch("lambda-3923-4010-phases-100.h5")

tloredo commented 1 year ago

@santisoler, thanks so much for the quick and helpful response (and the tip about the base url string)! dl=1 does solve this problem. I wonder what wget is doing, but in any case this Pooch issue is solved.

santisoler commented 1 year ago

Glad to be helpful! 🙂

tloredo commented 1 year ago

Just a further follow-up: Some poking on Stack Exchange, after getting @santisoler's solution, suggests that Dropbox recognizes some user agents and handles requests from them in special ways. See, e.g., "linux - how to download dropbox files using wget command?" on Super User and "curl - User-Agent affects Dropbox shared links download" on Stack Overflow. The lesson is that having a Dropbox URL work with some user agents doesn't mean that URL will work with others.
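
Since the same URL can return different content to different clients, a quick sanity check on what was actually downloaded may help. The sketch below tests whether a file starts with the 8-byte HDF5 format signature. The name `looks_like_hdf5` is hypothetical, and the check only looks at offset 0, so an HDF5 file that begins with a user block would be reported as not matching.

```python
# The HDF5 format signature: the first 8 bytes of a typical HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature.

    Hypothetical helper: an HTML page saved under a .h5 name (such as
    a Dropbox "can't preview" response) fails this check.
    """
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

For example, calling `looks_like_hdf5(fetcher.fetch("lambda-3923-6664-phases-4.h5"))` would be one way to catch an HTML response early, before handing the path to h5py.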