Open jhamman opened 4 years ago
In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
If this issue remains relevant, please comment here or remove the stale
label; otherwise it will be marked as closed automatically
This is happenning to me too. For the record, I need to use engine='pydap'
because it supports authentification to access private thredds data programmatically (by passing session
), while netCDF4
does not.
I pointed to this issue in the related (but currently badly named and stale) issue upstream.
What happened:
I'm getting a maximum recursion error when using the Pydap backend with Dask distributed. It seems the we're failing to successfully pickle the pydap backend store.
What you expected to happen:
Successful parallel loading of opendap dataset.
Minimal Complete Verifiable Example:
yields:
Killed worker on the client:
and the above mentioned recursion error on the workers:
Anything else we need to know?:
I've found this to be reproducible with a few kinds of Dask clusters. Setting
Client(processes=False)
does correct the problem at the expense of multiprocessiing.Environment:
Output of xr.show_versions()
INSTALLED VERSIONS ------------------ commit: None python: 3.8.2 | packaged by conda-forge | (default, Mar 5 2020, 16:54:44) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 19.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.15.1 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: installed h5netcdf: 0.8.0 h5py: 2.10.0 Nio: None zarr: 2.4.0 cftime: 1.1.1.2 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: 1.0.28 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.2.1 cartopy: 0.17.0 seaborn: 0.10.0 numbagg: installed setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: installed pytest: 5.4.1 IPython: 7.13.0 sphinx: 3.1.1