xarray-contrib / pint-xarray

Interface for using pint with xarray, providing convenience accessors
https://pint-xarray.readthedocs.io/en/latest/
Apache License 2.0

UserWarning when computing chunked pint arrays #116

Closed TomNicholas closed 3 years ago

TomNicholas commented 3 years ago
import xarray as xr
import pint_xarray  # registers the .pint accessor

da = xr.DataArray([1, 2, 3], dims=['x'], attrs={'units': 'metres'})

chunked = da.pint.quantify().pint.chunk(1)
# chunked2 = da.chunk(1).pint.quantify()  # also happens if I do it in this order instead

Everything looks fine here, excellent...

[Screenshot from 2021-07-01 12-31-28]

but when I go to compute then I get a UserWarning, even though it returns the correct answer:

chunked.mean().compute()
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/array/core.py:3139: 
UserWarning: Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior.
  warnings.warn(

Even if this is working fine, ideally we don't want to be showing warnings to the user.
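To check for this kind of warning programmatically (for example in a test), the warnings can be recorded instead of printed. The following is a minimal sketch using only the standard library; `warnings_raised` and `noisy` are hypothetical names introduced for illustration, and `noisy` merely stands in for a call such as `chunked.mean`.

```python
import warnings


def warnings_raised(func, *args, **kwargs):
    """Call func and return (result, list of warning messages as strings)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    return result, [str(w.message) for w in caught]


def noisy():
    # Stand-in for the real call; emits a UserWarning like dask does.
    warnings.warn("stand-in warning", UserWarning)
    return 42


result, messages = warnings_raised(noisy)
```

With the objects from this issue, `warnings_raised(chunked.mean)` would presumably return a non-empty list of messages on the affected versions.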

keewis commented 3 years ago

you don't even need the compute to get the warning:

In [3]: chunked.mean()
.../lib/python3.8/site-packages/dask/array/core.py:3113: UserWarning: Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior.
  warnings.warn(
Out[3]: 
<xarray.DataArray ()>
dask.array<mean_agg-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>

is enough, and computing returns

<xarray.DataArray ()>
<Quantity(dask.array<true_divide, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>, 'meter')>

Note that there are no units in the result of .mean(), that the return value of compute is a dask array (wrapped by pint) and that we need to compute twice to get the actual result.

In conclusion: this is a pretty serious bug (in xarray, I think?) and the warning should actually be an error in this case.

TomNicholas commented 3 years ago

Oh dear. Does the other order (da.chunk(1).pint.quantify()) behave any differently?

keewis commented 3 years ago

no, it doesn't, which is why I believe this is a bug in xarray

TomNicholas commented 3 years ago

It would be really nice to get this to work before we publish #114 (not that there is any time limit), but I have time now and am keen to help if I can. Should I re-raise this issue on xarray?

keewis commented 3 years ago

yes, that would be good.

I haven't tested xarray(pint(dask)) thoroughly yet, so I guess we can expect more to fail. I really hope pydata/xarray#4972 would have caught something like this, which I guess means I should try to finalize that as soon as possible.

TomNicholas commented 3 years ago

> Note that there are no units in the result of .mean(), that the return value of compute is a dask array (wrapped by pint) and that we need to compute twice to get the actual result.

Are we definitely seeing the same behaviour as each other? When I do print(chunked.compute()) (after chunking in either way) I get

<xarray.DataArray (dim_0: 3)>
<Quantity([1 2 3], 'meter')>
Dimensions without coordinates: dim_0

which seems right to me?

Conda env # packages in environment at /home/tegn500/miniconda3/envs/py38-mamba: # # Name Version Build Channel _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 1_gnu conda-forge alsa-lib 1.2.3 h516909a_0 conda-forge anyio 3.1.0 py38h578d9bd_0 conda-forge appdirs 1.4.4 pyh9f0ad1d_0 conda-forge argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge async_generator 1.10 py_0 conda-forge atk-1.0 2.36.0 h3371d22_4 conda-forge attrs 21.2.0 pyhd8ed1ab_0 conda-forge babel 2.9.1 pyh44b312d_0 conda-forge backcall 0.2.0 pyh9f0ad1d_0 conda-forge backports 1.0 py_2 conda-forge backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge black 21.5b0 pyhd8ed1ab_0 conda-forge bleach 3.3.0 pyh44b312d_0 conda-forge bokeh 2.3.2 py38h578d9bd_0 conda-forge bottleneck 1.3.2 py38h5c078b8_3 conda-forge brotlipy 0.7.0 py38h497a2fe_1001 conda-forge bzip2 1.0.8 h7f98852_4 conda-forge c-ares 1.17.1 h7f98852_1 conda-forge ca-certificates 2021.5.30 ha878542_0 conda-forge cairo 1.16.0 h6cf1ce9_1008 conda-forge certifi 2021.5.30 py38h578d9bd_0 conda-forge cffi 1.14.5 py38ha65f79e_0 conda-forge cfgv 3.2.0 py_0 conda-forge cftime 1.4.1 py38h5c078b8_0 conda-forge chardet 4.0.0 py38h578d9bd_1 conda-forge click 8.0.1 py38h578d9bd_0 conda-forge cloudpickle 1.6.0 py_0 conda-forge conda 4.10.1 py38h578d9bd_0 conda-forge conda-package-handling 1.7.3 py38h497a2fe_0 conda-forge cryptography 3.4.7 py38ha5dfef3_0 conda-forge curl 7.76.1 hea6ffbf_2 conda-forge cycler 0.10.0 py_2 conda-forge cytoolz 0.11.0 py38h497a2fe_3 conda-forge dask 2021.5.0 pyhd8ed1ab_0 conda-forge dask-core 2021.5.0 pyhd8ed1ab_0 conda-forge dataclasses 0.8 pyhc8e2a94_1 conda-forge dbus 1.13.6 h48d8840_2 conda-forge decorator 5.0.9 pyhd8ed1ab_0 conda-forge defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge distlib 0.3.1 pyh9f0ad1d_0 conda-forge distributed 2021.5.0 py38h578d9bd_0 conda-forge editdistance-s 1.0.0 py38h1fd1430_1 conda-forge entrypoints 0.3 py38h32f6830_1002 conda-forge expat 2.3.0 h9c3ff4c_0 conda-forge filelock 3.0.12 
pyh9f0ad1d_0 conda-forge font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge font-ttf-inconsolata 3.000 h77eed37_0 conda-forge font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge font-ttf-ubuntu 0.83 hab24e00_0 conda-forge fontconfig 2.13.1 hba837de_1005 conda-forge fonts-conda-ecosystem 1 0 conda-forge fonts-conda-forge 1 0 conda-forge freetype 2.10.4 h0708190_1 conda-forge fribidi 1.0.10 h516909a_0 conda-forge fsspec 2021.5.0 pyhd8ed1ab_0 conda-forge gdk-pixbuf 2.42.6 h04a7f16_0 conda-forge gettext 0.19.8.1 h0b5b191_1005 conda-forge giflib 5.2.1 h516909a_2 conda-forge glib 2.68.2 h9c3ff4c_0 conda-forge glib-tools 2.68.2 h9c3ff4c_0 conda-forge graphite2 1.3.13 he1b5a44_1001 conda-forge graphviz 2.47.1 h85b4f2f_1 conda-forge gst-plugins-base 1.18.4 hf529b03_2 conda-forge gstreamer 1.18.4 h76c114f_2 conda-forge gtk2 2.24.33 h539f30e_1 conda-forge gts 0.7.6 h64030ff_2 conda-forge harfbuzz 2.8.1 h83ec7ef_0 conda-forge hdf4 4.2.15 h10796ff_3 conda-forge hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge heapdict 1.0.1 py_0 conda-forge hypothesis 6.13.0 pyhd8ed1ab_0 conda-forge icu 68.1 h58526e2_0 conda-forge identify 2.2.6 pyhd8ed1ab_0 conda-forge idna 2.10 pyh9f0ad1d_0 conda-forge importlib-metadata 4.0.1 py38h578d9bd_0 conda-forge importlib_metadata 4.0.1 hd8ed1ab_0 conda-forge importlib_resources 5.2.0 pyhd8ed1ab_0 conda-forge iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge ipykernel 5.5.5 py38hd0cf306_0 conda-forge ipytest 0.9.1 pypi_0 pypi ipython 7.23.1 py38hd0cf306_0 conda-forge ipython_genutils 0.2.0 py_1 conda-forge jedi 0.18.0 py38h578d9bd_2 conda-forge jinja2 3.0.1 pyhd8ed1ab_0 conda-forge jpeg 9d h516909a_0 conda-forge json5 0.9.5 pyh9f0ad1d_0 conda-forge jsonschema 3.2.0 py38h32f6830_1 conda-forge jupyter_client 6.1.12 pyhd8ed1ab_0 conda-forge jupyter_core 4.7.1 py38h578d9bd_0 conda-forge jupyter_server 1.8.0 pyhd8ed1ab_0 conda-forge jupyterlab 3.0.16 pyhd8ed1ab_0 conda-forge jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge jupyterlab_server 2.5.2 pyhd8ed1ab_0 
conda-forge kiwisolver 1.3.1 py38h1fd1430_1 conda-forge krb5 1.19.1 hcc1bbae_0 conda-forge lcms2 2.12 hddcbb42_0 conda-forge ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge libarchive 3.5.1 h3f442fb_1 conda-forge libblas 3.9.0 9_openblas conda-forge libcblas 3.9.0 9_openblas conda-forge libclang 11.1.0 default_ha53f305_1 conda-forge libcurl 7.76.1 h2574ce0_2 conda-forge libedit 3.1.20191231 he28a2e2_2 conda-forge libev 4.33 h516909a_1 conda-forge libevent 2.1.10 hcdb4288_3 conda-forge libffi 3.3 h58526e2_2 conda-forge libgcc-ng 9.3.0 h2828fa1_19 conda-forge libgd 2.3.2 h78a0170_0 conda-forge libgfortran-ng 9.3.0 hff62375_19 conda-forge libgfortran5 9.3.0 hff62375_19 conda-forge libglib 2.68.2 h3e27bee_0 conda-forge libgomp 9.3.0 h2828fa1_19 conda-forge libiconv 1.16 h516909a_0 conda-forge liblapack 3.9.0 9_openblas conda-forge libllvm10 10.0.1 he513fc3_3 conda-forge libllvm11 11.1.0 hf817b99_2 conda-forge libnetcdf 4.8.0 nompi_hcd642e3_103 conda-forge libnghttp2 1.43.0 h812cca2_0 conda-forge libogg 1.3.4 h7f98852_1 conda-forge libopenblas 0.3.15 pthreads_h8fe5266_1 conda-forge libopus 1.3.1 h7f98852_1 conda-forge libpng 1.6.37 hed695b0_2 conda-forge libpq 13.3 hd57d9b9_0 conda-forge librsvg 2.50.5 hc3c00ef_0 conda-forge libsodium 1.0.18 h516909a_1 conda-forge libsolv 0.7.18 h780b84a_0 conda-forge libssh2 1.9.0 ha56f1ee_6 conda-forge libstdcxx-ng 9.3.0 h6de172a_19 conda-forge libtiff 4.2.0 hbd63e13_2 conda-forge libtool 2.4.6 h58526e2_1007 conda-forge libuuid 2.32.1 h14c3975_1000 conda-forge libvorbis 1.3.7 he1b5a44_0 conda-forge libwebp 1.2.0 h3452ae3_0 conda-forge libwebp-base 1.2.0 h7f98852_2 conda-forge libxcb 1.13 h7f98852_1003 conda-forge libxkbcommon 1.0.3 he3ba5ed_0 conda-forge libxml2 2.9.12 h72842e0_0 conda-forge libzip 1.7.3 he9f05b3_0 conda-forge llvmlite 0.36.0 py38h4630a5e_0 conda-forge locket 0.2.0 py_2 conda-forge lz4-c 1.9.3 h9c3ff4c_0 conda-forge lzo 2.10 h516909a_1000 conda-forge mamba 0.13.0 py38h2aa5da1_0 conda-forge markupsafe 2.0.1 
py38h497a2fe_0 conda-forge matplotlib 3.4.2 py38h578d9bd_0 conda-forge matplotlib-base 3.4.2 py38hcc49a3a_0 conda-forge matplotlib-inline 0.1.2 pyhd8ed1ab_2 conda-forge mistune 0.8.4 py38h497a2fe_1003 conda-forge more-itertools 8.7.0 pyhd8ed1ab_1 conda-forge msgpack-python 1.0.2 py38h1fd1430_1 conda-forge mypy 0.812 py38h497a2fe_2 conda-forge mypy_extensions 0.4.3 py38h578d9bd_3 conda-forge mysql-common 8.0.23 ha770c72_2 conda-forge mysql-libs 8.0.23 h935591d_2 conda-forge nbclassic 0.3.1 pyhd8ed1ab_1 conda-forge nbclient 0.5.3 pyhd8ed1ab_0 conda-forge nbconvert 6.0.7 py38h578d9bd_3 conda-forge nbformat 5.1.3 pyhd8ed1ab_0 conda-forge ncurses 6.2 h58526e2_4 conda-forge nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge netcdf4 1.5.6 nompi_py38h5e9db54_103 conda-forge nodeenv 1.6.0 pyhd8ed1ab_0 conda-forge notebook 6.4.0 pyha770c72_0 conda-forge nspr 4.30 h9c3ff4c_0 conda-forge nss 3.65 hb5efdd6_0 conda-forge numba 0.53.1 py38h0e12cce_0 conda-forge numpy 1.20.3 py38h9894fe3_0 conda-forge numpy_groupies 0.9.13 pyh9f0ad1d_1 conda-forge olefile 0.46 pyh9f0ad1d_1 conda-forge openjpeg 2.4.0 hb52868f_1 conda-forge openssl 1.1.1k h7f98852_0 conda-forge packaging 20.9 pyh44b312d_0 conda-forge pandas 1.2.4 py38h1abd341_0 conda-forge pandoc 2.13 h7f98852_0 conda-forge pandocfilters 1.4.2 py_1 conda-forge pango 1.48.5 hb8ff022_0 conda-forge parso 0.8.2 pyhd8ed1ab_0 conda-forge partd 1.2.0 pyhd8ed1ab_0 conda-forge pathspec 0.8.1 pyhd3deb0d_0 conda-forge pcre 8.44 he1b5a44_0 conda-forge pexpect 4.8.0 py38h32f6830_1 conda-forge pickleshare 0.7.5 py38h32f6830_1002 conda-forge pillow 8.2.0 py38ha0e1e83_1 conda-forge pint 0.17 pyhd8ed1ab_0 conda-forge pint-xarray 0.2 pyhd8ed1ab_0 conda-forge pip 21.1.1 pyhd8ed1ab_0 conda-forge pixman 0.40.0 h36c2ea0_0 conda-forge pluggy 0.13.1 py38h578d9bd_4 conda-forge pooch 1.4.0 pyhd8ed1ab_0 conda-forge pre-commit 2.12.1 py38h578d9bd_0 conda-forge prometheus_client 0.10.1 pyhd8ed1ab_0 conda-forge prompt-toolkit 3.0.18 pyha770c72_0 conda-forge psutil 
5.8.0 py38h497a2fe_1 conda-forge pthread-stubs 0.4 h36c2ea0_1001 conda-forge ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge py 1.10.0 pyhd3deb0d_0 conda-forge pycosat 0.6.3 py38h497a2fe_1006 conda-forge pycparser 2.20 pyh9f0ad1d_2 conda-forge pygments 2.9.0 pyhd8ed1ab_0 conda-forge pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge pyqt 5.12.3 py38h578d9bd_7 conda-forge pyqt-impl 5.12.3 py38h7400c14_7 conda-forge pyqt5-sip 4.19.18 py38h709712a_7 conda-forge pyqtchart 5.12 py38h7400c14_7 conda-forge pyqtwebengine 5.12.1 py38h7400c14_7 conda-forge pyrsistent 0.17.3 py38h497a2fe_2 conda-forge pysocks 1.7.1 py38h578d9bd_3 conda-forge pytest 6.2.4 py38h578d9bd_0 conda-forge pytest-repeat 0.9.1 pypi_0 pypi python 3.8.10 h49503c6_1_cpython conda-forge python-dateutil 2.8.1 py_0 conda-forge python-graphviz 0.16 pyh243d235_2 conda-forge python_abi 3.8 1_cp38 conda-forge pytz 2021.1 pyhd8ed1ab_0 conda-forge pyyaml 5.4.1 py38h497a2fe_0 conda-forge pyzmq 22.1.0 py38h2035c66_0 conda-forge qt 5.12.9 hda022c4_4 conda-forge readline 8.1 h46c0cb4_0 conda-forge regex 2021.4.4 py38h497a2fe_0 conda-forge reproc 14.2.1 h36c2ea0_0 conda-forge reproc-cpp 14.2.1 h58526e2_0 conda-forge requests 2.25.1 pyhd3deb0d_0 conda-forge ruamel_yaml 0.15.80 py38h497a2fe_1004 conda-forge scipy 1.6.3 py38h7b17777_0 conda-forge send2trash 1.5.0 py_0 conda-forge setuptools 49.6.0 py38h578d9bd_3 conda-forge six 1.16.0 pyh6c4a22f_0 conda-forge sniffio 1.2.0 py38h578d9bd_1 conda-forge sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge sqlite 3.35.5 h74cdb3f_0 conda-forge tblib 1.7.0 pyhd8ed1ab_0 conda-forge terminado 0.10.0 py38h578d9bd_0 conda-forge testpath 0.5.0 pyhd8ed1ab_0 conda-forge tk 8.6.10 h21135ba_1 conda-forge toml 0.10.2 pyhd8ed1ab_0 conda-forge toolz 0.11.1 py_0 conda-forge tornado 6.1 py38h497a2fe_1 conda-forge tqdm 4.60.0 pyhd8ed1ab_0 conda-forge traitlets 5.0.5 py_0 conda-forge typed-ast 1.4.3 py38h497a2fe_0 conda-forge typing-extensions 3.10.0.0 hd8ed1ab_0 
conda-forge typing_extensions 3.10.0.0 pyha770c72_0 conda-forge tzdata 2021a he74cb21_0 conda-forge urllib3 1.26.4 pyhd8ed1ab_0 conda-forge virtualenv 20.4.7 py38h578d9bd_0 conda-forge wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge webencodings 0.5.1 py_1 conda-forge websocket-client 0.57.0 py38h578d9bd_4 conda-forge wheel 0.36.2 pyhd3deb0d_0 conda-forge xarray 0.18.2 pyhd8ed1ab_0 conda-forge xhistogram 0.1.3+40.g9f20e95.dirty dev_0 xorg-kbproto 1.0.7 h14c3975_1002 conda-forge xorg-libice 1.0.10 h516909a_0 conda-forge xorg-libsm 1.2.3 hd9c2040_1000 conda-forge xorg-libx11 1.7.1 h7f98852_0 conda-forge xorg-libxau 1.0.9 h14c3975_0 conda-forge xorg-libxdmcp 1.1.3 h516909a_0 conda-forge xorg-libxext 1.3.4 h7f98852_1 conda-forge xorg-libxrender 0.9.10 h7f98852_1003 conda-forge xorg-renderproto 0.11.1 h14c3975_1002 conda-forge xorg-xextproto 7.3.0 h14c3975_1002 conda-forge xorg-xproto 7.0.31 h14c3975_1007 conda-forge xz 5.2.5 h516909a_1 conda-forge yaml 0.2.5 h516909a_0 conda-forge zeromq 4.3.4 h9c3ff4c_0 conda-forge zict 2.0.0 py_0 conda-forge zipp 3.4.1 pyhd8ed1ab_0 conda-forge zlib 1.2.11 h516909a_1010 conda-forge zstd 1.4.9 ha95c52a_0 conda-forge
keewis commented 3 years ago

it is correct and I get the same result (which means .pint.chunk does not have a bug), but chunked.mean() is definitely wrong (I checked both master and v0.2)

keewis commented 3 years ago

with .compute I meant that chunked.mean().compute().compute() is required to get the result for the mean
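A rough way to diagnose this is to count how many `.compute()` calls are needed before the object stops being lazy. The sketch below mimics dask's collection protocol, in which `__dask_graph__()` returns `None` for a non-lazy object. `is_lazy`, `count_computes` and `FakeLazy` are hypothetical names introduced for illustration, and whether a given xarray or pint object exposes `__dask_graph__` this way is an assumption that depends on the installed versions.

```python
def is_lazy(obj):
    """Return True if obj reports a dask graph via __dask_graph__()."""
    graph_method = getattr(obj, "__dask_graph__", None)
    return callable(graph_method) and graph_method() is not None


def count_computes(obj, limit=5):
    """Call .compute() until obj is no longer lazy; return (obj, calls)."""
    calls = 0
    while is_lazy(obj) and calls < limit:
        obj = obj.compute()
        calls += 1
    return obj, calls


class FakeLazy:
    """Stand-in object that needs `layers` compute calls to become concrete."""

    def __init__(self, value, layers):
        self.value = value
        self.layers = layers

    def __dask_graph__(self):
        # A non-None return value marks the object as still lazy.
        return {"layers": self.layers} if self.layers else None

    def compute(self):
        return FakeLazy(self.value, self.layers - 1)


final, calls = count_computes(FakeLazy(2.0, layers=2))
```

Here `calls` is 2, matching the double compute described above; a correctly behaving object should need only one call.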

TomNicholas commented 3 years ago

Right sorry, I had left out the call to mean.

TomNicholas commented 3 years ago

This was fixed by https://github.com/pydata/xarray/issues/5559

In [4]: da = xr.DataArray([1,2,3], dims=['x'], attrs={'units': 'metres'})

In [5]: chunked = da.pint.quantify().pint.chunk(1)

In [6]: chunked
Out[6]: 
<xarray.DataArray (x: 3)>
<Quantity(dask.array<xarray-<this-array>, shape=(3,), dtype=int64, chunksize=(1,), chunktype=numpy.ndarray>, 'meter')>
Dimensions without coordinates: x

In [7]: chunked.mean().compute()
Out[7]: 
<xarray.DataArray ()>
<Quantity(2.0, 'meter')>