Closed peterroelants closed 3 years ago
Datashader supports Dask DataFrames, not Dask Arrays; for n-dimensional arrays use an Xarray DataArray instead. I don't think it would be difficult to support a Dask Array, but for all the use cases we've contemplated a DataArray (backed by Dask) is strictly a superset of a Dask Array, and should have the same performance.
I don't think I have been clear enough. I'm using an Xarray DataArray that happens to have a Dask array as its data. This combination is claimed to be supported by https://datashader.org/user_guide/Performance.html (Xarray+DaskArray).
I'm getting the error on a Dask Array since `utils.orient_array` extracts the Dask data array via `.data` from the xarray raster. This Dask array is then further passed to `eq_hist`.
I think I can fix this issue by forcing a `compute()` when the data object is a Dask array in `transfer_functions._interpolate`, just after `orient_array` returns.
I can try a PR if you think this would be a good first stab at the issue?
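The proposed fix can be sketched roughly as follows. This is only an illustration and not the code from any Datashader PR: `ensure_ndarray`, `strict_histogram`, and `LazyArray` are hypothetical names, the `TypeError` check is an assumed stand-in for what `eq_hist` does, and the lazy-array test is duck-typed on `.compute()` so the example runs with NumPy alone.

```python
import numpy as np


def ensure_ndarray(data):
    """Return `data` as a NumPy array, computing it first if it is lazy.

    Hypothetical helper: anything with a callable `.compute()` (as Dask
    arrays have) is treated as lazy. Duck-typed stand-in, not Datashader code.
    """
    compute = getattr(data, "compute", None)
    if callable(compute):
        data = compute()
    return np.asarray(data)


def strict_histogram(data, nbins=4):
    """Stand-in for an ndarray-only routine such as `eq_hist` (assumed behaviour)."""
    if not isinstance(data, np.ndarray):
        raise TypeError("data must be an ndarray")
    # Boolean indexing flattens the array and drops non-finite values.
    counts, _ = np.histogram(data[np.isfinite(data)], bins=nbins)
    return counts


class LazyArray:
    """Tiny stand-in for a Dask array so the sketch runs without Dask."""

    def __init__(self, values):
        self._values = values

    def compute(self):
        return np.asarray(self._values, dtype=float)


lazy = LazyArray([[0.0, 1.0], [2.0, 3.0]])

# Passing `lazy` directly would raise TypeError; materialising it first works.
counts = strict_histogram(ensure_ndarray(lazy))
```

The trade-off is that the guard materialises the whole array in memory, which should be acceptable for an aggregate that has already been reduced to canvas size.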
Ah, I see. Looking at your gist, yes, calling `.compute()` before `shade()` will fix it:
cvs = datashader.Canvas(plot_width=900, plot_height=400)
agg = cvs.raster(data_da, agg=datashader.reductions.mean('z'))
img = datashader.transfer_functions.shade(agg.compute())
img
So yes, it would be great to see a PR to `shade()` to call `.compute()` first if it sees a Dask Array-backed DataArray. In the meantime, just call `.compute()` before calling `shade()`.
I tried a first stab at fixing this issue: https://github.com/holoviz/datashader/pull/971
Would love some help getting this in if you think it's ok.
Thanks for the fix!
Thanks for merging this in!
I might also have a look at the QuadMesh soon. Last time I checked, it tried to load my whole Dask array into memory. Are you aware of any issues there that I could look into?
QuadMesh should support Dask-backed Xarray quadmeshes properly since version 0.11.0 (see https://github.com/holoviz/datashader/pull/885), and I don't know of any regressions introduced in 0.11.1 or in master. So the first step would be to make a reproducible example of any bug or problem, and we can go from there. Thanks!
I created an example of what I meant and filed an issue at https://github.com/holoviz/datashader/issues/972 .
When trying to visualize a Dask array with `raster` and `shade` I get a `TypeError: data must be an ndarray` error in the `eq_hist` method. From the Datashader documentation I expected Dask arrays to be supported in Datashader.
I have provided a minimal notebook at https://gist.github.com/peterroelants/1d77e09bd05cc55c240bc11983e2a0c4 to reproduce the error.
ALL software version info
Description of expected behavior and the observed behavior
I expect `shade` to be able to visualise a Dask array.
Complete, minimal, self-contained example code that reproduces the issue
https://gist.github.com/peterroelants/1d77e09bd05cc55c240bc11983e2a0c4
Stack traceback and/or browser JavaScript console output
Where `data` is a `dask.array.core.Array`.