Handle dask arrays - Githubissues

intake / xrviz

Interactive visualisation interface for Xarrays

https://xrviz.readthedocs.io

BSD 3-Clause "New" or "Revised" License

Handle dask arrays #37

Closed hdsingh closed 4 years ago

hdsingh commented 4 years ago

We can now create plots for dask arrays.

import xarray as xr
from xrviz.dashboard import Dashboard

ds = xr.tutorial.open_dataset('air_temperature',
                              chunks={'lat': 25, 'lon': 25, 'time': -1})
dash = Dashboard(ds)
dash.show()

The above code now runs without error (on clicking PLOT).

On the master branch, the same code raises TypeError: quantile does not work for arrays stored as dask arrays. Load the data via .compute() or .load() prior to calling this method.

martindurant commented 4 years ago

This is a terrible idea! The whole idea of a dask-based xarray is that it is too big to fit into memory, and maybe needs to be processed on a cluster. .compute() brings the whole thing into memory. So it will be ok for the case that you are using only one slice, but not for the whole array. For the latter case at least, I would certainly use min/max instead of percentiles.

(note that the code should be tested anyway - I'm actually not sure whether .compute(), the dask method, is the right thing here, instead of xarray's .values).
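
To illustrate the min/max suggestion, here is a minimal sketch that works on a plain dask array rather than on the xrviz code itself. The helper name color_limits and the toy data are assumptions made for illustration. Both reductions are built lazily and evaluated together, so only the two scalar results are brought into memory, not the full array.

```python
import dask
import dask.array as da
import numpy as np


def color_limits(arr):
    """Return (vmin, vmax) for a dask array without materialising it.

    Hypothetical helper for illustration; not part of xrviz.
    """
    # Both reductions are lazy graph nodes at this point.
    lazy_min = da.nanmin(arr)
    lazy_max = da.nanmax(arr)
    # Evaluate them in one call so the chunks are read only once.
    vmin, vmax = dask.compute(lazy_min, lazy_max)
    return float(vmin), float(vmax)


# Toy stand-in for a chunked dataset variable.
data = da.from_array(np.arange(12.0).reshape(3, 4), chunks=(1, 2))
limits = color_limits(data)
```

Calling .compute() on the array itself would instead pull every chunk into memory at once, which is the case the comment above warns about.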

hdsingh commented 4 years ago

I will go through the dask docs before attempting to solve this again, to get a better understanding.

martindurant commented 4 years ago

OK, but ask questions sooner rather than later - dask is a pretty big project, and you are looking specifically at the xarray interface, which hides/augments some of the dask.array functionality.

hdsingh commented 4 years ago

Using method='tdigest' in dask.array.percentile would require crick and cython as dependencies. Should it be used?

martindurant commented 4 years ago

OK, we can do this for now. Maybe it will change later.

Note that we can test for the existence of crick.TDigest and use it, if possible. We should not require it. Also, crick does not depend on cython, only numpy ( https://github.com/conda-forge/crick-feedstock/blob/master/recipe/meta.yaml#L26 ). Cython is used during building within conda-forge (or pip, if you install from source). If you do not know, I can give a brief introduction to what cython does at our next meeting.

Please test and use crick if possible, and then we can merge this.
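
One way to treat crick as optional could look like the sketch below. The function name pick_percentile_method is hypothetical, and the fallback value 'default' is an assumption about which non-tdigest option to choose; this is not the code that was merged. The returned string is meant to be passed to dask.array.percentile through the keyword discussed above (method in this thread; the keyword name may differ between dask versions, so check the installed signature).

```python
def pick_percentile_method():
    """Return 'tdigest' when crick.TDigest is importable, else 'default'.

    Hypothetical helper for illustration; crick stays an optional
    dependency because a failed import is not an error here.
    """
    try:
        from crick import TDigest  # noqa: F401  (existence check only)
    except ImportError:
        return "default"
    return "tdigest"


chosen = pick_percentile_method()
```

With this shape, installations without crick keep working with dask's built-in percentile path, and those with crick pick up the t-digest approximation automatically.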

I would like you at some point to justify the rounding to 5 places.

hdsingh commented 4 years ago

I have made relevant changes. Please have a look.

martindurant commented 4 years ago

OK, going in when it turns green.