quantile doesn't work for dask-powered xarrays

martindurant commented 5 years ago

The code currently uses quantiles to set the colour-map limits, unless the user gives them explicitly. If the data is backed by dask arrays, however, the quantile call just raises an error. I suppose we should explicitly .compute() the first slice in this case, before attempting the quantile, and calculate min/max when doing the whole data-set.
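A minimal sketch of that idea, using a synthetic dask-backed DataArray rather than real data. The dimension names, chunk sizes, and the 0.02/0.98 quantile levels are assumptions made for illustration and are not taken from the xrviz code.

```python
import numpy as np
import dask.array as da
import xarray as xr

# Synthetic stand-in for a dask-backed variable (shape and chunks are arbitrary).
values = np.random.default_rng(0).normal(size=(4, 5, 6))
arr = xr.DataArray(
    da.from_array(values, chunks=(1, 5, 6)),
    dims=("time", "lat", "lon"),
)

# Single slice: load only that slice into memory, then take the quantiles
# on the resulting numpy-backed DataArray.
first = arr.isel(time=0).compute()
vmin, vmax = first.quantile([0.02, 0.98]).values

# Whole data-set: min and max are plain reductions that dask evaluates
# chunk by chunk, so the full array never has to fit in memory at once.
gmin = float(arr.min().compute())
gmax = float(arr.max().compute())
```

Only the selected slice is materialised for the quantile step, so memory use is bounded by the size of one slice.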

martindurant commented 5 years ago

Note that da.percentile is implemented and, when using the tdigest approximate method, is pretty efficient. It only works on 1-d arrays, but reshaping should be cheap in this case. @hdsingh, your thoughts here? This is critical, because it prevents visualisation of dask xarrays, i.e., anything bigger than memory.

hdsingh commented 5 years ago

Adding .compute() before quantile solves this issue. Please refer to https://github.com/intake/xrviz/pull/37.

hdsingh commented 5 years ago

I am trying to use da.percentile after reshaping the xarray object, but I am running into the following errors:

import xarray as xr
import dask.array  # `import dask` alone does not guarantee the array submodule is loaded

ds = xr.tutorial.open_dataset('air_temperature',
                              chunks={'lat': 25, 'lon': 25, 'time': 10})

dask.array.percentile(ds.air, 10)  # Error: Percentiles only implemented for 1-d arrays
dask.array.percentile(ds.lon, 10)  # Error: Percentiles only implemented for 1-d arrays
# (although lon is 1-d)

# Convert n-d to 1-d.
# Since xarray has no `reshape` method, I am using `stack`.
stacked_air = ds.stack(air=('lat', 'lon', 'time'))
dask.array.percentile(stacked_air, 10)  # Error: 'Dataset' object has no attribute 'ndim'

@martindurant Can you please help me figure out the correct way to reshape xarrays and compute quantiles on them?

hdsingh commented 5 years ago

Thanks! I got your reply on Dask gitter:

dask.array.percentile(ds.air.data.ravel(), 10).compute()
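For reference, here is a rough sketch of how that call could be wrapped so it handles both dask-backed and in-memory variables. The helper name `color_limits` and the default percentiles (2, 98) are hypothetical and not part of xrviz. The data is synthetic so the snippet runs without downloading the tutorial dataset.

```python
import numpy as np
import dask.array as da
import xarray as xr


def color_limits(arr, q=(2, 98)):
    """Hypothetical helper: return (low, high) percentiles of a DataArray."""
    data = arr.data  # the underlying dask or numpy array, not the DataArray
    if isinstance(data, da.Array):
        # da.percentile only accepts 1-d input, so flatten the lazy array first.
        # The result is approximate: per-chunk percentiles are merged.
        low, high = da.percentile(data.ravel(), list(q)).compute()
    else:
        # np.percentile flattens n-d input itself when no axis is given.
        low, high = np.percentile(data, list(q))
    return float(low), float(high)


# Synthetic data; shape, chunks, and dimension names are arbitrary.
values = np.arange(120, dtype="float64").reshape(4, 5, 6)
dims = ("time", "lat", "lon")
lazy = xr.DataArray(da.from_array(values, chunks=(2, 5, 3)), dims=dims)
eager = xr.DataArray(values, dims=dims)

lazy_limits = color_limits(lazy)
eager_limits = color_limits(eager)
```

The key point is passing `arr.data` (the raw dask array) rather than the DataArray or Dataset itself, which is what triggered the errors above. The tdigest method mentioned earlier is left out here because it needs the optional crick package.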
