Closed martindurant closed 5 years ago
Note that da.percentile is implemented and, with the approximate tdigest method, is fairly efficient. It only works on 1-d arrays, but reshaping should be cheap in this case. @hdsingh, your thoughts here? This is critical, because it prevents visualisation of dask-backed xarray objects, i.e., anything bigger than memory.
Adding .compute() before quantile solves this issue. Please refer to https://github.com/intake/xrviz/pull/37.
I am trying to use da.percentile and reshape the xarray object, but I am facing the following issue:
import xarray as xr
import dask.array

ds = xr.tutorial.open_dataset('air_temperature',
                              chunks={'lat': 25, 'lon': 25, 'time': 10})
dask.array.percentile(ds.air, 10)  # error: Percentiles only implemented for 1-d arrays
dask.array.percentile(ds.lon, 10)  # same error, although lon is 1-d

# Convert n-d to 1-d.
# Since xarray has no `reshape` method, I am using `stack`.
stacked_air = ds.stack(air=('lat', 'lon', 'time'))
dask.array.percentile(stacked_air, 10)  # error: 'Dataset' object has no attribute 'ndim'
@martindurant Can you please help me figure out the correct way to reshape xarray objects and compute quantiles on them?
Thanks! I got your reply on Dask gitter.
dask.array.percentile(ds.air.data.ravel(), 10).compute()
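The one-liner above works because .data exposes the underlying dask array and ravel() flattens it to 1-d without loading anything. Here is a minimal sketch of the same pattern that does not need the tutorial dataset; the small synthetic array and its chunking are assumptions standing in for ds.air.data.

```python
import dask.array as da
import numpy as np

# Synthetic 3-d array standing in for ds.air.data (an assumption for illustration).
values = np.arange(24.0).reshape(2, 3, 4)
lazy = da.from_array(values, chunks=(1, 3, 4))

# da.percentile only accepts 1-d input, so flatten first; ravel() stays lazy.
flat = lazy.ravel()

# Nothing is computed until .compute() is called.
p10 = da.percentile(flat, 10).compute()

# The result is a one-element array. dask merges per-chunk percentiles,
# so treat the value as an estimate rather than an exact percentile.
print(p10)
```

An estimate is usually good enough for setting colour-map limits.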
The code currently uses quantiles to set the colour-map limits, unless explicitly given by the user. If the data has dask-arrays internally, however, then you just get an error. I suppose we should explicitly .compute() the first slice in this case, before attempting the quantile, and calculate min/max when doing the whole data-set.
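A rough sketch of that idea follows. It is not the xrviz implementation: the helper name colour_limits, its first_slice_only flag, and the 0.1/0.9 quantile levels are all hypothetical, and a synthetic dask array stands in for the real data.

```python
import dask
import dask.array as da
import numpy as np


def colour_limits(data, first_slice_only=True, q=(0.1, 0.9)):
    """Hypothetical helper: return (low, high) colour-map limits for a dask array."""
    if first_slice_only:
        # Materialise only the first slice; np.asarray computes the dask array.
        first = np.asarray(data[0])
        low, high = np.nanquantile(first, q)
    else:
        # For the whole data-set, min/max are reduced chunk by chunk,
        # so the full array never has to fit in memory at once.
        low, high = dask.compute(da.nanmin(data), da.nanmax(data))
    return float(low), float(high)


# Synthetic stand-in for a larger-than-memory array (an assumption).
values = np.arange(24.0).reshape(2, 3, 4)
lazy = da.from_array(values, chunks=(1, 3, 4))

print(colour_limits(lazy))                          # quantiles of the first slice
print(colour_limits(lazy, first_slice_only=False))  # min/max of everything
```

Only the first slice is loaded eagerly, so the quantile call sees a plain NumPy array, while the whole-data-set branch stays lazy until dask.compute runs.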