xcube-dev / xcube

xcube is a Python package for generating and exploiting data cubes powered by xarray, dask, and zarr.
https://xcube.readthedocs.io/
MIT License

where(cond, x, y) so that x or y chunks are not loaded if cond chunk is all zero or one #390

Open forman opened 3 years ago

forman commented 3 years ago

Is your feature request related to a problem? Please describe.

The xarray.where(cond, x, y) function always loads the x and y chunks, regardless of the values in the corresponding cond chunk. (This assumes that the sizes and chunks of cond, x, and y are all the same.)
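To illustrate the behaviour, here is a minimal sketch using plain dask arrays; xarray.where on dask-backed arrays is assumed to build the same kind of element-wise graph. The `counting` helper and the `loaded` list are hypothetical stand-ins for a chunk store that records each fetch.

```python
import dask.array as da
import numpy as np

loaded = []  # records which source ("x" or "y") had a chunk computed


def counting(name):
    """Return a block function that records every non-empty block it sees."""
    def load(block):
        if block.size:  # ignore zero-size arrays dask may pass for metadata inference
            loaded.append(name)
        return block
    return load


# Two chunks: the first cond chunk is all True, the second is all False.
cond = da.from_array(np.array([True, True, False, False]), chunks=2)
x = da.ones(4, chunks=2).map_blocks(counting("x"), dtype=float)
y = da.zeros(4, chunks=2).map_blocks(counting("y"), dtype=float)

result = da.where(cond, x, y).compute()

# Both chunks of x and both chunks of y were computed, although
# only one chunk of each contributes to the result.
print(loaded.count("x"), loaded.count("y"))
```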

If data cubes are opened via special chunk stores backed by some data API (such as the SentinelHubChunkStore in xcube-sh), we want to avoid fetching x or y chunks if the corresponding cond chunk consists entirely of zeros or ones.

This can drastically reduce processing time and the number of API requests, for example when only land surfaces are required from EO data that comprises mostly water or clouds. In the case of Sentinel Hub, it will also reduce costs for users.

Describe the solution you'd like

A special xcube.where(cond, x, y) that does not load an x chunk if the corresponding cond chunk is all zeros, and does not load a y chunk if the corresponding cond chunk is all ones.
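One possible way to get this behaviour is sketched below for plain dask arrays. This is not the xcube implementation: `where_skip_chunks` is a hypothetical name, and the sketch assumes that cond is cheap to evaluate, because every cond chunk is computed eagerly while the result is being assembled. Only the x and y chunks that are actually needed end up in the resulting graph.

```python
import dask.array as da
import numpy as np


def where_skip_chunks(cond, x, y):
    """Hypothetical chunk-aware where for dask arrays with identical chunking."""
    if not (cond.chunks == x.chunks == y.chunks):
        raise ValueError("cond, x and y must have the same shape and chunks")

    def build(prefix):
        # Recursively assemble the nested-list layout expected by da.block.
        if len(prefix) == cond.ndim:
            # Assumption: reading a cond chunk eagerly is acceptable.
            c = cond.blocks[prefix].compute().astype(bool)
            if c.all():
                return x.blocks[prefix]  # y chunk is never referenced
            if not c.any():
                return y.blocks[prefix]  # x chunk is never referenced
            return da.where(c, x.blocks[prefix], y.blocks[prefix])
        n = cond.numblocks[len(prefix)]
        return [build(prefix + (i,)) for i in range(n)]

    return da.block(build(()))


# Usage: three column chunks that are all True, all False, and mixed.
cond_np = np.array([[1, 1, 0, 0, 1, 0],
                    [1, 1, 0, 0, 0, 1]], dtype=bool)
x_np = np.arange(12.0).reshape(2, 6)
y_np = -x_np
cond = da.from_array(cond_np, chunks=(2, 2))
out = where_skip_chunks(cond,
                        da.from_array(x_np, chunks=(2, 2)),
                        da.from_array(y_np, chunks=(2, 2)))
```

The trade-off in this sketch is that cond is evaluated when the graph is built rather than lazily, so it fits cases such as a static land mask better than a cond that is itself expensive to fetch.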

Describe alternatives you've considered

We may also post a feature request with xarray.

Additional context

See class SentinelHubChunkStore in chunkstore.py in xcube-sh.

Related DCFS issue: https://gitext.sinergise.com/dcfs/common/-/issues/245

forman commented 3 years ago

Started https://github.com/dcs4cop/xcube/tree/forman-390-where