pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0
N-D rolling #819

Closed forman closed 5 years ago

forman commented 8 years ago

Dear xarray Team,

We just discovered xarray and it seems to be a fantastic candidate to serve as a core library for the climate data toolbox we are about to implement. While investigating the API, we noticed that the windows kwargs in

DataArray.rolling(min_periods=None, center=False, **windows)

is limited to a single dim=window_size entry. Are there any plans to support rolling in N-D? This could be very useful for efficient gap filling, filtering or other methodologies that use grid cell neighbourhoods in multiple dimensions.

Actually, I also asked myself why the groupby and resample methods don't take an N-D dim argument. This would allow for performing not only a temporal resampling but also a spatial resampling in the lat/lon plane or even a spatio-temporal resampling (including up- and downsampling in either dim).

Anyway, thanks for xarray!

Regards Norman

jhamman commented 8 years ago

@forman -

The main reason this isn't supported yet is that we haven't implemented it yet. I recently added the Rolling object to xarray and our initial application aimed to wrap the bottleneck moving window functions. Because the bottleneck moving window functions only support one windowing axis, we have also adopted that constraint for now.

At this point, I don't know of any plans to extend this functionality although we could discuss further. The complexity of N-D rolling aggregations does increase a fair bit and we would ideally like to let a lower level package (e.g. bottleneck) handle most of that. Since you seem to have a tangible application for the N-D rolling feature, maybe this is something you want to contribute to?
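For illustration only, here is a minimal sketch of what a 2-D rolling aggregation involves, written in plain NumPy rather than xarray or bottleneck. The helper name `rolling_mean_2d` is hypothetical, and the sketch assumes NumPy 1.20 or later for `sliding_window_view`. It only returns windows that fit entirely inside the array and does not handle `min_periods` or `center`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_mean_2d(values, window):
    """Mean over every full `window`-shaped neighbourhood of a 2-D array.

    Hypothetical helper, not part of xarray or bottleneck. NaNs inside a
    window are ignored, which is the behaviour one would want for gap filling.
    """
    # Result has shape (rows - wr + 1, cols - wc + 1, wr, wc):
    # the two trailing axes hold the contents of each window.
    windows = sliding_window_view(values, window)
    return np.nanmean(windows, axis=(-2, -1))


grid = np.arange(16, dtype=float).reshape(4, 4)
grid[1, 1] = np.nan  # a gap surrounded by valid neighbours
smoothed = rolling_mean_2d(grid, (3, 3))
print(smoothed.shape)  # (2, 2)
```

The windowed view is cheap to create, but the reduction still touches every element of every window, so the cost grows with the window size. That is part of the added complexity compared with the one-axis case.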

forman commented 8 years ago

Thanks for the prompt reply!

Once we have decided to use xarray for our project(s) and have familiarized ourselves with its internals, we'll be happy to contribute and support you! Currently we all feel a bit dizzy about the many options we have and how to decide which way to go: create our own library using xarray, or build on the UK Met Office's Iris, Apache OCW, the Max Planck Institute's CDO, etc.

magonser commented 8 years ago

Hello all,

In addition to forman's comments, I would like to note that a stride would be useful for the xarray rolling operation in some use cases. I understand that no work is currently planned, but wanted to leave this remark for future contributions.

I.e.

DataArray.rolling(min_periods=None, center=False, **windows)

with

windows={dimname: (size, stride=1)}

Thanks for xarray! It's great!

shoyer commented 8 years ago

@magonser We could add support for strides, but for builtin operations like min, max, sum and mean, the implementation could not be any more efficient than doing the rolling window calculation first and then using indexing to select out the strides afterwards -- we already use a one-pass algorithm (via bottleneck) that uses each data point once.
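The "roll first, then index" approach can be sketched in plain NumPy. This is an illustration under assumptions, not bottleneck's actual implementation: the hypothetical helper `rolling_sum` uses a cumulative sum as a stand-in for a one-pass algorithm, and the `size` and `stride` values are arbitrary.

```python
import numpy as np


def rolling_sum(values, size):
    """Rolling sum over full windows, computed in one pass via a cumulative sum.

    Hypothetical stand-in for a one-pass moving-window routine.
    """
    csum = np.concatenate(([0.0], np.cumsum(values)))
    return csum[size:] - csum[:-size]


values = np.arange(10, dtype=float)
size, stride = 3, 2

# Compute every window once, then keep every `stride`-th result by indexing.
strided = rolling_sum(values, size)[::stride]

# Reference: sum only the windows that start at multiples of `stride`.
direct = np.array(
    [values[i:i + size].sum() for i in range(0, len(values) - size + 1, stride)]
)

print(strided)  # [ 3.  9. 15. 21.]
```

Both routes give the same values. In xarray the same pattern would presumably be a rolling reduction followed by a strided positional selection, for example `da.rolling(x=3).sum().isel(x=slice(None, None, 2))`, where `da` and the dimension name `x` are placeholders.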

stale[bot] commented 5 years ago

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically.