NCAR / geocat-comp

GeoCAT-comp provides implementations of computational functions for operating on geosciences data. Many of these functions originated in NCL and were translated into Python.
https://geocat-comp.readthedocs.io
Apache License 2.0

[💡]: Xarray Accessors #641

Closed philipc2 closed 2 months ago

philipc2 commented 2 months ago

Describe the functionality you are requesting, linking any relevant usage or previous implementations in the python ecosystem or otherwise.

The majority of functions provided by geocat-comp support xr.Dataset or xr.DataArray objects as inputs. While a procedural approach is familiar to users coming from NCL, it may be worth also supporting accessors attached directly to Xarray objects.

xCDAT does this pretty elegantly (see here)

hvPlot attaches its accessors to different objects when the corresponding module is imported

# attaches the `hvplot` accessor to xarray objects

import hvplot.xarray

xrda.hvplot()

For geocat-comp, an hvplot inspired approach could look something like the following:

import xarray as xr

# this would attach a `geocat` accessor to xarray data structures
import geocat.comp.xarray

xrds = ...

# one possible syntax
xrds['t2m'].geocat.gradient()

# another possible syntax 
xrds['t2m'].geocat.comp.gradient()

# equivalent to the current function call (assuming `import geocat.comp as gc`)
gc.gradient(xrds['t2m'])
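
For readers unfamiliar with the mechanism, here is a minimal sketch of how such an accessor could be registered using xarray's `register_dataarray_accessor` decorator. It is an illustration only: the `GeocatAccessor` class, the accessor name `geocat`, and the stand-in `gradient` function are hypothetical and are not part of geocat-comp. The stand-in simply calls `DataArray.differentiate` and does not reproduce what `gc.gradient` computes.

```python
import numpy as np
import xarray as xr


def gradient(da: xr.DataArray, dim: str) -> xr.DataArray:
    """Hypothetical stand-in for a procedural function such as gc.gradient.

    It only differentiates along one coordinate; it is NOT the real
    geocat-comp implementation.
    """
    return da.differentiate(dim)


@xr.register_dataarray_accessor("geocat")
class GeocatAccessor:
    """Hypothetical accessor that forwards to the procedural functions."""

    def __init__(self, xarray_obj: xr.DataArray):
        # xarray passes in the DataArray the accessor is attached to
        self._obj = xarray_obj

    def gradient(self, dim: str) -> xr.DataArray:
        # Thin wrapper: the procedural function stays the single implementation
        return gradient(self._obj, dim)


# Small synthetic example (made-up values, for illustration only)
t2m = xr.DataArray(
    np.array([280.0, 282.0, 286.0, 292.0]),
    coords={"lat": [0.0, 1.0, 2.0, 3.0]},
    dims="lat",
    name="t2m",
)

via_accessor = t2m.geocat.gradient("lat")  # accessor syntax
via_function = gradient(t2m, "lat")        # procedural syntax
chained_mean = t2m.geocat.gradient("lat").mean()  # chains with xarray methods
```

Because the accessor method only delegates to the function, both call styles return the same result, and the wrapper adds little logic of its own to maintain.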

Is this a request for functionality that was previously in NCL?

No

Additional context

No response

anissa111 commented 2 months ago

Thanks for the thought, Philip!

Could you explain what benefits we'd get by adding xarray accessors?

philipc2 commented 2 months ago

> Thanks for the thought, Philip!
>
> Could you explain what benefits we'd get by adding xarray accessors?

Here's a blurb from Xarray's documentation on accessors:

The intent here is that libraries that extend xarray could add such an accessor to implement subclass specific functionality rather than using actual subclasses or patching in a large number of domain specific methods. For further reading on ways to write new accessors and the philosophy behind the approach

While geocat-comp isn't intended as an Xarray extension, offering accessors as an option could make calling its functions more intuitive and allow them to be chained with Xarray methods.

# assuming a `.geocat` accessor
import geocat.comp.xarray 
import geocat.comp as gc 

xrds = ...

# with accessor 
xrds['var'].geocat.gradient().mean().plot()

# without accessor 
gc.gradient(xrds['var']).mean().plot()

An analogue would be the way NumPy handles a lot of functionality. You can call np.mean(arr) or arr.mean(), with both being valid approaches.

anissa111 commented 2 months ago

> An analogue would be the way NumPy handles a lot of functionality. You can call np.mean(arr) or arr.mean(), with both being valid approaches.

I think that this is meaningfully different than geocat-comp functionality, because numpy provides both objects (numpy arrays) and functions that act on those objects, while geocat-comp does not provide a data object, just functions that act on data objects.

While users could choose to use either method if we did implement something like you're describing, I'm not seeing any additional actual benefit that this would provide other than an additional way to call our functionality while adding additional code to maintain and a need to create and maintain additional documentation for us.

philipc2 commented 2 months ago

> > An analogue would be the way NumPy handles a lot of functionality. You can call np.mean(arr) or arr.mean(), with both being valid approaches.
>
> I think that this is meaningfully different than geocat-comp functionality, because numpy provides both objects (numpy arrays) and functions that act on those objects, while geocat-comp does not provide a data object, just functions that act on data objects.
>
> While users could choose to use either method if we did implement something like you're describing, I'm not seeing any additional actual benefit that this would provide other than an additional way to call our functionality while adding additional code to maintain and a need to create and maintain additional documentation for us.

That's understandable. Feel free to close this issue if this isn't something of interest.

erogluorhan commented 2 months ago

This is exactly what I have been thinking of for a while! A number of benefits I can immediately think of with it: