Open aulemahal opened 4 years ago
Hi @aulemahal, that sounds like a really nice approach. I didn't realise xarray
had a formal way of doing this. I will discuss with colleagues and get back to you. Thanks
@agstephens It would be great to address this issue in our upcoming meeting. I'd love to see this as an optional way of calling clisops, maybe with a few goodies enabled?
Yes @Zeitsperre, let's talk about this. It seems straightforward.
This would, indeed, be easy to create:
    import os

    import clisops
    import clisops.core.subset
    import xarray as xr


    @xr.register_dataset_accessor("cso")
    class ClisopsCoreWrapper(object):
        def __init__(self, xarray_obj):
            self._obj = xarray_obj

        @property
        def version(self):
            return clisops.__version__

        def subset_time(self, *args, **kwargs):
            return clisops.core.subset.subset_time(self._obj, *args, **kwargs)


    def test_cso():
        dr = '/badc/cmip6/data/CMIP6/ScenarioMIP/DKRZ/MPI-ESM1-2-HR/ssp126/r1i1p1f1/Amon/tas/gn/v20190710'
        fpath = os.path.join(dr, os.listdir(dr)[0])
        ds = xr.open_dataset(fpath)
        print(ds.cso.version)
        print(ds.cso.subset_time(start_date='2016-01-16', end_date='2016-12-16'))


    test_cso()
This could be maintained automatically by picking up the list of public functions from the relevant modules and creating a wrapper for each:
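    import functools

    import clisops
    import clisops.core.subset
    import xarray as xr


    @xr.register_dataset_accessor("cso")
    class ClisopsCoreWrapper(object):
        def __init__(self, xarray_obj):
            self._obj = xarray_obj
            # Expose every public function of clisops.core.subset as a method.
            for funcname in clisops.core.subset.__all__:
                func = getattr(clisops.core.subset, funcname)
                # partial binds this func now; a bare lambda in a loop would
                # late-bind and always call the last function in the list.
                setattr(self, funcname, functools.partial(func, self._obj))

        @property
        def version(self):
            return clisops.__version__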
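One caveat when generating wrappers in a loop like this: Python closures look up loop variables when they are called, not when they are defined, so each wrapper has to bind its own function at creation time. Below is a minimal sketch of the difference. It uses stand-in functions rather than clisops or xarray; `fake_subset`, `EXPORTED`, `LateBinding` and `EarlyBinding` are hypothetical names made up for this illustration.

```python
import functools
import types

# Stand-in for a module such as clisops.core.subset (hypothetical functions).
fake_subset = types.SimpleNamespace(
    subset_time=lambda obj, **kw: ("time", obj, kw),
    subset_bbox=lambda obj, **kw: ("bbox", obj, kw),
)
EXPORTED = ["subset_time", "subset_bbox"]  # plays the role of __all__


class LateBinding:
    def __init__(self, obj):
        self._obj = obj
        for name in EXPORTED:
            func = getattr(fake_subset, name)
            # Every lambda reads `func` when called, i.e. after the loop ended.
            setattr(self, name, lambda *a, **kw: func(self._obj, *a, **kw))


class EarlyBinding:
    def __init__(self, obj):
        self._obj = obj
        for name in EXPORTED:
            func = getattr(fake_subset, name)
            # partial stores the current `func` and the wrapped object right away.
            setattr(self, name, functools.partial(func, self._obj))


late_result = LateBinding("ds").subset_time()[0]    # "bbox": wrong function called
early_result = EarlyBinding("ds").subset_time()[0]  # "time": the intended function
print(late_result, early_result)
```

A default argument (`lambda *a, _f=func, **kw: _f(...)`) would work as well; `functools.partial` is used here only because it keeps the binding explicit.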
So it looks very easy to do. The question is: should this be the public API that we expose?
Any thoughts: @huard @Zeitsperre @cehbrecht @ellesmith88 ?
I don't have a strong opinion either way. I certainly think it's worth experimenting with.
I realize that this is still in the planning stage, but it looks like rioxarray is slated to become a back-end engine for xarray. Something to keep an eye on: https://github.com/pydata/xarray/issues/4697
Packages like rioxarray or hvplot provide an xarray extension so their methods can be called directly on the dataset. Would that be wanted with clisops? Example: instead of a function call that takes the dataset as an argument, one could use a method call on the dataset's "cso" attribute, where "cso" is the xarray extension added by clisops.

Personally, I like this approach as it looks more elegant and xarray-esque. Moreover, it could allow for dataset-related lookups like CRS info in metadata, or using something like rioxarray's ds.rio.set_spatial_dims to solve the problem of #32. Implementation-wise, it shouldn't be complicated and wouldn't change the rest of the API; it would simply add another access mechanism. And I believe it would make clisops more attractive to xarray users!

As a heavy user of the almost-extinct xclim.subset, I can offer some time on this implementation, if it is wanted.
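To make the two calling styles concrete, here is a minimal sketch on an in-memory dataset. It does not use clisops: the module-level `subset_time` below is a stand-in written for this example (a plain `.sel` on the time axis), and the accessor name `cso_demo` is hypothetical, chosen so it cannot clash with a real "cso" accessor. Only `xr.register_dataset_accessor` and `Dataset.sel` are actual xarray API.

```python
import numpy as np
import pandas as pd
import xarray as xr


def subset_time(ds, start_date=None, end_date=None):
    """Stand-in for a library function that takes the dataset as first argument."""
    return ds.sel(time=slice(start_date, end_date))


@xr.register_dataset_accessor("cso_demo")  # hypothetical accessor name
class DemoAccessor:
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def subset_time(self, **kwargs):
        # Delegate to the functional API, passing the wrapped dataset.
        return subset_time(self._obj, **kwargs)


# One year of daily dummy data (2016 is a leap year, hence 366 steps).
times = pd.date_range("2016-01-01", periods=366, freq="D")
ds = xr.Dataset({"tas": ("time", np.arange(366.0))}, coords={"time": times})

# Functional style: the dataset is an argument.
functional = subset_time(ds, start_date="2016-03-01", end_date="2016-03-31")

# Accessor style: the operation is reached through the dataset itself.
via_accessor = ds.cso_demo.subset_time(start_date="2016-03-01", end_date="2016-03-31")
```

Both calls return the same subset; the accessor only changes how the function is reached, which is why it can sit alongside the existing API without replacing it.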