xarray-contrib / cf-xarray

an accessor for xarray objects that interprets CF attributes
https://cf-xarray.readthedocs.io/
Apache License 2.0

Assign new variables by standard names #516

Open juseg opened 3 weeks ago

juseg commented 3 weeks ago

This pull request implements ds.cf.assign(**standard_variables). The new method should:

Also needed:

I would welcome opinions on the two points below.

1. Decide about the variable short name

I propose this algorithm:

ds.cf.assign(air_temperature=DataArray(name='tas')) -> 'tas'
ds.cf.assign(air_temperature=DataArray()) -> 'air_temperature'
ds.cf.assign(
   air_temperature=DataArray(name='tas')).cf.assign(
   total_precipitation=DataArray(name='tas')) -> 'tas', 'tas_'
ds.cf.assign(
   air_temperature=DataArray(name='tas'),
   total_precipitation=DataArray(name='tas')) -> 'tas', 'tas_'
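
One possible reading of the examples above, as a minimal sketch rather than the code in this pull request: use the DataArray name when it has one, fall back to the standard name otherwise, and append an underscore on a collision. The helper name `pick_short_names` and its signature are hypothetical. Appending further underscores on repeated collisions is an assumption, since the examples show only a single clash.

```python
def pick_short_names(existing, new):
    """Choose a short variable name for each newly assigned standard name.

    existing: names already present in the dataset.
    new: mapping of standard name -> proposed short name (None if unnamed).
    Hypothetical helper, not part of cf-xarray.
    """
    taken = set(existing)
    chosen = {}
    for standard_name, proposed in new.items():
        # Prefer the DataArray's own name; otherwise use the standard name.
        name = proposed if proposed is not None else standard_name
        # Assumption: keep appending "_" until the name is free.
        while name in taken:
            name += "_"
        taken.add(name)
        chosen[standard_name] = name
    return chosen


# The chained case: 'tas' is already in the dataset when precipitation is added.
chained = pick_short_names(["tas"], {"total_precipitation": "tas"})

# The single-call case: both variables arrive with the same proposed name.
together = pick_short_names(
    [], {"air_temperature": "tas", "total_precipitation": "tas"}
)
```

Both cases resolve the clash the same way, so chaining two calls and passing both variables at once give the same pair of names.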

2. When dataset already contains standard name

ds = xr.Dataset()
ds = ds.cf.assign(air_temperature=0)
ds = ds.cf.assign(air_temperature=1)

I find it more difficult to decide what the method should do here.

Note: if we allow multiple variables with the same standard name, the resulting Dataset is technically valid, and ds.cf shows several variables associated with one standard name, while ds.cf[standard_name] fails with a KeyError.

dcherian commented 2 weeks ago

I think the cf-xarray version of this only really makes sense when the assigned name is a standard name on one of the present variables, so (2) in your listing.

For (1), we should just forward on to Xarray, as usual.

if we allow multiple variables with the same standard name, the resulting Dataset is technically valid, and ds.cf shows several variables associated with one standard name, while ds.cf[standard_name] fails with a KeyError

Yes, this is intentional. ds.cf[standard_name] will raise an error unless there is exactly one match, since that is the only way to return a DataArray. To get all matches, use ds.cf[[standard_name]]. You will then get a Dataset containing every DataArray with that standard name.