xarray-contrib / cf-xarray

an accessor for xarray objects that interprets CF attributes
https://cf-xarray.readthedocs.io/
Apache License 2.0
152 stars 39 forks source link

Cache flag-related objects and refactor extraction of flags #520

Open Descanonge opened 1 week ago

Descanonge commented 1 week ago

The dictionary of flag values/masks is cached in the CFAccessor, moving the code of create_flag_dict into a property of the accessor.

The dataset of booleans masks is cached in the CFDataArrayAccessor. I don't know if this is a welcomed change. I would argue that if it is presented as a property (.flags), I would expect it to be the same instance on different calls, contrary to a method like .flags() or .get_flags(). But I'm not very familiar with the rest of the cf-xarray API. Anyway it can be reverted easily.

I have also refactored the construction of that flag dataset to make it (hopefully) more readable and understable. It makes use of the FlagParams named-tuple. The operations on the data are unchanged.

dcherian commented 1 week ago

I have also refactored the construction of that flag dataset to make it (hopefully) more readable and understable.

Nice!

Re: API if it is useful to get flag_dict then perhaps we should create a Flags dataclass with as_dict and as_dataset methods?

Descanonge commented 1 week ago

Re: API if it is useful to get flag_dict then perhaps we should create a Flags dataclass with as_dict and as_dataset methods?

I am not sure I'm following. Something like datarray.cf.flags.as_dict ? It would regroup the code for flags, which is nice. But as_dataset needs to check that the object is an array and not a Dataset (or maybe select a valid flag variable in the Dataset if there is only one). It could essentially do create_flag_dict on init, so that would be cached, and making the flag dataset available from a method seems reasonable.