Operations should accept either an xarray.Dataset or an emsarray.formats.Format instance. Both of these types have enough information to access the other.
Rationale
Sometimes, the correct Format to use is difficult to autodetect, or developers want to customise the Format for a specific use case. The dataset.ems attribute does not allow developers to customise the Format being used, and does not allow assignment. Always passing a Dataset does not allow developers to use customised Format instances.
Implementation
A Format holds a reference to its associated Dataset, and a Dataset has the dataset.ems attribute. A new utility function should be added:
from typing import Tuple, Union

import xarray as xr
from emsarray.formats import Format

DatasetOrFormat = Union[xr.Dataset, Format]

def dataset_and_format(df: DatasetOrFormat) -> Tuple[xr.Dataset, Format]:
    # A Dataset reaches its Format through the dataset.ems attribute.
    if isinstance(df, xr.Dataset):
        return df, df.ems
    # A Format holds a reference to its associated Dataset.
    if isinstance(df, Format):
        return df.dataset, df
    raise TypeError(f"Unknown argument type: {type(df)!r}")
Operations would be written as:
from emsarray.utils import DatasetOrFormat, dataset_and_format

def foo_operation(df: DatasetOrFormat, ...):
    dataset, format = dataset_and_format(df)
    ...
Internally, any time an operation is called, we should pass the Format instance if one is already present (i.e. pass self instead of self.dataset from Format methods), to ensure custom Formats are respected.
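To illustrate why passing self matters, here is a minimal, self-contained sketch. It does not use xarray or emsarray: the Dataset, Format, and CustomFormat classes are hypothetical stand-ins, and describe_operation is an invented example operation. The ems property on the stand-in Dataset is assumed to always return the default Format, which mimics autodetection.

```python
from typing import Tuple, Union


class Dataset:
    """Hypothetical stand-in for xarray.Dataset, for illustration only."""

    def __init__(self, name: str) -> None:
        self.name = name

    @property
    def ems(self) -> "Format":
        # Mimics autodetection: always builds the default Format.
        return Format(self)


class Format:
    """Hypothetical stand-in for emsarray.formats.Format."""

    label = "autodetected"

    def __init__(self, dataset: Dataset) -> None:
        self.dataset = dataset

    def describe(self) -> str:
        # Pass self, not self.dataset, so a customised Format survives.
        return describe_operation(self)

    def describe_lossy(self) -> str:
        # Passing self.dataset falls back to autodetection via dataset.ems.
        return describe_operation(self.dataset)


class CustomFormat(Format):
    """A developer-customised Format (hypothetical)."""

    label = "custom"


DatasetOrFormat = Union[Dataset, Format]


def dataset_and_format(df: DatasetOrFormat) -> Tuple[Dataset, Format]:
    if isinstance(df, Dataset):
        return df, df.ems
    if isinstance(df, Format):
        return df.dataset, df
    raise TypeError(f"Unknown argument type: {type(df)!r}")


def describe_operation(df: DatasetOrFormat) -> str:
    """Example operation that accepts either a Dataset or a Format."""
    dataset, fmt = dataset_and_format(df)
    return f"{dataset.name}: {fmt.label}"
```

In this sketch, CustomFormat(Dataset("example")).describe() reports the custom Format, while describe_lossy() silently reverts to the autodetected one, which is the failure the proposal is meant to avoid.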
This proposal is no longer required. A specific Convention class can be bound to a dataset, overriding the autodetection that normally happens with dataset.ems.