Closed JoerivanEngelen closed 5 months ago
To quickly solve this now, maybe:
def __eq__(self, other) -> bool:
if other is self:
return True
elif not instance(other, type(self)):
return False
else:
xr_self = self.to_dataset()
xr_other = other.to_dataset()
return xr_self.identical(xr_other)
This is arguably too strict (because it doesn't identify optional attributes). But better to be too strict than too accepting.
More breadcrumbs:
Being not strict potentially creates quite some headache as well, because it essentially requires an align option.
Xarray perhaps takes care of this, since we could do an outer join on the objects. However, this would then introduce a new grid object, which cannot be instantiated directly, since from_dataset
only does limited initialization of optional attributes.
Some other thoughts I had: the to_dataset
doesn't actually include all optional attributes. So it's only somewhat strict and actually quite a nice way of testing (the to_dataset decides what's essential info and what isn't). It's potentially only a bit strict on attributes; we could skip checking those via the standard xarray .equals()
method and add some code to specifically check those ourselves separately.
Regardless, almost anything is an improvement over the current implementation...
Update:
@Huite and I discussed that __eq__
could potentially behave similar as the PandasIndex
: returning an array of bools, but we are undecided if this behavior is warranted. More useful is a is_equivalent
method which tests if differently ordered topologies are equal.
The code suggested here is implemented as .equals
:
To quickly solve this now, maybe:
def __eq__(self, other) -> bool: if other is self: return True elif not instance(other, type(self)): return False else: xr_self = self.to_dataset() xr_other = other.to_dataset() return xr_self.identical(xr_other)
This is arguably too strict (because it doesn't identify optional attributes). But better to be too strict than too accepting.
Closing this issue for now. Reopen if there is a demand for __eq__
.
By default Python checks for object identity when
__eq__
is not implemented. This means that grid objects are only considered equal if they refer to the same object exactly. This results in problems if you instantiate the same grid twice, and then concatenate the result; the resulting UgridDataset then has two separate topologies with the same name. Furthermore, it's currently possible to join two different topologies in a UgridDataArray if the dimensions happen to match exactly.Moreover,
.equals
now only tests if grids share the same type, so is too permissive.Ideally, we match the behavior of xarray as much as possible. In xarray terms, the Ugrid topology is an Index. So the equals behavior should match the equals behavior of xarray Indexes. These are mostly pandas Indexes: the equals check looks like:
See: https://github.com/pydata/xarray/blob/e22b47511f4188e2203c5753de4a0a36094c2e83/xarray/core/indexes.py#L794C1-L797C72
The challenge is that the Ugrid topology object is more complicated, it has:
A reasonable assumption seems:
E.g. if derived coordinates such as centroid_coordinates are slightly dissimilar, the equals check should fail -- since that's what xarray does as well.
An overly strict implementation could be very short: simply convert to xarray via
.to_dataset()
and use xarray to compare for equality. This may be possible, depending on options available in the xarray.equals()
methods.