SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
635 stars, 284 forks

Performing metadata equality via different methods provides different results in Iris v3.0.1 #4506

Open ehogan opened 2 years ago

ehogan commented 2 years ago

πŸ› Bug Report

I have two cubes. When I use cube1.metadata.difference(cube2.metadata), the metadata shows differences, even though the values look the same. However, assert cube1.attributes == cube2.attributes raises no AssertionError.

I discovered this issue because:

How To Reproduce

% module load scitools
% python
>>> import iris
>>> import numpy as np
>>> cube1 = iris.cube.Cube(1, attributes={"value1": np.float64(89.5)})
>>> cube2 = iris.cube.Cube(1, attributes={"value1": 89.5})
>>> assert cube1.metadata == cube2.metadata
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
>>> cube1.metadata.difference(cube2.metadata)
CubeMetadata(standard_name=None, long_name=None, var_name=None, units=None, attributes=({'value1': 89.5}, {'value1': 89.5}), cell_methods=None)
>>> assert cube1.attributes == cube2.attributes
>>>

Expected behaviour

I'm not sure what would be best?

Either cube1.metadata.difference(cube2.metadata) shows the types in the difference, so it's clear where the difference is (because the values look the same and this took longer than I care to admit to figure out) 😝

Or assert cube1.attributes == cube2.attributes fails

Or both?

bjlittle commented 2 years ago

@ehogan Under the hood for metadata attributes dictionary comparisons we use python-xxhash (see xxHash).

The fact that you can put numpy scalars into the dictionary is the issue here. Although np.float64(89.5) is equivalent in value to 89.5, they are not the same type, and we've chosen to discriminate based on that.

This is why assert cube1.metadata == cube2.metadata fails, and the difference method is consistent with that behaviour and returns that there is indeed a difference between the attributes dictionaries.

Perhaps it might be more helpful to the user if the difference method highlighted this explicitly :thinking:

You might reasonably ask why oh why do we care about such a difference?

... one concrete example is cube arithmetic, where the metadata of two operand cubes is combined into one resultant cube. Given your example above, which version of attributes["value1"] should be used in the resultant cube's metadata? The decision is not clear, hence the difference does kinda matter.

Note that cube1.attributes == cube2.attributes is a simple Python dictionary comparison... and in this case gives a different result, because the builtin comparison being performed is different.
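The distinction can be seen directly in plain Python. Note that same_typed_items below is a hypothetical helper for illustration only, a rough sketch of the kind of type-sensitive check a hashing-based approach effectively performs; it is not Iris code:

```python
import numpy as np

a = {"value1": np.float64(89.5)}
b = {"value1": 89.5}

# A plain dict comparison uses value equality, so the numpy scalar
# compares equal to the Python float and no difference is seen:
assert a == b

def same_typed_items(d1, d2):
    # Hypothetical helper: equal only if keys, values AND value types match.
    return d1.keys() == d2.keys() and all(
        type(d1[k]) is type(d2[k]) and d1[k] == d2[k] for k in d1
    )

# A type-sensitive comparison does see a difference:
assert not same_typed_items(a, b)
```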

I can't imagine that we'll subclass dict and specialise its __eq__ behaviour to align with the metadata __eq__.

At the very least this behaviour should be highlighted and clarified... but do note that you are highlighting a special case here of using a numpy scalar array. Your example doesn't extend to non-scalar numpy arrays e.g.,

>>> import numpy as np
>>> a = dict(values=np.arange(3))
>>> b = dict(values=[0, 1, 2]) 
>>> a == b
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Also, note that metadata combine, compare and difference behaviours avoid this type of ValueError due to the approach that we've adopted for special treatment of dealing with attributes dictionaries.
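For comparing such dictionaries in user code without tripping that ValueError, one option is an elementwise comparison. attrs_equal below is a hypothetical helper, not part of the Iris API, and note that (like the plain dict ==) it is type-insensitive:

```python
import numpy as np

def attrs_equal(d1, d2):
    # Hypothetical helper (not the Iris API): compare attribute dicts,
    # tolerating array-like values by comparing them elementwise.
    if d1.keys() != d2.keys():
        return False
    return all(np.array_equal(d1[k], d2[k]) for k in d1)

a = dict(values=np.arange(3))
b = dict(values=[0, 1, 2])
print(attrs_equal(a, b))  # True: no ambiguous-truth-value ValueError
```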

bjlittle commented 2 years ago

@ehogan Note also that this change in behaviour was introduced in iris 3.0.0, and was one of the many reasons for a major release of iris i.e., it breaks backwards compatibility for some behaviours. Reference #3785

larsbarring commented 2 years ago

And here is a view from someone who all too frequently (to say the least...) has to deal with all sorts of metadata (and data) imperfections. Model data that is supposed to be one homogeneous set, but was produced in different batches, on different machines, quality controlled by different people, using different [versions of] software tools, and goodness knows what else lies behind the imperfections. In the best of worlds such problems would be non-existent, or the data simply rejected/fixed by the producers. But I am afraid that I do not live in that world.

Last year we processed something like 10,000-15,000 files from a large online database, and something like 60% had metadata (or data) imperfections. Most of these were easy to spot and could be fixed with a few steps of preprocessing. That left something like 500 files making up about 30-50 supposedly homogeneous dataset units. Homogeneous in the sense that my_cube_list.concatenate_cube() should go down just like a sip of coffee. As it was impossible to guess where these imperfections were, in which file, in which metadata attribute, or which file(s) had coordinates that differed in the LSB, trial and error was the method. Not preferred, but at hand. A lot of coffee was used before that smooth sip finally went down.

So, what conclusions to draw from this with any bearing on iris?

I think that steps in this direction will be immensely useful to everyone dealing with data from various sources.

rcomer commented 2 years ago

@larsbarring have you looked at cube_helper? I've not used it myself but think it was written for CMIP ensembles. I think we're all agreed that Iris's own merge and concatenate functions should be more lenient though.

ehogan commented 2 years ago

Thanks @bjlittle :) I'm happy with the discrimination between np.float64(89.5) and 89.5. I do think it would be super helpful if the output showed the types of the values, as it is not clear what the difference is in the output (the values look the same to me!). So instead of:

CubeMetadata(attributes=({'value1': 89.5}, {'value1': 89.5}))

could the output be:

CubeMetadata(attributes=({'value1': np.float64(89.5)}, {'value1': 89.5}))

or similar?
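A minimal sketch of the kind of formatting that would make such a type-only difference visible (typed_repr is a made-up helper for illustration, not something Iris provides):

```python
import numpy as np

def typed_repr(d):
    # Hypothetical formatting helper: render numpy scalars with their
    # type name so that type-only differences become visible.
    items = ", ".join(
        f"{k!r}: {type(v).__name__}({v})" if isinstance(v, np.generic)
        else f"{k!r}: {v!r}"
        for k, v in d.items()
    )
    return "{" + items + "}"

print(typed_repr({"value1": np.float64(89.5)}))  # {'value1': float64(89.5)}
print(typed_repr({"value1": 89.5}))              # {'value1': 89.5}
```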

Also, would it be worth adding to the documentation that assert cube1.attributes == cube2.attributes doesn't distinguish between types in the same way as cube1.metadata.difference(cube2.metadata)? If this is something you think is worth doing I'd be happy to have a go at a PR? Or, alternatively, I'd be happy to review something? :)

github-actions[bot] commented 1 year ago

In order to maintain a backlog of relevant issues, we automatically label them as stale after 500 days of inactivity.

If this issue is still important to you, then please comment on this issue and the stale label will be removed.

Otherwise this issue will be automatically closed in 28 days time.

ehogan commented 1 year ago

If this issue is still important to you, then please comment on this issue and the stale label will be removed.

This issue is still important to me 😊