SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Unexpected behaviour when merging cubelist with constant NaN-valued scalar coord #4681

Open btrotta-bom opened 2 years ago

btrotta-bom commented 2 years ago

Merge seems to treat scalar coordinates differently when they have value NaN. I am trying to merge a list of cubes, each of which has a scalar coordinate with the same value. If this value is not NaN, the merge works as expected: the resulting cube has a scalar coordinate identical to the one in the input cubes. But if the value is NaN, this coordinate becomes 2-dimensional in the result.

Example below:

import iris
import iris.cube
import numpy as np
from iris.coords import DimCoord, AuxCoord

data = np.random.random((3, 3))
dim_coords_and_dims = [(DimCoord(np.arange(3), long_name="coord1"), 0)]

# aux_coord is constant non-nan value, merged cube contains aux_coord as a scalar coord as expected
aux_coord = AuxCoord([0], long_name="aux_coord")
cubelist = iris.cube.CubeList()
for i in range(2):
    for j in range(2):
        aux_coords_and_dims = [
            (DimCoord([i], long_name="merge_coord1"), None),
            (DimCoord([j], long_name="merge_coord2"), None),
            (aux_coord, None),
        ]
        c = iris.cube.Cube(
            data,
            dim_coords_and_dims=dim_coords_and_dims,
            aux_coords_and_dims=aux_coords_and_dims,
        )
        cubelist.append(c)

cube = cubelist.merge_cube()
print(cube)
print(cube.coord("aux_coord"))

# aux_coord is nan, merged cube contains aux_coord as a 2-dimensional auxiliary coord
aux_coord = AuxCoord([np.nan], long_name="aux_coord")
cubelist = iris.cube.CubeList()
for i in range(2):
    for j in range(2):
        aux_coords_and_dims = [
            (DimCoord([i], long_name="merge_coord1"), None),
            (DimCoord([j], long_name="merge_coord2"), None),
            (aux_coord, None),
        ]
        c = iris.cube.Cube(
            data,
            dim_coords_and_dims=dim_coords_and_dims,
            aux_coords_and_dims=aux_coords_and_dims,
        )
        cubelist.append(c)

cube = cubelist.merge_cube()
print(cube)
print(cube.coord("aux_coord"))

trexfeathers commented 2 years ago

We should consider duplicating the solution in #3283.

Is Iris' array_equal() still appropriate in the places where it is being used?
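
For context on the NaN question, here is a minimal sketch using NumPy's np.array_equal as a stand-in; it is not Iris' own array_equal(), whose behaviour is not shown in this thread. It assumes NumPy 1.19 or later, where the equal_nan keyword is available. A strict comparison treats two NaN-valued arrays as different, while a NaN-aware comparison treats them as the same:

```python
import numpy as np

# Two single-point arrays, like the points of the NaN-valued scalar coord.
a = np.array([np.nan])
b = np.array([np.nan])

# Default comparison: NaN is never equal to NaN, so the arrays differ.
strict = np.array_equal(a, b)

# NaN-aware comparison: NaNs in matching positions count as equal.
nan_aware = np.array_equal(a, b, equal_nan=True)

print(strict, nan_aware)
```

Whether the merge code reaches an array comparison at all for scalar coordinates is a separate question, which the next comment addresses.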

stephenworsley commented 2 years ago

While the problem is similar, I don't think it is due to a use of array_equal() in this case. It seems as though the problem here is due to comparisons between Cells defined here: https://github.com/SciTools/iris/blob/5eeb2c02215122c038c60e76b784cdb37f8a1d94/lib/iris/coords.py#L1349-L1373

If a Cell containing a NaN compared equal to another Cell containing a NaN, I think that would solve our problem. It ought to be possible to do this without changing Cell equality behaviour, though that might require a more extensive rewrite of the merge code.
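
The idea can be illustrated with plain Python floats. This is only a sketch under stated assumptions: nan_aware_equal is a hypothetical helper, not part of Iris, and it does not reproduce the Cell class or the changes made in #4701 or #5713. It shows why four identical NaN scalar values do not look "all the same" under ordinary equality, and how a NaN-aware comparison changes that:

```python
import math


def nan_aware_equal(a, b):
    """Hypothetical helper: like ==, except that NaN matches NaN."""
    both_float = isinstance(a, float) and isinstance(b, float)
    if both_float and math.isnan(a) and math.isnan(b):
        return True
    return a == b


# Stand-in for the scalar coord value taken from each of the four input cubes.
points = [float("nan")] * 4

# Ordinary equality: NaN != NaN, so the values appear to differ between cubes.
all_same_strict = all(p == points[0] for p in points)

# NaN-aware equality: the values are recognised as one constant.
all_same_nan_aware = all(nan_aware_equal(p, points[0]) for p in points)

print(all_same_strict, all_same_nan_aware)
```

If a merge step decides whether a coordinate is constant across inputs using a check like the first one, a NaN-valued coordinate would look as if it varies, which is consistent with the 2-dimensional result reported above.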

stephenworsley commented 2 years ago

Due to changes in Python 3.10, the approach I took to fix this no longer seems viable. Until we come up with a better solution, #4701 has been reverted so this issue is reopened.

trexfeathers commented 1 year ago

ℹ Feel free to ask @stephenworsley for info if you pick this up!

pp-mo commented 7 months ago

We think that #5713 should now have resolved this. Do you agree, @btrotta-bom? If so, we can close this issue.