SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
635 stars 283 forks source link

Allow global attributes on a Mesh #6085

Open jrackham-mo opened 2 months ago

jrackham-mo commented 2 months ago

✨ Feature Request

Following the introduction of split local/global attribute handling in Iris 3.8 (PR https://github.com/SciTools/iris/pull/5152), it would be useful to have the ability to set global attributes on a Mesh. Setting an attribute on a mesh currently sets a local attribute on the mesh topology variable. Global attributes that do exist on the file (for example added via the python netcdf4 library, or the Conventions attribute that iris always adds on save) are lost on load.

Motivation

The motivation is to be able to add attributes such as history, provenance, etc. which make sense at the file level (i.e. global) rather than the variable level. Especially in the case where multiple meshes exist in a single file.

Additional context

Example current behaviour (iris 3.9):

>>> from iris.experimental.ugrid import save_mesh
>>> from iris.tests.stock.mesh import sample_mesh
>>> mesh = sample_mesh()
>>> mesh.var_name = "dynamics"
>>> mesh.attributes["history"] = "timestamp"
>>> print(mesh)
Mesh : 'dynamics'
    topology_dimension: 2
    node
        node_dimension: 'Mesh2d_node'
        node coordinates
            <AuxCoord: longitude / (degrees_east)  [...]  shape(15,)>
            <AuxCoord: latitude / (unknown)  [...]  shape(15,)>
    edge
        edge_dimension: 'Mesh2d_edge'
        edge_node_connectivity: <Connectivity: unknown / (unknown)  [...]  shape(5, 2)>
        edge coordinates
            <AuxCoord: longitude / (unknown)  [...]  shape(5,)>
            <AuxCoord: latitude / (unknown)  [...]  shape(5,)>
    face
        face_dimension: 'Mesh2d_face'
        face_node_connectivity: <Connectivity: unknown / (unknown)  [...]  shape(3, 4)>
        face coordinates
            <AuxCoord: longitude / (unknown)  [...]  shape(3,)>
            <AuxCoord: latitude / (unknown)  [...]  shape(3,)>
    var_name: 'dynamics'
    attributes:
        history  'timestamp'
>>> save_mesh(mesh, "dynamics_mesh.nc")

Inspecting the saved netCDF file:

$ ncdump -h dynamics_mesh.nc
netcdf dynamics_mesh {
dimensions:
    Mesh2d_node = 15 ;
    Mesh2d_edge = 5 ;
    Mesh2d_face = 3 ;
    dynamics_face_N_nodes = 4 ;
    dynamics_edge_N_nodes = 2 ;
variables:
    int dynamics ;
        dynamics:cf_role = "mesh_topology" ;
        dynamics:topology_dimension = 2 ;
        dynamics:history = "timestamp" ;
        dynamics:node_coordinates = "var_name latitude" ;
        dynamics:edge_coordinates = "longitude latitude_0" ;
        dynamics:face_coordinates = "longitude_0 latitude_1" ;
        dynamics:face_node_connectivity = "mesh2d_face" ;
        dynamics:edge_node_connectivity = "mesh2d_edge" ;
    int64 var_name(Mesh2d_node) ;
        var_name:units = "degrees_east" ;
        var_name:standard_name = "longitude" ;
        var_name:long_name = "long-name" ;
        var_name:a = 1LL ;
        var_name:b = "c" ;
    int64 latitude(Mesh2d_node) ;
        latitude:standard_name = "latitude" ;
    int64 longitude(Mesh2d_edge) ;
        longitude:standard_name = "longitude" ;
    int64 latitude_0(Mesh2d_edge) ;
        latitude_0:standard_name = "latitude" ;
    int64 longitude_0(Mesh2d_face) ;
        longitude_0:standard_name = "longitude" ;
    int64 latitude_1(Mesh2d_face) ;
        latitude_1:standard_name = "latitude" ;
    int64 mesh2d_face(Mesh2d_face, dynamics_face_N_nodes) ;
        mesh2d_face:cf_role = "face_node_connectivity" ;
        mesh2d_face:start_index = 0LL ;
    int64 mesh2d_edge(Mesh2d_edge, dynamics_edge_N_nodes) ;
        mesh2d_edge:cf_role = "edge_node_connectivity" ;
        mesh2d_edge:start_index = 0LL ;

// global attributes:
        :Conventions = "CF-1.7" ;
}

Desired output:

$ ncdump -h dynamics_mesh.nc
netcdf dynamics_mesh {
dimensions:
    Mesh2d_node = 15 ;
    Mesh2d_edge = 5 ;
    Mesh2d_face = 3 ;
    dynamics_face_N_nodes = 4 ;
    dynamics_edge_N_nodes = 2 ;
variables:
    int dynamics ;
        dynamics:cf_role = "mesh_topology" ;
        dynamics:topology_dimension = 2 ;
        dynamics:node_coordinates = "var_name latitude" ;
        dynamics:edge_coordinates = "longitude latitude_0" ;
        dynamics:face_coordinates = "longitude_0 latitude_1" ;
        dynamics:face_node_connectivity = "mesh2d_face" ;
        dynamics:edge_node_connectivity = "mesh2d_edge" ;
    int64 var_name(Mesh2d_node) ;
        var_name:units = "degrees_east" ;
        var_name:standard_name = "longitude" ;
        var_name:long_name = "long-name" ;
        var_name:a = 1LL ;
        var_name:b = "c" ;
    int64 latitude(Mesh2d_node) ;
        latitude:standard_name = "latitude" ;
    int64 longitude(Mesh2d_edge) ;
        longitude:standard_name = "longitude" ;
    int64 latitude_0(Mesh2d_edge) ;
        latitude_0:standard_name = "latitude" ;
    int64 longitude_0(Mesh2d_face) ;
        longitude_0:standard_name = "longitude" ;
    int64 latitude_1(Mesh2d_face) ;
        latitude_1:standard_name = "latitude" ;
    int64 mesh2d_face(Mesh2d_face, dynamics_face_N_nodes) ;
        mesh2d_face:cf_role = "face_node_connectivity" ;
        mesh2d_face:start_index = 0LL ;
    int64 mesh2d_edge(Mesh2d_edge, dynamics_edge_N_nodes) ;
        mesh2d_edge:cf_role = "edge_node_connectivity" ;
        mesh2d_edge:start_index = 0LL ;

// global attributes:
        :Conventions = "CF-1.7" ;
                :history = "timestamp" ;
}

Interestingly, a global Conventions attribute is already added on save.

pp-mo commented 2 months ago

Thanks for raising this @jrackham-mo and stating the case clearly.

Unfortunately I'm not going to be around for the next 2 weeks. But we had an offline discussion on this, so I'll try to record some of that (see below). I'd hope this can be discussed at the Peloton meeting next week. (10pm Weds). If interested, you could attend to discuss -- probably ask @HGWright who is running next week I think.

pp-mo commented 2 months ago

Regarding the "Conventions" attribute, it's hard-wired in our save code at present, though you can "monkey patch" what it will write by assigning iris.fileformats.netcdf.saver.CF_CONVENTIONS_VERSION. (or post-modify with ncdata, of course, which allows to pretty much any way to modify netcdf save output)

Personally I would be open to discussion on how to better manage this in Iris, as it really could do with more flexibility. I think the ESMValTool devs are interested in that too (e.g. https://github.com/SciTools/iris/issues/5255).

pp-mo commented 2 months ago

So, regarding the specific proposal, my initial feeling is that I'm reluctant to supporting global attributes on a mesh. But I'm wondering (hoping) maybe that is proposing a solution in advance of stating the requirement...

So, I think the requirement is to handle global (i.e. file-level) attributes when reading and writing meshes to files.
I think that could be better supported by adding keywords to the load_mesh/load_meshes/save_mesh functions.

Why not global attributes on a Mesh? To have global attributes on meshes would mean the Mesh class has a split-attribute dictionary, i.e. a CubeAttrsDict instead of a LimitedAttributeDict. This is obviously possible (though calling it CubeAttrsDict is then unfortunate!). But we would need to be careful about what happens if the multiple saved global attributes disagree, in which case I guess we should 'demote' them to local attributes as we do for cubes. I'm also concerned about what we should do when saving cubes with meshes, if the meshes can also contain global attributes. So just conceptually, I think this could be a bit of a can of worms. I also think there's a good reason we didn't extend iris.load/iris.save to take Meshes as well as cubes : Meshes, as Iris sees it, are essentially subsidiary content to cubes -- so is everything. That is "the CF way", however peculiar it often seems. ( TBH if Iris had a proper Dataset concept, it would be so much clearer ! but it is "less CF" )

Whereas, if we just support loading and saving meshes with a keyword to manage global-attributes in a dictionary, I think that would be clear and simple. So I would suggest usages like :

( In the second case, my suggestion is that a passed dictionary receives the attributes.
A return_attrs=False key would also work, but I personally don't much like variable return types. )

@jrackham-mo would this approach meet your needs, and if not what does this miss ?