pp-mo / ncdata

Free and efficient exchange of data between netcdf files, Xarray and Iris
https://ncdata.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License

General data wrangling #57

Open pp-mo opened 5 months ago

pp-mo commented 5 months ago

Although strictly excluded as a goal for the initial release, I still think the 'secondary' usage of ncdata will be useful :

For this there is real scope for some convenience and sugar. Some ideas :

Update:

v0.1.1 delivered most of this :


For instance, some actions I needed to adjust a given file output from xarray so that Iris can correctly interpret the coord-system ...

>>> import ncdata.netcdf4
>>> from ncdata import NcAttribute
>>> from ncdata.iris import to_iris
>>> ds = ncdata.netcdf4.from_nc4(filepath)
>>> ds.variables['x'].attributes['standard_name'] = NcAttribute('standard_name', 'projection_x_coordinate')
>>> ds.variables['y'].attributes['standard_name'] = NcAttribute('standard_name', 'projection_y_coordinate')
>>> ds.variables['x'].attributes['units'] = NcAttribute('units', 'm')
>>> ds.variables['y'].attributes['units'] = NcAttribute('units', 'm')
>>> del ds.variables['spatial_ref'].attributes['spatial_ref']
>>> del ds.variables['spatial_ref'].attributes['crs_wkt']
>>> del ds.variables['spatial_ref'].attributes['horizontal_datum_name'] 
>>> cube, = to_iris(ds)
>>> print(cube.coord_system)
<bound method Cube.coord_system of <iris 'Cube' of band_data / (unknown) (band: 5; projection_y_coordinate: 6400; projection_x_coordinate: 7600)>>
>>> print(cube.coord_system())
TransverseMercator(latitude_of_projection_origin=53.5, longitude_of_central_meridian=-8.0, false_easting=200000.0, false_northing=250000.0, scale_factor_at_central_meridian=1.000035, ellipsoid=GeogCS(semi_major_axis=6377340.189, semi_minor_axis=6356034.447938534))
>>> 

So, how about

ds.variables['x'].attributes.update(NameMap(
    NcAttribute,  # type of contents
    ('standard_name', 'projection_x_coordinate'),  # *args are init arglists
    ('units', 'm')
))
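
The idea above could be sketched roughly as follows. This is only an illustration: `name_map` is a hypothetical helper (the proposed `NameMap` does not exist in ncdata), and `Attr` is a stand-in for `NcAttribute` so that the snippet runs without ncdata installed.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Attr:
    """Hypothetical stand-in for NcAttribute: just a named value."""
    name: str
    value: Any


def name_map(item_type, *arglists):
    """Hypothetical helper: build {item.name: item} from constructor arglists."""
    items = [item_type(*args) for args in arglists]
    return {item.name: item for item in items}


# A plain dict stands in for a variable's '.attributes' container.
attributes = {}
attributes.update(name_map(
    Attr,  # type of contents
    ("standard_name", "projection_x_coordinate"),  # each tuple is an init arglist
    ("units", "m"),
))
print(attributes["units"])
```

Because each key is taken from the item's own name, the name only needs to be written once per attribute.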
pp-mo commented 5 months ago

We could also maybe be strict about expected content, to avoid obvious problems ...

But this approach involves plugging every loophole through which things can be put into a container, such as 'extend', 'update', etc. That is tricky to ensure if you provide a subclass of 'dict', since you need to be sure exactly which operations have to be overridden. It is easier to be sure of completeness if you subclass collections.abc.MutableMapping (like iris CubeAttrsDict). But even then, the correctness of the solution is not obvious, and the result no longer satisfies isinstance(x, dict) and might need extra methods added.
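
To make the MutableMapping option concrete, here is a minimal sketch, not a proposal for the actual implementation. `StrictNameMap` and `Attr` are hypothetical names invented for illustration. The point is that `update` and `setdefault` are inherited mixin methods that route through `__setitem__`, so a single check covers them.

```python
from collections.abc import MutableMapping
from dataclasses import dataclass
from typing import Any


@dataclass
class Attr:
    """Hypothetical stand-in for NcAttribute: just a named value."""
    name: str
    value: Any


class StrictNameMap(MutableMapping):
    """Hypothetical strict container: each key must equal its item's name."""

    def __init__(self, item_type):
        self._item_type = item_type
        self._data = {}

    def __setitem__(self, key, item):
        # The single entry point for all insertions.
        if not isinstance(item, self._item_type):
            raise TypeError(
                f"expected {self._item_type.__name__}, got {type(item).__name__}"
            )
        if item.name != key:
            raise ValueError(f"key {key!r} does not match item name {item.name!r}")
        self._data[key] = item

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


attrs = StrictNameMap(Attr)
attrs.update({"units": Attr("units", "m")})  # mixin method, calls __setitem__
attrs.setdefault("axis", Attr("axis", "X"))  # mixin method, calls __setitem__
print(isinstance(attrs, dict))  # False: the drawback noted above
```

With a `dict` subclass, by contrast, `update` and `setdefault` would bypass an overridden `__setitem__` and each would need its own override.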

In any case, strictness + correctness is hard to maintain, since the objects are designed for free use. For example, you can write el.attributes['x'] = attr = NcAttribute('x', val), but then simply set attr.name = 'y', leaving the key and the name out of step.
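
A small sketch of that loophole, again with a hypothetical `Attr` stand-in for `NcAttribute` and a hypothetical `mismatched_keys` check (neither is part of ncdata). No check at insertion time can catch this, because the rename happens afterwards on the item itself.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Attr:
    """Hypothetical stand-in for NcAttribute: just a named value."""
    name: str
    value: Any


def mismatched_keys(mapping):
    """Hypothetical check: list the keys whose item carries a different name."""
    return [key for key, item in mapping.items() if item.name != key]


attributes = {}
attributes["x"] = attr = Attr("x", 1.0)
print(mismatched_keys(attributes))  # [] : key and name agree

attr.name = "y"  # rename the item directly, behind the container's back
print(mismatched_keys(attributes))  # ['x'] : key 'x' now holds an item named 'y'
```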

In that view, it makes sense to make it easy to do things 'right', preserving the expected. By which logic, we should provide utilities such as :


conclusion :