opengeospatial / CRS-Gridded-Geodetic-data-eXchange-Format

Gridded Geodetic data eXchange Format

hierarchyRank redundant #24

Closed ccrook closed 2 years ago

ccrook commented 2 years ago

Currently the GGXF specification includes hierarchyRank in grid headers.

This information is redundant as the list of grids in groups or grids is an ordered list (in both YAML and NetCDF representations), so adding a hierarchyRank to define an order is unnecessary.

It adds an unnecessary extra step to software, which may then have to reorder grids to reflect this.

However, there may be a difference in how the algorithm for searching for a grid works.

In a nested grid structure the algorithm can use the first grid (or subgrid) in a list that contains a point. With a hierarchy rank it would be using the last grid (assuming hierarchy is equivalent to list order).

A simple modification is to just use the nested grid algorithm (first grid containing a point is used, then search for child grids) and discard the hierarchy rank.

If a provider chooses not to use nesting then they could instead order the grid list with the preferred grids (ie finest detail) first.

This simplifies the algorithm and specification, while still retaining the capabilities of a hierarchical grid structure.
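A minimal sketch of the search described above, to show that nested and flat (finest first) layouts can share one algorithm. The `Grid` class, its `extent` bounding box and the `find_grid` function are hypothetical names used only for illustration; they are not taken from the GGXF specification or any existing implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Grid:
    """Hypothetical stand-in for a grid: a name, a bounding box, ordered child grids."""
    name: str
    extent: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumption
    children: List["Grid"] = field(default_factory=list)

    def contains(self, x: float, y: float) -> bool:
        xmin, ymin, xmax, ymax = self.extent
        return xmin <= x <= xmax and ymin <= y <= ymax


def find_grid(grids: List[Grid], x: float, y: float) -> Optional[Grid]:
    """Return the grid to use at (x, y), relying only on list order and nesting."""
    for grid in grids:
        if grid.contains(x, y):
            # The first grid in the list containing the point wins;
            # then search its children for a grid that refines it.
            child = find_grid(grid.children, x, y)
            return child if child is not None else grid
    return None  # no grid in this list covers the point


# Nested layout: the fine grid is a child of the root grid.
nested = [Grid("root", (0, 0, 10, 10), [Grid("fine", (2, 2, 4, 4))])]

# Flat layout without nesting: the preferred (finest) grid is listed first.
flat = [Grid("fine", (2, 2, 4, 4)), Grid("root", (0, 0, 10, 10))]
```

In both layouts a point inside the fine grid resolves to the fine grid and a point outside it falls back to the root grid, without any rank attribute.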

RogerLott commented 2 years ago

@ccrook Do I understand that you are

ccrook commented 2 years ago

@RogerLott Yes - that is the suggestion. I am possibly relitigating, since I was arguing for retaining the hierarchical grid option. My thinking, though, is that now that the parent-child relationship comes from the structure of the file, it makes sense to do the same for hierarchy. It isn't changing the capability, just how it is implemented. The main change is that the grids within a list (root grids in a group or child grids of a parent) would be ordered from most preferred to least.

ccrook commented 2 years ago

In an email @RogerLott questioned whether removing hierarchyRank will work in the NetCDF implementation. I realise I am not sure it will.

Within the NetCDF structure (as I have implemented it) both groups and grids (ie grid headers) are implemented as NetCDF groups. Each NetCDF group can contain other groups. (The grid data is a NetCDF variable within the NetCDF group that represents the grid.) So the nesting structure is definitely supported by NetCDF.

It is not entirely clear from reading the NetCDF docs that the contained NetCDF subgroups have an explicit order. They are returned in the python API as an ordered list, and the order of the list seems to reflect the order in which the subgroups are created (though I haven't tested this thoroughly). So I am not 100% sure that NetCDF guarantees the implicit ranking of the set of grids of a group, or of the subgrids of a grid. That may need some investigation or a query to the NetCDF maintainers.

Note also that issue #22 is specifically related to the structure and API offered by NetCDF.

ccrook commented 2 years ago

If we do need to retain hierarchyRank I suggest that it only provides a ranking within either a list of grids in a group or the subgrids of a grid, but not a ranking of all the grids in a group. That is, the ranking cannot contradict the explicit nesting structure in the YAML or NetCDF.
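A rough sketch of what a sibling-only ranking could look like, assuming grids are held as nested dictionaries. The `gridName` and `grids` keys and the `order_siblings` function are hypothetical, and the convention that a higher hierarchyRank means a more preferred grid is also an assumption made only for this example.

```python
from typing import Any, Dict, List

GridNode = Dict[str, Any]  # hypothetical in-memory form of a grid header


def order_siblings(grids: List[GridNode]) -> List[GridNode]:
    """Order each sibling list from most to least preferred, leaving nesting untouched.

    Assumes a higher hierarchyRank means more preferred; grids without a rank
    are treated as rank 0. Ranks are only ever compared between siblings.
    """
    ordered = sorted(grids, key=lambda g: g.get("hierarchyRank", 0), reverse=True)
    # Recurse so child grids are ordered among themselves, never against other levels.
    return [{**g, "grids": order_siblings(g.get("grids", []))} for g in ordered]


tree = [
    {
        "gridName": "a",
        "hierarchyRank": 1,
        "grids": [
            {"gridName": "a1", "hierarchyRank": 1},
            {"gridName": "a2", "hierarchyRank": 2},
        ],
    },
    {"gridName": "b", "hierarchyRank": 2},
]

result = order_siblings(tree)
```

Here "a2" moves ahead of "a1" within "a", and "b" moves ahead of "a" at the root, but "a2" is never compared with "b", so the rank cannot move a grid out of its parent.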

desruisseaux commented 2 years ago

According to Stack Overflow, HDF5 groups are read in the same order as they were created if those groups are created with the H5P_CRT_ORDER_TRACKED | H5P_CRT_ORDER_INDEXED flags set. I guess this is the case by default in the Python library since Chris observed that the order is preserved.

So HDF5 has this capability (although it is optional) and I think we can assume that all major HDF5 libraries support this optional feature.

ccrook commented 2 years ago

Thanks @desruisseaux.

Although I have observed that order is preserved, this might be just luck. So at some point I should probably investigate whether this is actually enforced in the python library (or possibly in the NetCDF4 implementation of HDF5 - I'm not clear how the APIs are connected).

There are very few examples where this will actually make a difference (ie where we are using overlapping grids) so I could imagine this being a bug hiding for a long time waiting to bite!

ccrook commented 2 years ago

The structured approach is now embedded in the GGXF definition.