iho-ohi / S-102-Product-Specification

This repository is open for developing the S-102 Bathymetric Surface Product Specification. Its contents are not an official publication in force, so please check the final version on the IHO website.

[PT11 Action XX] - Analysis of "uncertainty" for data reduction #8

Closed by RohdeBSH 7 months ago

RohdeBSH commented 1 year ago

Ways to deal with "uncertainty" differently will be investigated. The primary goal is to reduce the file size.

RohdeBSH commented 1 year ago

We took a closer look at how uncertainty is implemented, with the goal of reducing its memory requirements. We paid special attention to the case where a default value is specified for the entire uncertainty layer.

We tried different ways of separating the depth values and the uncertainty, for example by strictly separating the coverages. Furthermore, we have also tested different settings for compression.

The best result is obtained when the uncertainty is omitted, so the uncertainty should be truly optional. If the uncertainty is not needed, it should also be removed from the dataset, that is, from both the bathymetry coverage and Group F.

@AnnaWall01 could you please add this to the agenda for PT13? Thanks.

AnnaWall01 commented 1 year ago

@RohdeBSH Noted!

giumas commented 1 year ago

As commented during PT13, we need a mandatory way to provide quality for S-102. If you cannot assess it, you record it as unassessed, as is done for CATZOC in ENCs. It should be something that can be used for interoperability and UKC.

hasel001 commented 1 year ago

During PT13, we agreed in principle to make uncertainty fully optional by no longer requiring an "empty" dataset (one populated only with the fill value). @RohdeBSH and others have the action to propose the specific changes. After PT review, we will submit the proposed changes to the S-100WG to ensure any impacts on interoperability are appropriately handled.

RohdeBSH commented 1 year ago

Hi @giumas,

As I already tried to explain at PT13, this issue is not about changing anything fundamental in the concept of uncertainty. It is only about making the current procedure for transmitting the uncertainty more efficient. Only S-102 datasets that do not provide uncertainty are affected by this efficiency adjustment, in other words, datasets whose complete uncertainty coverage consists of fill values. The issue you raised may well be valid, but it is not part of this thread. The discussion about the missing mandatory quality specification should be conducted separately. Could you please open a new issue for it?

giumas commented 1 year ago

Hi @RohdeBSH,

Although I may understand the "out of sight, out of mind" principle that we are pursuing here, I have concerns that we are going in the wrong direction by totally removing the uncertainty layer without concurrently introducing another mandatory mechanism to provide a quality indicator for each cell node.

PS §1.1 states that "Incorporating aspects of the navigation surface concept [Smith et al, 2002], an S-102 bathymetric surface product is a digital elevation model which represents the seafloor in a regular grid structure." From the abstract of the cited paper: "The model - called a 'navigation surface' - consists of a high-resolution bathymetric grid with an uncertainty value assigned to each node on the grid." and, later, "For each node an uncertainty value is computed which becomes an integral part of the model. The distribution of the points around the mean is combined with the predicted uncertainty of each measurement to form an overall uncertainty model. For low-density single-beam and lead-line surveys, the area between measurements is modelled based on a triangular irregular network (TIN). The uncertainty model then incorporates the distance from the measurement, as well as the uncertainty of the measurement itself." Based on the above, we should agree that having an uncertainty value associated with each node is a core element of calling the DTM a navigation surface. It is not just an aspect that we can decide to ignore while still keeping a direct reference to the navigation surface concept.

In my opinion, we have 3 options here:

  1. Keep the uncertainty layer, even if it contains only fill values. This translates to "I have no idea what the uncertainty is for this depth". If we remove it, the message is less clear: the HO may have an idea about the associated uncertainty but decided not to populate this field. If a CATZOC is associated, things become even less clear. How, then, was the uncertainty component estimated in order to assign the CATZOC?
  2. Enforce that at least one of CATZOC and the uncertainty layer is populated. If the CATZOC is populated, then a 'worst-case scenario' uncertainty can be estimated for each depth.
  3. Remove any reference to the navigation surface concept. This option would still leave S-102 without a mandatory quality indicator. There will be no way to distinguish depths of different quality.

I have a slight preference for option 2.
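
To illustrate what the 'worst-case scenario' estimate in option 2 could look like, here is a rough sketch. It is only an assumption-laden illustration: the depth accuracy terms are the commonly quoted ZOC figures (a constant plus a percentage of depth) and should be checked against the current standard, and the names `ZOC_DEPTH_ACCURACY` and `worst_case_uncertainty` are hypothetical, not part of S-102.

```python
from typing import Optional

# Depth accuracy per CATZOC as (constant term in metres, percent of depth).
# Assumed values from the usual ZOC table; D and U give no usable figure.
ZOC_DEPTH_ACCURACY = {
    "A1": (0.5, 1.0),
    "A2": (1.0, 2.0),
    "B": (1.0, 2.0),
    "C": (2.0, 5.0),
}


def worst_case_uncertainty(catzoc: str, depth_m: float) -> Optional[float]:
    """Return a worst-case vertical uncertainty in metres for one depth.

    Returns None when the CATZOC carries no depth accuracy figure
    (for example "D" or "U"), so no estimate can be made.
    """
    terms = ZOC_DEPTH_ACCURACY.get(catzoc)
    if terms is None:
        return None
    constant, percent = terms
    return constant + percent / 100.0 * abs(depth_m)


# Example: a 30 m depth in a ZOC B area, and an unassessed area.
print(worst_case_uncertainty("B", 30.0))
print(worst_case_uncertainty("U", 30.0))
```

For unassessed data the function returns nothing rather than a number, which matches the point that such depths cannot be given a meaningful quality figure.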

Hope that this post clarifies my concerns related to this proposal.

hasel001 commented 1 year ago

@CHS-LynnPatterson relayed the following information from Hannu Pippeon et al. at the June 2023 S-101PT Meeting:

From speaking with Hannu Pippeon and crew here: it is an IMO mandatory element that “Uncertainty” be recorded in all navigational datasets for all navigational product specs. Now of course it is 100% up to us how we encode that. If we are making it “optional” in the data at the node level, it seems we need to make it mandatory in the metadata at the dataset level in order to comply with that directive.

poseiron01 commented 1 year ago

Hi y'all! As you can see above, Sweden's representative in S-98 has opened an issue in the S-98 GitHub repository based on the ongoing discussion here and on what Lynn relayed from Hannu et al. above. It also reflects Sweden's standpoint on this subject in the S-102PT. We must have a way to encode uncertainty in our product and make it mandatory. We agree with @giumas that there can be different ways to solve the issue, and it's the PT's job to decide how to encode it. However, this must be done in close connection with S-98. Have a great summer!

RohdeBSH commented 1 year ago

Hello Everyone,

I simply do not understand the discussion. What we are proposing here has absolutely nothing to do with the information being transported. It is not about what is transported, only about how. It is a purely technical proposal related to the HDF5 format. There is no impact on S-98. It is only about storing and transporting the same information more efficiently.

Nobody wants to remove uncertainty from the data. This is not even up for discussion, which is why I can't understand the objections/concerns at all.

The uncertainty is specified in several places in the data set (Ed2.1.0).

  1. as attributes at the BathymetryCoverage group (S-102 Ed2.1.0 Section 10.2.3)

    • /BathymetryCoverage/horizontalPositionUncertainty
    • /BathymetryCoverage/verticalUncertainty
  2. as attributes at the Group_001 group (S-102 Ed2.1.0 Section 10.2.5)

    • /BathymetryCoverage/BathymetryCoverage.01/Group_001/maximumUncertainty
    • /BathymetryCoverage/BathymetryCoverage.01/Group_001/minimumUncertainty
  3. at each grid-node (S-102 Ed2.1.0 Section 10.2.2)


This proposal does not consider the situation where a different uncertainty applies to each grid node, because in that case everything should remain as it is. We only consider the case where the same uncertainty applies to the entire coverage, that is, where it is a fixed value or the fill value.

The optimization is simply that, with a fixed value for the uncertainty, it is not necessary to store it redundantly at each grid node. Doing so is a pointless waste of resources.

S-102 assigns uncertainty to three categories.

  1. product uncertainty (S-102 Ed2.1.0 Section 10.2.3)
    • /BathymetryCoverage/horizontalPositionUncertainty
    • /BathymetryCoverage/verticalUncertainty
  2. coverage uncertainty (S-102 Ed2.1.0 Section 10.2.5)
    • /BathymetryCoverage/BathymetryCoverage.01/Group_001/maximumUncertainty
    • /BathymetryCoverage/BathymetryCoverage.01/Group_001/minimumUncertainty
  3. grid-node-uncertainty (S-102 Ed2.1.0 Section 10.2.2)

If the uncertainty at the individual grid nodes does not differ, it is sufficient to specify the uncertainty at the coverage level. This is already done implicitly. If /BathymetryCoverage/BathymetryCoverage.01/Group_001/maximumUncertainty and /BathymetryCoverage/BathymetryCoverage.01/Group_001/minimumUncertainty are identical, it simply means that this value is the uncertainty for all grid nodes of the coverage.
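
The rule can be sketched in a few lines. This is a minimal illustration only, not an implementation of the specification: a plain Python dictionary stands in for the attributes of the Group_001 HDF5 group, and the helper name `uniform_uncertainty` is hypothetical.

```python
from typing import Mapping, Optional


def uniform_uncertainty(group_attrs: Mapping[str, float]) -> Optional[float]:
    """Return the single uncertainty shared by all grid nodes, if any.

    `group_attrs` stands in for the attributes of
    /BathymetryCoverage/BathymetryCoverage.01/Group_001.
    Returns None when minimum and maximum differ, meaning that
    per-node uncertainty values are still required.
    """
    lowest = group_attrs["minimumUncertainty"]
    highest = group_attrs["maximumUncertainty"]
    return lowest if lowest == highest else None


# Coverage where every node carries the same (fill) value.
print(uniform_uncertainty(
    {"minimumUncertainty": 1000000.0, "maximumUncertainty": 1000000.0}
))

# Coverage with varying uncertainty: the per-node layer must stay.
print(uniform_uncertainty(
    {"minimumUncertainty": 0.3, "maximumUncertainty": 1.2}
))
```

A reader following this logic would only need the per-node uncertainty layer when the function returns None.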

The information about the uncertainty of each grid node is not lost; it is just located somewhere else.

Therefore, our proposal is just to allow more efficient storage of the data.

We have prepared two example files, which will hopefully make this clearer. Both files use a uniform uncertainty value for each grid node, in this case the fill value "1000000". Both files were created in version 2.1.0, but only one of them uses the uncertainty optimization. What is immediately noticeable is the reduced file size:

    • Without optimization: 12,709,760 bytes
    • With optimization: 10,823,936 bytes
    • Difference: 1,885,824 bytes (~1.8 Mbyte)

By means of optimization, the file size can be reduced by 14.8% in the case discussed here.
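
The percentage follows directly from the two file sizes. As a quick check, using only the figures quoted in this comment (the function name `size_reduction` is just for illustration):

```python
from typing import Tuple

SIZE_WITHOUT_OPTIMIZATION = 12_709_760  # bytes, example file without optimization
SIZE_WITH_OPTIMIZATION = 10_823_936     # bytes, example file with optimization


def size_reduction(before: int, after: int) -> Tuple[int, float]:
    """Return the saved bytes and the saving as a percentage of `before`."""
    saved = before - after
    return saved, 100.0 * saved / before


saved_bytes, saved_percent = size_reduction(
    SIZE_WITHOUT_OPTIMIZATION, SIZE_WITH_OPTIMIZATION
)
print(f"{saved_bytes} bytes saved ({saved_percent:.1f} %)")
```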

Testdata_Uncertainty.zip

poseiron01 commented 1 year ago

Hi all! I think two different issues may have been mixed up. We are not opposing BSH's proposal to make file handling more efficient; our concern is the absence of a mandatory quality/uncertainty indicator in the PS today. Whatever we land on should be supported by input from S-98. These are two separate issues, and somewhere in the thread they got mixed up. We should create a separate issue for this and have the discussion there. Can one migrate items/posts from one issue to a new one?

giumas commented 1 year ago

@poseiron01, I believe that the two issues should be managed in conjunction because they have an overlap. If we make the uncertainty layer fully optional as proposed by @RohdeBSH, then the lack of a mandatory quality indicator for S-102 PS becomes even worse. The ideal solution should cover both issues, clarifying the relationship between the quality indicators in the BathymetryCoverage group and the uncertainty layer. (My understanding is that maximumUncertainty and minimumUncertainty are populated with the maximum and the minimum values in the uncertainty layer, so not really a third distinct category of uncertainty.)

hasel001 commented 1 year ago

I think I see some common ground in front of all of us regarding these issues.

While @bhell asserted his opinion that node-based uncertainty is essential, let's postpone that discussion for a moment.

Both @poseiron01 and @giumas assert their opinions that a mandatory quality/uncertainty indicator is needed for the S-102 Product Spec.

@RohdeBSH, would having such a mandatory indicator defeat your aim of file size reduction? (I don't think so, but I want to understand thoroughly before proceeding.)

RohdeBSH commented 7 months ago

Closed because it is completed.