Range of data values, and how encoded values can map to units? (e.g. 16-bit PNG support)

jerstlouis commented 4 years ago

In our implementation, we currently support coverages encoded in 16-bit PNG, with the 16-bit range mapped linearly to the range of values for the coverage, similar to how it is done in the GeoPackage tile gridded coverage extension.

Two main points:

How could the .../{collectionId} resource describe the range of the values that actually occur in the coverage? e.g. -11000..9000, or 1 to 27? As far as I understand, the RangeType does not specify this; it only says that a 32-bit float or an unsigned 8-bit integer is used.
How could a linear mapping be established from encoded values to units like meters, to use a format like PNG and maximize precision while minimizing storage size?
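To make the second point concrete, here is a minimal sketch of the kind of linear mapping being asked about, assuming the full unsigned 16-bit range is stretched over the coverage's minimum and maximum. The function names and the -11000..9000 range (taken from the example above) are illustrative only; nothing here is defined by CIS or OGC API - Coverages.

```python
def linear_params(vmin, vmax, bits=16):
    """Return (scale, offset) such that value = encoded * scale + offset.

    Hypothetical helper: maps the unsigned integer range [0, 2**bits - 1]
    linearly onto [vmin, vmax].
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    levels = (1 << bits) - 1          # 65535 for 16-bit
    return (vmax - vmin) / levels, vmin


def encode(value, scale, offset, bits=16):
    """Quantize a measurement to the nearest integer code, clamped to range."""
    levels = (1 << bits) - 1
    code = round((value - offset) / scale)
    return max(0, min(levels, code))


def decode(code, scale, offset):
    """Recover the (approximate) measurement from an integer code."""
    return code * scale + offset


# Example range from the question: -11000 m to 9000 m
scale, offset = linear_params(-11000.0, 9000.0)
code = encode(1234.5, scale, offset)
approx = decode(code, scale, offset)
```

With this range the quantization step is about 0.31 m, so a decoded value differs from the original by at most half a step.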

pebau commented 4 years ago

@jerstlouis hm, as CIS has shamelessly stolen the range type structure from SWE Common DataRecord I guess the best, authoritative answer would come from the SWE/SOS experts.

jerstlouis commented 4 years ago

@pebau From glancing at https://portal.opengeospatial.org/files/?artifact_id=55939 (200 pages!), DataRecord ... AbstractDataComponent...

I see a couple things... QuantityRange and CategoryRange...

The CategoryRange class to define ranges of ordered categorical quantities

which in turn seem to be defined in https://portal.ogc.org/files/?artifact_id=38475&version=2&format=pdf (another 161 pages).

The CategoryRange might work e.g. for land classification coverage with 24 categories.

I wonder whether the QuantityRange makes sense for the values you can actually find within this coverage, as opposed to the potential values you could measure anywhere...

Any SWE/SOS experts in particular we could ping on this, and could the OGC API - Coverages specs provide some guidance and examples?

As for the mapping to some values for a specific encoding, that would potentially only apply for specific encodings, so I wonder how we could address that (e.g. things might normally be stored in floating point values, but to encode as PNG, since it doesn't have floats some linear mapping is required). Some additional properties could be defined?

jerstlouis commented 4 years ago

GeoServer WCS 2 example:

<swe:Quantity>
  <swe:description>RED_BAND</swe:description>
  <swe:nilValues>
    <swe:NilValues>
      <swe:nilValue reason="http://www.opengis.net/def/nil/OGC/0/unknown">0.0</swe:nilValue>
    </swe:NilValues>
  </swe:nilValues>
  <swe:uom code="W.m-2.Sr-1"/>
  <swe:constraint>
    <swe:AllowedValues>
      <swe:interval>100.0 150.0</swe:interval>
    </swe:AllowedValues>
  </swe:constraint>
</swe:Quantity>
</swe:field>

cmheazel commented 4 years ago

Here is an example using ISO 19115-2. I believe the Range Type in CIS is similar. The coverage is a SAR collection. Each pixel is a complex number (real and imaginary components) represented as 2's complement integers. "Count" is the SWE Common equivalent of integer. There is a lot of additional information you can add, but this was sufficient for my immediate needs.

I'm sure there is a range type you cannot represent using SWE Common, but I haven't found one yet.

Each component is stored in a 16-bit signed integer in 2’s complement format. 0 65535
Each component is stored in a 16-bit signed integer in 2’s complement format. 0 65535

joanma747 commented 4 years ago

In the past, in the Coverage DWG, there was a proposal to create a best practice showing how to use rangeType and how to apply SWE Common in different use cases. It never progressed due to lack of resources (time). IMHO there is still a need to do so. In my opinion WCS describes the domainSet in full detail, but the rangeType is only found in a few examples that are not enough to cover all common use cases (as this discussion shows). It is still a bit mysterious.

cmheazel commented 4 years ago

Action for this Sprint - capture RangeType examples. I'll add them to the API-Coverages Users Guide (once I create it).

cmheazel commented 4 years ago

Regarding the lookup table approach -- In my experience the lookup table is separate from the pixels. So in @jerstlouis's PNG example, the range type will always be 16-bit integers. So an indicator that the pixel values are indexes into a lookup table, and a reference to the lookup table, would solve the problem. Rules for interpolation between lookup values would also help.
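As a rough illustration of the lookup-table idea with an interpolation rule, the sketch below decodes a pixel index against a sparse table using linear interpolation between entries. The table layout, the function name `lut_decode`, and the choice of linear interpolation are all assumptions made for this example, not something defined by CIS or SWE Common.

```python
from bisect import bisect_right


def lut_decode(index, lut):
    """Decode a pixel index via a sparse lookup table.

    lut is a list of (index, value) pairs sorted by index (hypothetical
    layout). Indexes between two entries are linearly interpolated;
    indexes outside the table are clamped to the first or last value.
    """
    keys = [k for k, _ in lut]
    if index <= keys[0]:
        return lut[0][1]
    if index >= keys[-1]:
        return lut[-1][1]
    i = bisect_right(keys, index)      # first entry with key > index
    (k0, v0), (k1, v1) = lut[i - 1], lut[i]
    t = (index - k0) / (k1 - k0)
    return v0 + t * (v1 - v0)


# Illustrative table: three control points over the 16-bit index range
table = [(0, -11000.0), (32768, 0.0), (65535, 9000.0)]
mid_value = lut_decode(16384, table)
```

A table with only two entries reduces to the plain linear mapping, which is one way to see the linear case as a special case of the lookup approach.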

jerstlouis commented 4 years ago

@cmheazel to clarify my use case, it's not a lookup table per se but a linear mapping... (defined by a factor and offset). Of course you could populate a lookup table mapping the values, but that becomes impractical beyond 16-bit :)

I am really wondering where this linear mapping could be defined? It seems that the range type could address the minimum and maximum value found in the overall coverage.

But potentially the linear mapping might be specific to a particular encoding, e.g. what if the server decides to use 32-bit float when using GeoTIFF, but wants to use 16-bit integer for PNG (because it doesn't support floats)? I wonder how we could address this... unsignedIntegerLinearScale and unsignedIntegerLinearOffset properties? Also supported in GeoPackage gridded coverage extension is a per-tile linear mapping, which nicely maps the full 16-bit to the min/max of each tile, giving you very high precision and low space usage. That gets trickier as you would need to retrieve that information for each tile...
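To show why a per-tile mapping gives higher precision, here is a small sketch comparing the quantization step of one coverage-wide mapping with that of a mapping fitted to a single tile. The helper `tile_params` and the sample tile values are made up for illustration; this is not the GeoPackage schema itself.

```python
def tile_params(values, bits=16):
    """Fit (scale, offset) to one tile so that value = code * scale + offset.

    Hypothetical helper: stretches the unsigned integer range over the
    tile's own min/max. A flat tile gets scale 1.0 to avoid a zero scale.
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return scale, lo


# Coverage-wide range from the earlier example: -11000..9000
global_scale, global_offset = tile_params([-11000.0, 9000.0])

# One tile whose values only span 120..480 (illustrative numbers)
tile_scale, tile_offset = tile_params([120.0, 300.0, 480.0])

# The scale is also the quantization step: smaller means finer precision
gain = global_scale / tile_scale
```

The cost is the one noted above: the client needs the (scale, offset) pair for every tile it fetches, not just one pair for the whole coverage.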

In our GNOSIS Map Tiles format we have that info directly in the tile format: a couple of doubles at the beginning of the data specify the min & max, followed by 16-bit values mapping to that range (coverageQuantized16 gridded coverage type). The data is also encoded with Paeth filtering, which compresses even better than PNG because PNG doesn't compress 16-bit values very well.

Schpidi commented 4 years ago

In the Earth Observation Application Profile version 1.1, we tried to support the domain-specific requirements, including a mapping from data type to data semantics. See <wcseo:dataType2dataSemantics> in the example in http://docs.opengeospatial.org/is/10-140r2/10-140r2.html#_range_type

We also started to work on a potential Best Practice document, but this was never officially submitted to OGC, I'm afraid. The relevant part is https://eox-a.github.io/eo-data-access-bp/#rangetype-description-enhancements with sources at https://github.com/EOX-A/eo-data-access-bp/blob/master/spec/clause_08_rangetype-description-enhancements.adoc

cmheazel commented 4 years ago

@jerstlouis The general problem is how to convert pixel data back into the original measurement. This could be a lookup table, a linear function, a polynomial function, or even a multi-step workflow. But none of that is part of the range type. It's what I've always known as exploitation support data. As such, it lives in the CIS Metadata sector, possibly as SensorML.

jerstlouis commented 4 years ago

@cmheazel right, a linear function to convert to the measurement is what I was hoping for. Burying this deep in SensorML inside CIS metadata is not going to provide the interoperability.

Something like this ( http://docs.opengeospatial.org/is/17-066r1/17-066r1.html#_using_the_scale_and_offset_values ) in GeoPackage gridded coverage extension is what I was hoping for, however I find that the reverse factor and offset (going in the opposite direction, i.e. what transformation was applied to the original measurement) would be a more natural way to express this.
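The two directions carry the same information, as the sketch below shows. It assumes the decoding form `value = code * scale + offset` and names the reverse parameters `factor` and `shift` for `code = value * factor + shift`; those two names are invented here and do not come from any specification.

```python
def reverse_params(scale, offset):
    """From decode parameters (value = code * scale + offset), derive the
    parameters of the transformation applied to the original measurement
    (code = value * factor + shift)."""
    return 1.0 / scale, -offset / scale


def forward_params(factor, shift):
    """Inverse of reverse_params: recover (scale, offset) for decoding."""
    return 1.0 / factor, -shift / factor


# Illustrative numbers: codes are half-units, shifted by -100 units
scale, offset = 0.5, -100.0
factor, shift = reverse_params(scale, offset)

code = 25.0 * factor + shift          # apply the "reverse" form to a value
value = code * scale + offset         # decode it again
```

Either pair can be published; a client given one pair can always compute the other, so the choice is mainly about which reads more naturally in the metadata.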

Schpidi commented 4 years ago

Coverages SWG call: There should be clear guidance in the User Guide. OAPI-Coverages inherits this from CIS and thus from SWE Common. If further standardization is necessary this should be done in CIS.

jonblower commented 3 years ago

Just to add that I come across this use case a lot too, so it's good to see discussion of this. It's a very common way to compress data in NetCDF files, and it seems to be getting more common to encode data in image formats (even monochrome, at 1 byte per pixel). It would be nice to be able to achieve this without burying it too deep.

pebau commented 3 years ago

Would be good to add as an extension spec to CIS 1.1. For schema compatibility the metadata slot is a good place; actually, it has been foreseen for extensions like this.

But I am still not sure that we should do our thing independently from others with similar tasks, such as SWE / SensorThings. @jerstlouis is the Hero of Harmonization, @kathischleidt has expressed interest - more would certainly jump on the bandwagon, shall we...?

opengeospatial / coverage-implementation-schema

Range of data values, and how encoded values can map to units? (e.g. 16-bit PNG support) #5