RangeType / SWE Common Confusion

jerstlouis commented 4 years ago

OGC API - Coverages implementors are extremely confused as to how to populate the range type. CIS 1.2.3 says:

1.2.3 SWE Common The coverage RangeType component (see Clause 6) utilizes the SWE Common [4] DataRecord. Consequently, the semantics of sensor data acquired through SWE standards can be carried over into coverages without information loss. See also Annex D.4.

The definition of a DataRecord should state what the value represents, NOT its data type. However, most implementations of OGC API - Coverages currently fill definition with URIs intended for the SWE Common dataType, which is found deep inside a BinaryEncoding element (8.6.1 BinaryEncoding Element of http://portal.opengeospatial.org/files/?artifact_id=41157), for example "ogcType:signedShort" or "ogcType:float32". Also, it seems that full URLs should be used as URIs.

SWE Common standard is 193 pages and extremely difficult to follow.

There does not seem to be any example JSON encoding of a BinaryEncoding, and how exactly one would define that the data is encoded as a 32-bit float. Is this something which should be described in the RangeType, or should that be left to the binary encoding only?

Also, I wonder if in the case of multi-bands this is really supposed to be separate fields, or if it should be one field with components? The SWE Common specifications are really complex so it would be good to have someone who knows the standard better brief the SWG, which could then provide practical example JSON encodings of range type for the 95% common use cases in CIS/Coverages.

Schpidi commented 4 years ago

@ghobona could you maybe help us to approach SWE Common for help on implementing proper examples as suggested by @jerstlouis? Thanks

jerstlouis commented 4 years ago

In particular it was pointed out that the example http://schemas.opengis.net/cis/1.1/json/examples/10_2D_regular.json definition might not be right:

    "rangeType": {
        "@context": "http://localhost/json-ld/rangetype-context.json",
        "type": "DataRecordType",
        "id": "examples:CIS_RT_10_2D",
        "field":[{
            "type": "QuantityType",
            "id": "examples:CIS_RT_F_10_2D",
            "definition": "ogcType:unsignedInt",
            "uom": {
                "type": "UnitReference",
                "id": "examples:CIS_RT_F_UOM_10_2D",
                "code": "10^0"
            }
        }]
    }

ogcType:unsignedInt (or the proper URL URI http://www.opengis.net/def/dataType/OGC/0/unsignedInt) is intended for the DataType of BinaryEncoding, not for the definition of what this value represents, which should probably be something like .../elevation (e.g. Radiance in some valid examples).
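
A minimal sketch of how an implementation might flag this misuse: it walks the fields of a range type and reports any definition that looks like a SWE Common data type identifier rather than a semantic term. The function name `find_datatype_definitions` and the two prefixes it checks are assumptions made for illustration; neither CIS nor SWE Common defines such a check.

```python
# Prefixes that identify SWE Common binary data types (assumed list for this sketch).
DATATYPE_PREFIXES = (
    "ogcType:",
    "http://www.opengis.net/def/dataType/",
)


def find_datatype_definitions(range_type):
    """Return (field id, definition) pairs whose definition is a data type URI."""
    # The CIS 1.1 examples use "field"; the SWE Common JSON encoding uses "fields".
    fields = range_type.get("fields", range_type.get("field", []))
    return [
        (f.get("id"), f["definition"])
        for f in fields
        if f.get("definition", "").startswith(DATATYPE_PREFIXES)
    ]


# Trimmed copy of the 10_2D_regular.json range type quoted above.
cis_example = {
    "type": "DataRecordType",
    "field": [{
        "type": "QuantityType",
        "id": "examples:CIS_RT_F_10_2D",
        "definition": "ogcType:unsignedInt",
    }],
}

print(find_datatype_definitions(cis_example))
```

A check like this can only detect the known data type prefixes; whether the remaining definitions point at a meaningful vocabulary term still has to be judged by a person.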

jerstlouis commented 4 years ago

In SWE Common, http://www.opengis.net/def/dataType/OGC/0/unsignedInt is defined in Table 8.1 on page 112, "URI to use in “dataType” attribute".

If I understand it correctly, this is the relevant relationship I gather from the UML diagrams dispersed over the 193 pages of the SWE Common specification:

https://gist.github.com/jerstlouis/5740f956898c268e8253b5008bb9723d

From this, the dataType is an attribute of a Component, inside a ComponentOrBlock's byComponent, inside a BinaryEncoding's member, as a way to define an AbstractEncoding, inside a DataArray or a DataStream. A DataArray would then be a valid field of a DataRecord, which is what CIS RangeType is defined as.

But in those CIS 1.1 examples, dataType is being used directly for the definition attribute of a Quantity, which is inherited from AbstractDataComponent via AbstractSimpleComponent.

My point is that http://www.opengis.net/def/dataType/OGC/0/unsignedInt is not a URI intended for that definition, it's strictly intended for binary encoding components data types.

The “definition” attribute identifies the property (often an observed property in our context) that the data component represents by using a scoped name. It should map to a controlled term defined in an (web accessible) dictionary, registry or ontology. Such terms provide the formal textual definition agreed upon by one or more communities, eventually illustrated by pictures and diagrams as well as additional semantic information such as relationships to units and other concepts, ontological mappings, etc.

Examples

The definition may indicate that the value represents an atmospheric temperature using a URN such as “urn:ogc:def:property:OGC::SamplingTime” referencing the complete definition in a register. The definition may also be a URL linking to a concept defined in an ontology such as “http://www.opengis.net/def/OGC/0/SamplingTime”. The name could be “Sampling Time”, which allows quick identification by human data consumers. The description could be “Time at which the observation was made as measured by the on-board clock” which adds contextual details.

None of that suggests binary data types... But "The definition may also be a URL linking to a concept defined in an ontology" may be giving some permissions to do pretty much anything here?

In my opinion, this is all way too vague to provide the necessary interoperability as far as OGC API - Coverages is concerned. An implementor cannot be expected to try to make sense of this whole CIS dependency on SWE Common without some very clear guidance and examples.

pebau commented 4 years ago

yes, ideally no integer etc. is used, but a more meaningful SWE URL. Unfortunately, SWE never completed that URL universe, so it still seems to be missing today.

jerstlouis commented 4 years ago

CC @KathiSchleidt @joanma747

KathiSchleidt commented 4 years ago

You definitely want to reference whatever it is the number being provided in the rangeSet represents, something like: http://vocab.nerc.ac.uk/collection/A04/current/SPOT/

Side comment (not sure if fits here or elsewhere), I noticed an error in the JSON Schema for rangeType, "type" attribute of "field" is an enum, only contains "QuantityType", there are quite a few more under SWE

pebau commented 4 years ago

If I interpret it correctly this URL only says "SPOT", nothing else. However, the range type should express the technicality allowing to interpret it. The fact that it is SPOT does not tell me whether this is optical, SAR, etc, so I would rather expect that SPOT URL under the coverage metadata.

Re JSON issues, can we collect them in some place systematically for discussion? I could create a page on https://external.ogc.org/twiki_public/CoveragesDWG/WebHome, or do you know any other good place?

KathiSchleidt commented 4 years ago

Hi Peter, as you know, I'm pretty clueless as to what you satellite boys actually measure, just interested in the data structure. But, if you take the time to click on the link, you'll find out more. Also clueless if SPOT fits your current purpose, but if you navigate up one level from SPOT via broader, you come to: http://vocab.nerc.ac.uk/collection/A04/current/SatelliteImagery/ Maybe something more suitable there. Also - further information could be provided with each concept, and there are probably better vocabularies out there. I checked the nerc vocab as I know it from oceanographic properties, was quite surprised that also covers SatelliteImagery :)

pebau commented 4 years ago

@KathiSchleidt yes, that link was the second thing I tried; unfortunately, the server does not respond for me.

But anyway, these are just atomic facts, but not an ontology giving further insights to some tool evaluating it. Likely there is relevant work around (and most likely overlapping as you hint), but it is not my core field, too, so I am not sufficiently informed.

IMHO this is just too large to clarify in one ticket, it definitely deserves a research project - anybody interested to team up under a suitable funding scheme?

KathiSchleidt commented 4 years ago

@pebau ah - how did you find the 2nd link if the server is not responding? ;)

And - mixing the ObservableProperty (WHAT you're providing) with the measurement methodology behind it (the HOW of what you're providing) is just bad style from an O&M/SWE perspective - reason I've been pushing the approach of utilizing Observations to describe rangeType. Maybe too large to clarify in one ticket, but not sure we need a research project for this.

To my understanding, the approach of utilizing Observations as Coverage metainformation is not new, some folks are putting Observations into the Coverage Metadata. rangeType would be more correct. Need to take a 2nd look, but would be interesting to try and insert the Observation here

pebau commented 4 years ago

ah - how did you find the 2nd link if the server is not responding? ;)

interestingly the first link worked, clicking on the other one ran into timeout. Must be quantum fluctuations :)

KathiSchleidt commented 4 years ago

Ah - aurora borealis futzing with your connections up north? ;)

http://vocab.nerc.ac.uk/collection/A04/current/SatelliteImagery/ provides the following narrower options:

Narrower | http://vocab.nerc.ac.uk/collection/A04/current/MODIS/
Narrower | http://vocab.nerc.ac.uk/collection/A04/current/LANDSAT/
Narrower | http://vocab.nerc.ac.uk/collection/A04/current/StableLights/
Narrower | http://vocab.nerc.ac.uk/collection/A04/current/SPOT/

hylkevds commented 4 years ago

I've had to have a long look at SWE Common, since the STA Tasking extension uses it. Especially the DataRecord type. As far as I can see, the DataRecord type is the equivalent of a json class: A set of named fields, each field being an AbstractDataComponent. It is meant for grouping other components. The DataRecord type itself does not hold "data" as such, the subComponents do. As such, a DataRecord

However, the DataRecord UML Definition is broken, because it states:

The name of each field must be unique within a given “DataRecord” instance so that it can be used as a key to uniquely identify and/or index each one of the record components.

However, the AbstractDataComponent class does not define a field "name"... So each field in a DataRecord must have a unique name field that does not exist?

pebau commented 4 years ago

hm, my understanding always was that this allows to define the record components with their particular names, such as red, green, blue. (And no, I did not take "each field" literally = recursive, but just thought of the first level underneath Data Record.) Admittedly I did not reflect much on this, thought the SWEeties have made it all work. Difficult, though, was the lack of examples.

ghobona commented 4 years ago

@alexrobin @mikebotts Please have a look at the starting comment above and provide your views.

Cc: @ogcscotts

alexrobin commented 4 years ago

Hi all,

A DataRecord is indeed a group of fields, each one with a unique name. The field element with the name attribute is defined in the DataRecordType in the XML schema (http://schemas.opengis.net/sweCommon/2.0/record_components.xsd) and UML (page 45 of the implementation spec), not as part of the AbstractDataComponent class.

The intent of the definition attribute on the DataRecord is purely semantic. It is meant to provide semantic information about the record as a whole, but should NOT be used to carry data type information. For example, in the case of a coverage range, it could be used to tag a group of bands together whenever it makes sense (e.g. something like http://someURI/LANDSAT/InfraredBands), although in most cases I think the semantic information should be provided on each band separately. I agree that it is probably not the best place to tag the data as being observed from a specific satellite (e.g. http://vocab.nerc.ac.uk/collection/A04/current/LANDSAT). I think higher level coverage metadata is probably a better place for this type of information than down inside the range description. I would also like to remind you that the DataRecord definition is not mandatory, while the definition of each range field is.

SWE Common separates structure and semantics from encoding, so the exact data type used for each range field should be provided separately. One way is to provide a SWE Binary encoding descriptor, but it could also be done differently if it makes more sense in the Coverage API (for example, it can be provided by the encoding format itself, as most formats such as NetCDF, TIFF, GRIB do already).

Regarding the JSON encoding, there is no official OGC standard that defines it but we have released an OGC Best Practice document with mappings from either SWE Common XML or UML to JSON. The BP document also provides examples that can hopefully help. To reuse the example provided above, the correct encoding would be:

"rangeType": {
  "@context": "http://localhost/json-ld/rangetype-context.json",
  "type": "DataRecord",
  "id": "examples_CIS_RT_10_2D",
  "fields":[{
    "name": "band1",
    "type": "Count",
    "id": "examples_CIS_RT_F_10_2D",
    "definition": "http://someURI/LANDSAT/Band7"
  },{
    "name": "band2",
    "type": "Quantity",
    "id": "examples_CIS_RT_F_10_2D",
    "label": "SWIR Band",
    "description": "SPOT-5 HRG SWIR Band, 1.58-1.75µm",
    "definition": "http://someURI/SPOT5/SWIR",
    "uom": {
      "code": "W.sr-1.m-2.Hz-1"
    }
  }]
}

The main changes are:

The type values must not include the Type suffix (DataRecordType -> DataRecord, QuantityType -> Quantity)
Each field must have a name
'id' properties must be valid against the XML ID datatype so cannot contain the : characters
The field definition should not be a data type, but reference semantic information about the band
If no unit is provided, the coverage value is often better modeled as a Count (i.e. digital pixel value) instead of a Quantity. I added a second field to show an example with an actual radiometric unit.
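
The structural changes in this list could be sketched as a small migration helper. This is only an illustration under stated assumptions: `upgrade_range_type`, `strip_type_suffix` and `xml_safe_id` are hypothetical names, the generated `band1`, `band2`, ... field names are placeholders, and replacing `:` is only the one ID restriction mentioned here, not a full XML ID validation. The helper also renames `field` to `fields`, as the corrected example does.

```python
import copy


def strip_type_suffix(type_name):
    """DataRecordType -> DataRecord, QuantityType -> Quantity."""
    return type_name[:-len("Type")] if type_name.endswith("Type") else type_name


def xml_safe_id(identifier):
    """Replace ':' so the id no longer violates the XML ID datatype."""
    return identifier.replace(":", "_")


def upgrade_range_type(cis_range_type):
    """Return a copy of a CIS 1.1 style rangeType with the structural fixes applied."""
    rt = copy.deepcopy(cis_range_type)
    rt["type"] = strip_type_suffix(rt["type"])
    if "id" in rt:
        rt["id"] = xml_safe_id(rt["id"])
    # The CIS 1.1 examples use "field"; the corrected example uses "fields".
    old_fields = rt.pop("field", None) or rt.pop("fields", [])
    rt["fields"] = []
    for index, field in enumerate(old_fields, start=1):
        field["type"] = strip_type_suffix(field["type"])
        if "id" in field:
            field["id"] = xml_safe_id(field["id"])
        # Placeholder name (assumption); real names should come from the data provider.
        field.setdefault("name", f"band{index}")
        rt["fields"].append(field)
    return rt


old = {
    "type": "DataRecordType",
    "id": "examples:CIS_RT_10_2D",
    "field": [{
        "type": "QuantityType",
        "id": "examples:CIS_RT_F_10_2D",
        "definition": "ogcType:unsignedInt",
    }],
}

print(upgrade_range_type(old))
```

The definition is deliberately left untouched: picking a semantic URI for each band, and deciding between Count and Quantity, needs knowledge of the data that a mechanical rewrite does not have.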

Note that it is possible in SWE Common to provide more information about the range values (like spectral domain, etc.) via the quality element in a more formal manner. I can provide examples if you're interested.

pebau commented 4 years ago

thanks, Alex! Any and all examples are welcome. Correct me if I am wrong, but I feel that CIS has done it almost right according to your material:

use of DataRecord (independent from the format) is pretty much as described AFAICS, except for what follows:
id in JSON contains ":" which disqualifies it for XML: "examples:CIS_RT_10_2D". However, our philosophy (which you may or may not share) was to allow max freedom per format and not impose the union of all restrictions on each format. Similar to PNG, whose restrictions are not propagated into NetCDF.
Quantity in practice is used for data types simply in lack of a commonly accepted ontology (such as an OGC resolver branch). Once there is common agreement, that could be promoted into the communities.
which entails BTW that the uom is "1", in UCUM expressed as "10^0".

So a main question is: do we have a common terminology for measurement types? Also, while I understand that Quantity is on a higher semantic level, can we transport the data type somewhere in the DataRecord if needed?

KathiSchleidt commented 4 years ago

Stupid question - where do you take the "name" attribute from? To my understanding, SWE does not inherit from GML (not featureTypes), can't find name either in the UML model or in the XML encoding :? Kathi

pebau commented 4 years ago

PS: that uom remark of mine was nonsense in the sense that it is not directly related to the data type (mis)use under discussion.

alexrobin commented 4 years ago

Hi Peter,

Yes, I guess I should have started by saying that using the DataRecord to describe a coverage range was the right choice. It makes complete sense and keeps things well correlated with what we do in Sensor Web standards. The changes I recommended are just details. I do agree with your philosophy regarding not imposing ID syntax to the JSON encoding. We tend to do that in our implementation to ease translation between XML and JSON but it's not a hard requirement.

The best way to add the data type directly in the DataRecord description would be to add it as an extra property of each field I think. In XML we would do it via the SWE Common extensions hook, but in JSON, I don't see any problem making it an additional property of the object, like so:

"fields":[{
    "name": "band1",
    "type": "Count",
    "dataType": "http://www.opengis.net/def/dataType/OGC/0/float32",
    "id": "examples_CIS_RT_F_10_2D",
    "definition": "http://someURI/LANDSAT/Band7"
}]

In that case, you could use any of the data type URIs defined in OGC registry here, but you could also use any other datatype taxonomy.
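
If a dataType property is carried on each field as suggested, a client could use it to decode raw samples. The sketch below maps the three data type URIs mentioned in this thread to Python `struct` format codes. The bit widths (16-bit signedShort, 32-bit unsignedInt), the big-endian default and the helper name `decode_samples` are assumptions of this sketch, not something the thread or CIS prescribes.

```python
import struct

BASE = "http://www.opengis.net/def/dataType/OGC/0/"

# Data type URI -> struct format code (widths are assumptions, see lead-in).
STRUCT_CODES = {
    BASE + "signedShort": "h",   # 16-bit signed integer
    BASE + "unsignedInt": "I",   # 32-bit unsigned integer
    BASE + "float32": "f",       # 32-bit IEEE float
}


def decode_samples(field, raw, byte_order=">"):
    """Decode a bytes buffer into a list of values using the field's dataType."""
    code = STRUCT_CODES[field["dataType"]]
    size = struct.calcsize(byte_order + code)
    count = len(raw) // size
    return list(struct.unpack(f"{byte_order}{count}{code}", raw[:count * size]))


band1 = {
    "name": "band1",
    "type": "Count",
    "dataType": BASE + "float32",
    "definition": "http://someURI/LANDSAT/Band7",
}

raw = struct.pack(">3f", 0.5, 1.0, 2.5)
print(decode_samples(band1, raw))
```

In practice the byte order would have to come from the encoding as well, which is one more argument for treating all of this as encoding-level information.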

Regarding semantics of measured quantities, we tend to use QUDT for fundamental physical quantities, and we also maintain our own (often more specific) properties in the SWE/SensorML Property Ontology. We also maintain a Spectral Band Ontology that could be of interest to this group to better qualify coverage bands.

alexrobin commented 4 years ago

@KathiSchleidt The name is an attribute of the DataRecord/field element in the XML schema. It doesn't appear directly in the UML model because it is implied as part of the <<soft-typed>> property stereotype which was used in the UML to XML encoding rules at the time.

jerstlouis commented 4 years ago

@alexrobin Thank you so much for all the great insights on this issue!

I tend to agree that the data type is more a property of the encoding, so I would support leaving it up to the binary encoding to define this, given that the CIS JSON representation of the range type is intended to describe coverages potentially available in multiple encodings, which may well use different data types.

The missing semantic aspect, and misuse of the definition as a binary data type in the CIS 1.1 examples really was the core of this issue.

@pebau Regarding the units of measure, a unit of 1 or 10^0 tells me absolutely nothing as a user in what physical units things are measured. That is another part of the examples I had trouble with. W.sr-1.m-2.Hz-1 does.

@alexrobin Something else we were wondering (https://github.com/opengeospatial/ogcapi-coverages/issues/94) is how one could possibly describe in the DataRecord the minimum / maximum value that one might encounter, either in the dataset specifically, or in general. Would SWE have something for that?

And something related to the intersection of both these issues: how could one specify how numerical values map to a physical unit when some kind of scale factor, or a linear mapping with an offset as well, has been established?

KathiSchleidt commented 4 years ago

@alexrobin but - also doesn't seem to be in the 2.0 version of the XML Schema, least not the one I'm accessing!

are we both referring to the DataRecordType from: http://schemas.opengis.net/sweCommon/2.0/record_components.xsd

alexrobin commented 4 years ago

@jerstlouis Yes, SWE has a mechanism to define constraints on each field. In JSON, a min/max constraint would look like:

{
    "name": "band1",
    "type": "Count",
    "dataType": "http://www.opengis.net/def/dataType/OGC/0/float32",
    "id": "examples_CIS_RT_F_10_2D",
    "definition": "http://someURI/LANDSAT/Band7",
    "constraint": {
        "type": "AllowedValues",
        "intervals": [ [0,255] ]
    }
}

There can be several intervals, hence the double array.
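
A client could evaluate such a constraint roughly as follows. This is a sketch under assumptions: both interval bounds are treated as inclusive, only the `intervals` member shown above is handled, and `is_allowed` is a hypothetical helper name.

```python
def is_allowed(field, value):
    """Check a value against the field's AllowedValues intervals, if any."""
    intervals = field.get("constraint", {}).get("intervals")
    if not intervals:
        # No interval constraint declared: nothing to check in this sketch.
        return True
    # A value is allowed if it falls inside at least one [low, high] interval.
    return any(low <= value <= high for low, high in intervals)


band1 = {
    "name": "band1",
    "type": "Count",
    "definition": "http://someURI/LANDSAT/Band7",
    "constraint": {
        "type": "AllowedValues",
        "intervals": [[0, 255]],
    },
}

print(is_allowed(band1, 128), is_allowed(band1, 256))
```

Note that this describes which values may occur in general; it does not by itself say what the actual minimum and maximum of a particular dataset are.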

There is no built-in way in SWE to define a scale factor and offset. Scale factors can be provided in the UCUM uom code, either as a multiplier (e.g. 25.m) or as a divider (e.g. [deg]/60), but this is probably not enough for the coverage use case.

One possibility would be to think of scale and offset as an encoding issue as well, since AFAIK, they are mostly used for data storage optimization/compression purposes. Most geospatial file formats can handle that (at least HDF, NetCDF, GRIB, GeoTIFF do), but I guess we could have an issue if one of them doesn't...

alexrobin commented 4 years ago

@KathiSchleidt We are definitely talking about the same SWE Common schema. The name is declared as an attribute of the field element of DataRecordType.

<complexType name="DataRecordType">
  ...
  <element maxOccurs="unbounded" minOccurs="1" name="field">
    <complexType>
      <complexContent>
        <extension base="swe:AbstractDataComponentPropertyType">
          <attribute name="name" type="NCName" use="required"/>
        </extension>
      </complexContent>
    </complexType>
  </element>
  ...
</complexType>

I think what is confusing is that the JSON object is a fusion of the field property and its object value. The intent was to simplify the object hierarchy in JSON compared to XML.
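
To illustrate that fusion, here is a rough sketch that reads an XML DataRecord and produces the flattened JSON shape used in the examples above, with the name taken from the swe:field property and the type taken from the component element inside it. It assumes inline components only (no xlink:href references), copies only plain attributes and a uom code, and `record_to_json` is a hypothetical helper, not part of the Best Practice.

```python
import xml.etree.ElementTree as ET

SWE_NS = "http://www.opengis.net/swe/2.0"

RECORD_XML = """<swe:DataRecord xmlns:swe="http://www.opengis.net/swe/2.0">
  <swe:field name="band2">
    <swe:Quantity definition="http://someURI/SPOT5/SWIR">
      <swe:uom code="W.sr-1.m-2.Hz-1"/>
    </swe:Quantity>
  </swe:field>
</swe:DataRecord>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.split("}", 1)[1] if "}" in tag else tag


def record_to_json(record_xml):
    """Fuse each swe:field property with its component into one JSON object."""
    record = ET.fromstring(record_xml)
    fields = []
    for field in record.findall(f"{{{SWE_NS}}}field"):
        component = field[0]  # the single component wrapped by the field property
        obj = {"name": field.get("name"), "type": local_name(component.tag)}
        obj.update(component.attrib)  # e.g. definition, id
        uom = component.find(f"{{{SWE_NS}}}uom")
        if uom is not None and uom.get("code"):
            obj["uom"] = {"code": uom.get("code")}
        fields.append(obj)
    return {"type": local_name(record.tag), "fields": fields}


print(record_to_json(RECORD_XML))
```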

KathiSchleidt commented 4 years ago

@alexrobin Sorry, my bad, hadn't seen the name attribute tacked on, was looking for an element like gml:name. Should have just looked at my own old examples! May I ask if you see any issues with the encoding below? What I came up with for air quality reporting years ago, would be curious if I did it correctly!

<swe:DataRecord>
    <swe:field name="StartTime">
        <swe:Time definition="http://www.opengis.net/def/property/OGC/0/SamplingTime">
            <swe:uom xlink:href="http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"/>
        </swe:Time>
    </swe:field>
    <swe:field name="EndTime">
        <swe:Time definition="http://www.opengis.net/def/property/OGC/0/SamplingTime">
            <swe:uom xlink:href="http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"/>
        </swe:Time>
    </swe:field>
    <swe:field name="Verification">
        <swe:Category definition="http://dd.eionet.europa.eu/vocabularies/aq/observationverification"/>
    </swe:field>
    <swe:field name="Validity">
        <swe:Category definition="http://dd.eionet.europa.eu/vocabularies/aq/observationvalidity"/>
    </swe:field>
    <swe:field name="Value">
        <swe:Quantity definition="http://dd.eionet.europa.eu/vocabulary/aq/primaryObservation/hour">
            <swe:uom xlink:href="http://dd.eionet.europa.eu/vocabularyconcept/uom/concentration/ug.m-3"/>
        </swe:Quantity>
    </swe:field>
</swe:DataRecord>

jerstlouis commented 4 years ago

@alexrobin Great that we can do things like 25.m or as a divider [deg]/60. As UCUM is yet another dependency that implementors might not be familiar with, I suggest CIS/OGC API - Coverages should again provide guidance and clear examples for the 80% common use cases scenarios.

Most geospatial file formats can handle that (at least HDF, NetCDF, GRIB, GeoTIFF do), but I guess we could have an issue if one of them doesn't...

That is exactly the trouble. This comes up, for example, when trying to use a generic 16-bit PNG to encode the values. The GeoPackage Gridded Coverage Extension, for example, allows defining a linear transform on either a global or per-tile basis, and that transform is stored in ancillary tables. This might be outside the scope of SWE Common/RangeType, but at the moment we have nowhere else for this. I wonder whether an HTTP response header would be an appropriate place for this, to complement the encodings that don't support it (which could be localized to the subset response)?

jerstlouis commented 3 years ago

@alexrobin @Schpidi A practical question related to implementing coverage styling in our client:

How could the range type / SWE Common DataRecord describe that the sentinel-2 radiance values are 12-bit values, which should be scaled down by 1.0 / 4096 for example? (assuming my understanding is correct)

Would that be something like "4096.W.sr-1.m-2.Hz-1" ?

If the GeoTIFF encoding uses 16-bit integers, is there any GeoTIFF tag inside GeoTIFF that would otherwise carry that information? (so that the CIS JSON and GeoTIFF encoding are consistent, and can share the same range type information).

If that tag is not present, that would force the CIS JSON encoding to be consistent, so that the values there are also scaled up to 12-bit values by multiplying by 4096.

Thank you!

EmDevys commented 3 years ago

FYI GeoTIFF includes no information on radiometry, this is part of TIFF, with 3 TIFF tags BitsPerSample and PhotometricInterpretation + SamplesPerPixel. GeoTIFF is only (this is its scope) CRS definition and georeference of image / grid to ground CRS

jerstlouis commented 3 years ago

Thank you @EmDevys. What I should have said is TIFF or GeoTIFF tags. But as you point out, the GeoTIFF part is irrelevant...

EmDevys commented 3 years ago

By the way, I already pointed out that documenting the rangeType in CIS needs some rules and illustration by some valid examples. I submitted a proposal for a Discussion Paper in 2017, which I presented to Coverage.DWG, but no solid action followed (see https://portal.ogc.org/files/?artifact_id=72894&version=1).

EmDevys commented 3 years ago

And the Coverage rangeType should document such information on the coverage data (for all types of coverages including grids/raster/images) in a harmonised way. Ready to contribute to any action on this.

jerstlouis commented 3 years ago

@EmDevys Please join us next Wednesday morning call 9:00 AM Eastern if you can! I hope we can focus on this, I think it is critical. I suggest we develop clear examples, and perhaps the Coverages API could standardize this, if not CIS 1.2.

EmDevys commented 3 years ago

I’ll try to arrange to participate, and review this into more details.

alexrobin commented 3 years ago

@jerstlouis In SWE standards, we don't usually use data records to provide encoding level details (bits per sample, etc.). Instead, this information is provided separately (in a DataStream or DataArray encoding section, see BinaryEncoding for example).

However, I see no issue defining an extension to be embedded inside each SWE Common field to provide such info within the record itself. In that case, one possibility would be to create a JSON object that would contain all encoding information in one place, not only the data type.

For instance, going back to our earlier example, it could be something like:

{
    "name": "band1",
    "type": "Quantity",
    "id": "examples_CIS_RT_F_10_2D",
    "definition": "http://someURI/LANDSAT/Band7",
    "encodingInfo": {
        "dataType": "http://www.opengis.net/def/dataType/OGC/0/unsignedInt",
        "significantBits": 12,
        "scale": 0.000244141,
        "offset": 0
    }
}

Note that dataType and significantBits are properties that are already defined in SWE Common, as part of the BinaryEncoding object I mentioned above. They are used to mean exactly the same thing, although they are used in a different place in the schema. But at least, calling them the same here would keep things semantically consistent.
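
How a client might consume such an `encodingInfo` object is sketched below. The linear convention `physical = raw * scale + offset` is an assumption of this sketch (it is the usual convention in formats with scale and offset attributes, but nothing in this thread fixes it), and `to_physical` is a hypothetical helper name.

```python
def to_physical(raw_values, field):
    """Map raw stored values to physical values: raw * scale + offset (assumed)."""
    info = field.get("encodingInfo", {})
    scale = info.get("scale", 1)
    offset = info.get("offset", 0)
    return [raw * scale + offset for raw in raw_values]


band1 = {
    "name": "band1",
    "type": "Quantity",
    "definition": "http://someURI/LANDSAT/Band7",
    "encodingInfo": {
        "dataType": "http://www.opengis.net/def/dataType/OGC/0/unsignedInt",
        "significantBits": 12,
        "scale": 0.000244141,
        "offset": 0,
    },
}

# 12 significant bits: raw values range from 0 to 4095.
print(to_physical([0, 2048, 4095], band1))
```

With this scale the 12-bit range 0..4095 maps to roughly 0..1, which is what a styling client would need in order to treat the values consistently across encodings.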

alexrobin commented 3 years ago

We could even offer the possibility to provide the scale as a ratio (e.g. "1/4096") if we want to keep the maximum precision.
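
The precision argument can be shown with Python's `fractions` module. In this sketch, `parse_scale` is a hypothetical helper that accepts either a number or a ratio string; the ratio syntax "1/4096" is the one proposed here, not an existing SWE Common feature.

```python
from fractions import Fraction


def parse_scale(scale):
    """Accept a number or a ratio string such as "1/4096"; return an exact Fraction."""
    # Going through str() keeps the decimal digits as written in the JSON document.
    return Fraction(str(scale))


exact = parse_scale("1/4096")
rounded = parse_scale(0.000244141)

# The ratio round-trips exactly; the rounded decimal does not.
print(4096 * exact == 1)
print(4096 * rounded == 1)
```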

EmDevys commented 3 years ago

I must confess that I find it hard to settle on which semantics should be documented in the rangeType for Coverages, which is a harmonized model for various types of coverage data (raster / grids, but not only), various types of encoding standards, and ultimately various types of values (integers, signed or not, float, double, complex) ... The semantics of the rangeType would benefit from clarification, as would the Coverage model and schemas, as identified in this issue.

My impression is that binary encoding is the scope of the encoding standard (as reminded previously in this discussion), and may be unnecessary in the Coverages API (except if restricted intervals of values apply, which is a usage constraint). The rangeType should document the semantics for the Coverage range, and Coverage values usage, at minimum. For example elevation, radiometry, codelists for thematic imagery ... Should the rangeType go beyond (at the byte or bit level) this is to be discussed / confirmed. By the way, the DP document on Coverage rangeType semantics referenced above (https://portal.ogc.org/files/?artifact_id=72894&version=1) was applying to CIS 1.0 / GMLCOV, but is still valid on CIS 1.1 and its JSON encoding.

Last point, in addition to coverage data, users may get following information, sometimes overlapping (e.g. in INSPIRE with data such as orthoimagery or elevation), and therefore with some redundant information:

encoding format metadata (e.g in TIFF/ GeoTIFF or netCDF or JPEG 2000/GMLJP2 or PNG or JPEG - last 2 not geo-enabled)
coverage rangeType, under SWE Common data record
additional metadata (source, quality, content description ...), handled under 19115 set of standards, for discovery and cataloging.

Presumably the Coverages API should be able to handle such information (as available), but I am not sure the Coverage model and corresponding XML or JSON schemas should handle all this. But probably in a similar way as the WCS up to now. Just some thoughts ... I apologize, I am not in a position to contribute to this work other than reviewing and commenting when the proposed draft specification is submitted.

pebau commented 3 years ago

@EmDevys you are making a very valid point indeed - great that you remind us of the big picture from time to time!

Having units of measure in the range type - we likely agree - is essential. UCUM has been picked not because of its beauty but...well, it is a standard. Injecting all sorts of extra mechanics like styling, transformation into other value systems, etc. establishes some (increasingly complex) microsyntax which only a minority of tools is likely to implement.

Further, a general mechanism requires very intricate scrutiny about consequences in all possible coverage applications. What about defining metadata records instead that go into the coverage metadata slot? INSPIRE is doing that already, and there is place for more as long as we define own namespaces and manage metadata root elements somewhere (lightweight: on the DWG wiki, heavyweight: OGC-NA). That would allow us to establish building blocks capturing format and application specifics while still going together without friction. Ideally we would offer encodings for each in XML, JSON, and RDF.

Just some wild pre-Xmas idea.

KathiSchleidt commented 3 years ago

@pebau the trick we did for INSPIRE by shifting their excess baggage to the metadata slot (they'd extended RectifiedGridCoverage by derivation, doesn't work with WCS) should be discussed a bit more if we want to do this correctly, as to my view this type of detailed information on the content of the range would actually be more suited to rangeType (still learning Coverage models by doing ;) )

Please correct me if I'm wrong, but the metadata pertains to the entire range, while the rangeType can reference individual components. For INSPIRE I'm not that worried providing info in metadata as will probably be only one range component, but the moment multiple range components are provided, it becomes unclear what the info in the metadata describes, thus rangeType is more suited.

I do believe that some clean valid examples would be very valuable. In this context I'd also like to explore the potential of utilizing the new O&M V3 ObservationCharacteristics for enhancing the swe:DataRecord (to my reading, this could be added to the extension of the SWE Types provided in the swe:field). While I don't work with classical coverage topics, I could provide a CIS 1.1 encoded O&M Depth profile to the mix.

pebau commented 3 years ago

@KathiSchleidt

Referring to individual bands in the metadata is no problem, it just needs to be defined properly in the schema. There is precedent with the spatial extension where in a metadata slot you also can indicate regions (in space and time!) to which the particular metadata record applies. EO-WCS uses even polygons to isolate contributing footprints.

Changing the range type per se smells like CIS 2.0 so has a high probability of making existing coverages incompatible - not sure this high price is worth it if alternatives exist.

If however you find a way of circumventing a CIS 2.0 I'd sleep way better - so I'm excited about your vision of a solution that both SWE and CIS enjoy!

Schpidi commented 3 years ago

Coverages SWG call:

We need to fix and potentially extend the existing examples and want to add further ones based on the comments above.

Current list of examples:

[ ] 00_metadata.json
[ ] 05_2D_index.json
[ ] 10_2D_regular.json
[ ] 11_2D_regular_fileref.json
[ ] 12_2D_regular_fileref_multiband.json
[ ] 15_with-envelope.json
[ ] 20_3D_height.json
[ ] 25_3D_time.json
[ ] 30_4D_height+time.json
[ ] 40_1D_regular.json
[ ] 45_2D_distorted.json
[ ] 46_irregular+distorted.json
[ ] 50_3D_partitioned-1.json
[ ] 55_1D_timeseries-partitioned.json
[ ] 60_3D_timeseries-multipart.json
[ ] 65_1D_timeseries-interleaved.json
[ ] 70_2D_interpolation.json
[ ] 80_sensormodel.json
[ ] 90_point-cloud.json

Proposed additions:

[ ] Classification
[ ] DEM (how to handle both float & int encoding for the same coverage)
[ ] RGB
[ ] RGBA
[ ] Multispectral & Panchromatic, i.e., bands with different resolution
[ ] Hyperspectral

pebau commented 3 years ago

Let me suggest to not just add the examples, but comment them and make clear what is normative, what is convention, what is experimental, etc. - the Coverages.DWG might be a good place to host such background information.

jerstlouis commented 3 years ago

At the 2020/12/16 SWG meeting, I suggested that the range type should provide semantic information about the range set values and how to interpret them as real-world values, but not encoding details which may be specific to only some encodings, such as binary data types and type sizes.

It was also suggested that this encoding-specific information could be provided in an HTTP response header for formats which cannot describe such things themselves (e.g. this would allow mapping real floating-point values to 16-bit PNG, which does not allow specifying any kind of linear mapping).
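
To make the 16-bit PNG case concrete, here is a minimal sketch of the kind of linear mapping meant above. It is only an illustration: the function names, the value range and the choice of 65535 steps are assumptions, and how the two numbers would actually be conveyed (response header or otherwise) was not settled in the meeting.

```python
# Sketch only: linear mapping between real values and unsigned 16-bit samples.
# All names and the example value range are hypothetical.

MAX_UINT16 = 65535


def linear_params(vmin, vmax):
    """Return (scale, offset) such that real = offset + scale * stored."""
    scale = (vmax - vmin) / MAX_UINT16
    return scale, vmin


def encode(value, scale, offset):
    """Map a real value to the nearest 16-bit sample, clamped to the valid range."""
    stored = round((value - offset) / scale)
    return max(0, min(MAX_UINT16, stored))


def decode(stored, scale, offset):
    """Map a 16-bit sample back to an (approximate) real value."""
    return offset + scale * stored


# Example: elevations assumed to lie between -500 m and 9000 m.
scale, offset = linear_params(-500.0, 9000.0)
sample = encode(1234.5, scale, offset)
restored = decode(sample, scale, offset)
```

With this scheme the round-trip error is at most half a step (scale / 2), so the client needs only the scale and offset in addition to the range type to recover real-world values.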

KathiSchleidt commented 3 years ago

hmm... if rangeType is settled to that point, it may be worth revisiting allowing the provision of the new ObservationCharacteristics type from O&M V3 as an alternative rangeType description - would make for far easier crosswalks between representation types. Thoughts?

KathiSchleidt commented 3 years ago

@jerstlouis just scanned the examples, rangeType still seems overly generic, e.g.

"uom": {
        "type": "UnitReference",
        "id": "examples:CIS_RT_F_UOM__30_4D",
        "code": "10^0"
}

I think the example would be easier to understand with a QuantityType based on a real quantity
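
As a rough illustration of what I mean, here is a sketch that builds such a field and prints it as JSON. The "UnitReference" and "QuantityType" names are taken from this thread; the "DataRecordType" wrapper, the "name" and "label" keys, and the definition URI (an example.org placeholder) are assumptions for illustration, not checked against the schema. "Cel" is the UCUM code for degree Celsius.

```python
import json


def quantity_field(name, definition, uom_code, label=None):
    """Build one range type field describing a real quantity (illustrative structure)."""
    field = {
        "type": "QuantityType",
        "name": name,
        # What the value represents, not its data type.
        "definition": definition,
        "uom": {"type": "UnitReference", "code": uom_code},
    }
    if label is not None:
        field["label"] = label
    return field


range_type = {
    "type": "DataRecordType",  # assumed wrapper type name
    "field": [
        quantity_field(
            name="temperature",
            definition="https://example.org/def/air_temperature",  # hypothetical URI
            uom_code="Cel",
            label="Air temperature",
        )
    ],
}

print(json.dumps(range_type, indent=2))
```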

pebau commented 3 years ago

@KathiSchleidt hm, maybe this is an instruction to humans: RT_F_UOM = read the... ;-)

jerstlouis commented 3 years ago

@KathiSchleidt That particular example has not been tackled yet. It's also still work in progress, but take a look at the ones directly in https://github.com/jerstlouis/coverage-implementation-schema/tree/main/standard/schemas/1.1/json/examples/generalGrid It's also still quite generic, given that we don't have very specific examples yet. There might be value in separately providing more specific examples.

alexrobin commented 3 years ago

FYI I just uploaded the draft JSON schema for SWE Common 2.0 in the O&M SWG repository: https://raw.githubusercontent.com/opengeospatial/om-swg/json_schemas/json/swe-common_v20_schema_draft.json

Thought it could be useful for you to validate that the coverage "rangeType" examples are correct.

pebau commented 3 years ago

@joanma747 can you please have a look, too, being the original writer of the CIS 1.1 JSON schema? You certainly are most qualified. Thanks!

KathiSchleidt commented 2 years ago

I've recently been chewing on RangeType encoding for elevation data in INSPIRE. The eternal example of a quantity definition, http://www.opengis.net/def/dataType/OGC/0/float32 with a UoM code of 10^0, is not really satisfactory. The information we really want to provide is:

vertical CRS: in this case EPSG::9390, name "EVRF2019 mean-tide height"
the UoM: in this case meter

The current approach is as follows:

<gmlcov:rangeType>
    <swe:DataRecord>
        <swe:field name="Elevation">
            <swe:Quantity definition="EPSG::9390" axisID="Z">
                <swe:label>Elevation</swe:label>
                <swe:description>EVRF2019 mean-tide height - EPSG:9390</swe:description>
                <swe:nilValues>
                    <swe:NilValues>
                        <swe:nilValue reason="">1000000.0</swe:nilValue>
                    </swe:NilValues>
                </swe:nilValues>
                <swe:uom code="m"/>
            </swe:Quantity>
        </swe:field>
    </swe:DataRecord>
</gmlcov:rangeType>

Does this make sense, or does somebody see an issue?
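
For anyone who wants to check what a client would get out of this, here is a small sketch that reads the fragment with Python's standard library and lists the definition, UoM and nil values per field. The namespace declarations are added because the fragment above omits them (the URIs are the usual GMLCOV 1.0 and SWE Common 2.0 namespaces); the function name and the output structure are just assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmlcov": "http://www.opengis.net/gmlcov/1.0",
    "swe": "http://www.opengis.net/swe/2.0",
}

# The fragment from above, with namespace declarations added so it parses on its own.
RANGE_TYPE_XML = """
<gmlcov:rangeType xmlns:gmlcov="http://www.opengis.net/gmlcov/1.0"
                  xmlns:swe="http://www.opengis.net/swe/2.0">
    <swe:DataRecord>
        <swe:field name="Elevation">
            <swe:Quantity definition="EPSG::9390" axisID="Z">
                <swe:label>Elevation</swe:label>
                <swe:description>EVRF2019 mean-tide height - EPSG:9390</swe:description>
                <swe:nilValues>
                    <swe:NilValues>
                        <swe:nilValue reason="">1000000.0</swe:nilValue>
                    </swe:NilValues>
                </swe:nilValues>
                <swe:uom code="m"/>
            </swe:Quantity>
        </swe:field>
    </swe:DataRecord>
</gmlcov:rangeType>
"""


def summarize_range_type(xml_text):
    """Return one dict per swe:field holding a swe:Quantity."""
    root = ET.fromstring(xml_text)
    fields = []
    for field in root.findall("swe:DataRecord/swe:field", NS):
        quantity = field.find("swe:Quantity", NS)
        if quantity is None:
            continue  # this sketch only handles Quantity fields
        uom = quantity.find("swe:uom", NS)
        nil_values = [
            {"value": float(nil.text), "reason": nil.get("reason")}
            for nil in quantity.findall("swe:nilValues/swe:NilValues/swe:nilValue", NS)
        ]
        fields.append({
            "name": field.get("name"),
            "definition": quantity.get("definition"),
            "uom": uom.get("code") if uom is not None else None,
            "nil_values": nil_values,
            # An empty reason tells the client nothing about why the value is nil.
            "has_empty_nil_reason": any(not n["reason"] for n in nil_values),
        })
    return fields


fields = summarize_range_type(RANGE_TYPE_XML)
```

Run on the fragment, this yields a single "Elevation" field with definition "EPSG::9390", UoM "m" and one nil value, and it flags the empty reason attribute, which might be worth filling in.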

opengeospatial / coverage-implementation-schema

RangeType / SWE Common Confusion #1