agiorguk / gemini-schematron

The Schematron files to support GEMINI 2.3 validation

2020-21 Validating spatial reference systems that are not provided by Anchor (Jo Cook, July 2020) #15

Open PeterParslow opened 3 years ago

PeterParslow commented 3 years ago

It would appear that the Schematron is very limited when it comes to checking SRSs that are not provided using an anchor. In the attached screenshot, none of the provided CRSs actually validate against the schema, even the ones related to EPSG 4258.

This seems contrary to the guidance?

PeterParslow commented 3 years ago

JP: Note this is the supplementary Schematron, so it gives information and recommendations. It is intended as "how might I make my metadata better" advice, rather than "this needs fixing" messages.

JP: Also note (not checked) that ETRS89 by itself is not the recognised short name for EPSG:4258; according to INSPIRE it should be ETRS89-GRS80.

The metadata Annex D.4 table comes from the INSPIRE Data Specification on Coordinate Reference Systems, version 3.2, Table 1 in section 5.5. It has a requirement, TG Requirement 1: “The identifiers listed in Table 1 shall be used for referring to the coordinate reference systems used in a data set.”

For ETRS89-GRS80 (EPSG::4258) the identifier is http://www.opengis.net/def/crs/EPSG/0/4258

The full list of CRS identifiers checked is found online at: https://agi.org.uk/images/xslt/d4.xml

The Schematron rule 4b checks if the value of the character string matches any URI in the list. Rule 4a checks if the value of the xlink:href attribute matches a URI in the list. This isn’t a mistake; it’s to allow users to provide identifiers in the character string, but of course the recommended way now would be to use gmx:Anchor elements.
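
The split between the two rules can be illustrated with a minimal Python sketch. This is not the Schematron itself: the function name `check_crs_code`, the result labels and the three-entry `DEFAULT_CRS_URIS` set are assumptions made for illustration (the list actually checked is the one in d4.xml).

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces used by the gmd:code snippets in this thread.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Assumption: a small illustrative subset, not the full list from d4.xml.
DEFAULT_CRS_URIS = {
    "http://www.opengis.net/def/crs/EPSG/0/4258",
    "http://www.opengis.net/def/crs/EPSG/0/4936",
    "http://www.opengis.net/def/crs/EPSG/0/3035",
}


def check_crs_code(code_xml: str) -> str:
    """Mimic the 4a/4b split for a single gmd:code element (hypothetical helper)."""
    code = ET.fromstring(code_xml)
    anchor = code.find("gmx:Anchor", NS)
    if anchor is not None:
        # "4a"-style check: compare the xlink:href attribute with the list.
        href = anchor.get(XLINK_HREF, "")
        return "4a: match" if href in DEFAULT_CRS_URIS else "4a: no match"
    # "4b"-style check: compare the character string content with the list.
    text = (code.findtext("gco:CharacterString", default="", namespaces=NS) or "").strip()
    return "4b: match" if text in DEFAULT_CRS_URIS else "4b: no match"


# Example: a URN in a CharacterString falls under the 4b-style check.
sample = (
    '<gmd:code xmlns:gmd="http://www.isotc211.org/2005/gmd" '
    'xmlns:gco="http://www.isotc211.org/2005/gco">'
    "<gco:CharacterString>urn:ogc:def:crs:EPSG::4258</gco:CharacterString>"
    "</gmd:code>"
)
print(check_crs_code(sample))  # prints "4b: no match"
```

In this sketch the URN form never matches, because only HTTP URIs appear in the list; that is the behaviour reported in the screenshot.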

(see attached screenshot: ScottishSDI Screenshot)

Note: Jo’s context is work to enable GEMINI 2.3 on the Scottish SDI

JP: I don’t think the rule firing is incorrect, but perhaps the message should be changed to say something like: “The identifier used for this CRS is not from the INSPIRE list of default identifiers. According to the INSPIRE Data Specification on Coordinate Reference Systems, identifiers used to describe data sets shall be HTTP-URIs if the CRS is listed in Table 1 (applies to EPSG::4936, EPSG::4937, EPSG::4258, EPSG::3035, EPSG::3034, EPSG::3038, EPSG::3039, EPSG::3040, EPSG::3041, EPSG::3042, EPSG::3043, EPSG::3044, EPSG::3045, EPSG::3046, EPSG::3047, EPSG::3048, EPSG::3049, EPSG::3050, EPSG::3051, EPSG::5730, EPSG::5861, EPSG::5715, EPSG::7409, and http://codes.wmo.int/grib2/codeflag/4.2/_0-3-3 ).”

PeterParslow commented 3 years ago

Peter: It’s rule 4b that’s firing, the one which checks gmd:code/gco:CharacterString, not the gmx:Anchor (rule 4a). This now looks to me like a contradiction in the INSPIRE TG document between Requirement 2.2 and Example 3.13 immediately below it. The Schematron rule 4a checks the requirement that says the HTTP URI identifier shall be used as the value of the code element. Your screenshot suggests you have populated it with a URN identifier; that may be more commonly used, but it’s not mentioned in the INSPIRE TG.

PeterParslow commented 3 years ago

JP: I don’t think it’s a contradiction. It doesn’t match the example, but it would still be valid to have:

<gmd:code>
    <gco:CharacterString>http://www.opengis.net/def/crs/EPSG/0/4258</gco:CharacterString>
</gmd:code>

Recommendation 2.2 tells us (only that) A gmx:Anchor element should be used, whose xlink:href attribute refers to a URI that provides further information about the spatial reference systems using geographic identifiers.

PeterParslow commented 3 years ago

Perhaps we should report it to INSPIRE?

PeterParslow commented 3 years ago

INSPIRE are quite clear that you shouldn’t be using the URN form of identifiers for the default CRS identifiers (though one of the BGS example records uses them), so we shouldn’t have:

<gmd:code>
  <gco:CharacterString>urn:ogc:def:crs:EPSG::4258</gco:CharacterString>
</gmd:code>

But the following should be valid if you wanted to reference the URN as well:

<gmd:code>
  <gmx:Anchor 
    xlink:href="http://www.opengis.net/def/crs/EPSG/0/4258" 
    xlink:title="ETRS89-GRS80">urn:ogc:def:crs:EPSG::4258
  </gmx:Anchor>
</gmd:code>

As indeed might:

<gmd:code>
  <gmx:Anchor 
    xlink:href="http://www.epsg-registry.org/export.htm?wkt=urn:ogc:def:crs:EPSG::4258" 
    xlink:title="ETRS89-GRS80">http://www.opengis.net/def/crs/EPSG/0/4258
  </gmx:Anchor>
</gmd:code>
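
As a rough illustration of moving from the URN-only encoding to the first Anchor encoding above, here is a minimal Python sketch. The helper names `urn_to_http_uri` and `anchor_code_element` are hypothetical, and the URN-to-URI mapping is an assumption inferred from the pair of EPSG:4258 identifiers quoted in this thread; it only handles unversioned EPSG URNs and does not check that the result is in the INSPIRE default list.

```python
import re
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GMX = "http://www.isotc211.org/2005/gmx"
XLINK = "http://www.w3.org/1999/xlink"

# Assumption: only unversioned EPSG URNs such as urn:ogc:def:crs:EPSG::4258.
URN_PATTERN = re.compile(r"^urn:ogc:def:crs:EPSG::(\d+)$")


def urn_to_http_uri(urn: str) -> str:
    """Map an unversioned EPSG URN to the opengis.net HTTP URI form."""
    match = URN_PATTERN.match(urn.strip())
    if match is None:
        raise ValueError(f"not an unversioned EPSG CRS URN: {urn!r}")
    return f"http://www.opengis.net/def/crs/EPSG/0/{match.group(1)}"


def anchor_code_element(urn: str, title: str) -> ET.Element:
    """Build gmd:code with a gmx:Anchor whose href is the HTTP URI and whose text is the URN."""
    code = ET.Element(f"{{{GMD}}}code")
    anchor = ET.SubElement(
        code,
        f"{{{GMX}}}Anchor",
        {
            f"{{{XLINK}}}href": urn_to_http_uri(urn),
            f"{{{XLINK}}}title": title,
        },
    )
    anchor.text = urn.strip()
    return code


element = anchor_code_element("urn:ogc:def:crs:EPSG::4258", "ETRS89-GRS80")
print(ET.tostring(element, encoding="unicode"))
```

The serialised output uses generated namespace prefixes unless they are registered with `ET.register_namespace`, so it will not look character-for-character like the snippets above even though the element and attribute names are the same.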

Perhaps we should change the guidance?

PeterParslow commented 3 years ago

First answer, which I now think is wrong: here I think GEMINI needs to be more explicit. INSPIRE TG Requirement 2.2 (metadata/2.0/req/isdss/crs-id) states: “If the coordinate reference system is listed in the table Default Coordinate Reference System Identifiers in Annex D.4, the value of the HTTP URI Identifier column shall be used as the value of gmd:referenceSystemInfo/gmd:MD_ReferenceSystem/gmd:referenceSystemIdentifier/gmd:RS_Identifier/gmd:code element. The gmd:codeSpace element shall not be used in this case.”

PeterParslow commented 3 years ago

JP: It would be impossible(?) to write a validation rule that passes or fails a service record depending on whether an identifier in a list is used, when it is valid (if rare?) to have an identifier outside of the list. There is also the issue that the default identifier may change. D.4 tells us: “In the case that this set of identifiers should be changed or corrected in the later versions of the INSPIRE Data Specification on Coordinate Reference Systems, the changed version of the identifier set should be preferred over the one provided here.”

PeterParslow commented 3 years ago

So to stick with the principle that ‘a GEMINI record shall be an INSPIRE record’ (which I take to mean ‘shall conform to the INSPIRE Metadata TG’), GEMINI should say the same, i.e. we should change the Encoding Guidelines slightly, so that if the SRS is an INSPIRE default, gmx:Anchor shall be used. If it’s not an INSPIRE default, then I support still saying gmx:Anchor should be used.
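
One possible reading of this proposal, written as a minimal Python sketch: the function name `anchor_finding`, the severity labels and the two-entry `DEFAULT_CRS_URIS` set are assumptions for illustration, not part of the GEMINI Schematron. The sketch can only recognise an INSPIRE default CRS when its identifier is the listed HTTP URI, which is the limitation raised earlier in this thread.

```python
# Assumption: illustrative subset of the INSPIRE default identifiers.
DEFAULT_CRS_URIS = frozenset({
    "http://www.opengis.net/def/crs/EPSG/0/4258",
    "http://www.opengis.net/def/crs/EPSG/0/3035",
})


def anchor_finding(identifier, uses_anchor, default_uris=DEFAULT_CRS_URIS):
    """Classify one CRS identifier under the proposed 'shall'/'should' wording.

    identifier  -- the HTTP URI given for the CRS (href or character string)
    uses_anchor -- True if the identifier was encoded with gmx:Anchor
    """
    if uses_anchor:
        return "ok"
    if identifier.strip() in default_uris:
        # INSPIRE default without gmx:Anchor: the proposed 'shall' is broken.
        return "error"
    # Not a listed default: gmx:Anchor is only recommended ('should').
    return "warning"


print(anchor_finding("http://www.opengis.net/def/crs/EPSG/0/4258", uses_anchor=False))
print(anchor_finding("http://www.opengis.net/def/crs/EPSG/0/27700", uses_anchor=False))
```

Here the first call reports an error-level finding and the second a warning-level one; whether the "error" branch belongs in the main or the supplemental Schematron is exactly what the discussion is about.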

PeterParslow commented 3 years ago

JP: It makes GEMINI more strict than INSPIRE, which only recommends (2.2) use of gmx:Anchor for CRS identifiers. We could add a rule to the supplemental Schematron to flag deviation from the recommendation.