SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

SHACL: GeoNamesRestriction-Shape Is Not Fulfilled By Using GeoNames-URIs #218

init-dcat-ap-de closed 4 weeks ago

init-dcat-ap-de commented 2 years ago

The GeoNamesRestriction shape is not fulfilled when using GeoNames URIs:

    a sh:NodeShape ;
    rdfs:comment "Geo names restriction" ;
    rdfs:label "Geo names restriction" ;
    sh:property [
        sh:hasValue <http://sws.geonames.org> ;
        sh:minCount 1 ;
        sh:nodeKind sh:IRI ;
        sh:path skos:inScheme
    ] 

If I use, for example, <https://sws.geonames.org/2921044/>, the RDF does not include a skos:inScheme statement for it, so the shape is not satisfied.

Possible solution: only test that the IRI starts with "https://sws.geonames.org/". (This could apply to the IANA lists as well.)
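
To make the proposal concrete, here is a minimal sketch of what such a prefix test amounts to, written in Python. The helper name `has_geonames_prefix` is hypothetical and not part of DCAT-AP or any SHACL tooling; it only illustrates the purely textual nature of the check.

```python
# Prefix taken from the proposal above.
GEONAMES_PREFIX = "https://sws.geonames.org/"


def has_geonames_prefix(iri: str) -> bool:
    """Return True if the IRI starts with the GeoNames prefix.

    This is a purely textual check: it does not verify that the
    identifier actually exists in GeoNames.
    """
    return iri.startswith(GEONAMES_PREFIX)


print(has_geonames_prefix("https://sws.geonames.org/2921044/"))  # True
print(has_geonames_prefix("http://sws.geonames.org/2921044/"))   # False (http, not https)
```

Note that the shape quoted above uses an http:// scheme URI while the example identifier uses https://, so a textual test has to pick one scheme explicitly.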

bertvannuffelen commented 2 years ago

I understand the approach. I am personally reluctant to base validation on string processing, but I agree that the current rule is not much of a validation rule.

bertvannuffelen commented 1 year ago

I did additional checks on data.europa.eu and I discovered the use of

Regarding the first: HTTP URIs may have been published in the past, but today only HTTPS is used. Regarding the second: this is the usual challenge of ensuring that the proper identifiers are used rather than the browser URLs.

I also looked at the GeoNames ontology, which indeed does not define the GeoNames instances as skos:Concepts. And it is hard to do a full download of all URIs before validation.

So maybe for these cases a "textual" test could be appropriate.

bertvannuffelen commented 4 weeks ago

A test on the prefix of the URIs has now been added to mdr-vocabularies.shape.ttl:

    :GeoNamesRestrictionRegexURI
        rdfs:comment "Geonames restriction - base itself on URI structure" ;
        rdfs:label "Geonames restriction" ;
        a sh:NodeShape ;
        sh:pattern "^https://sws.geonames.org" .
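
As a rough illustration of which IRIs this pattern accepts, the sketch below runs the same regular expression in Python. SHACL's sh:pattern takes SPARQL REGEX patterns, which Python's `re.search` only approximates, so treat this as an approximation for this simple case. The candidate IRIs and the `STRICT_PATTERN` variant (escaped dots, trailing slash) are assumptions added for comparison; the stricter pattern is not what the repository contains.

```python
import re

# Pattern exactly as written in the shape above.
SHAPE_PATTERN = re.compile(r"^https://sws.geonames.org")

# Hypothetical stricter variant (NOT in the repository):
# dots are escaped and the trailing slash is required.
STRICT_PATTERN = re.compile(r"^https://sws\.geonames\.org/")

# Illustrative candidates; only the first appears in this thread.
candidates = [
    "https://sws.geonames.org/2921044/",      # identifier from the issue
    "http://sws.geonames.org/2921044/",       # http instead of https
    "https://www.geonames.org/2921044/",      # assumed browser-style URL
    "https://sws.geonames.org.example.com/",  # different host, same prefix
]

for iri in candidates:
    shape_ok = bool(SHAPE_PATTERN.search(iri))
    strict_ok = bool(STRICT_PATTERN.search(iri))
    print(f"{iri}  shape={shape_ok}  strict={strict_ok}")
```

Under these assumptions, the pattern from the shape rejects the http and www variants but accepts the last candidate, because the prefix is not terminated by a slash and an unescaped dot matches any character. Whether that matters in practice depends on how much the rule is meant to guard against.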