RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0
241 stars 63 forks source link

SPARQL ASK rule incorrectly passing #205

Closed rob-metalinkage closed 9 months ago

rob-metalinkage commented 9 months ago

AFAICT the following rule:

(https://raw.githubusercontent.com/opengeospatial/ogcapi-sosa/master/_sources/features/observationCollection/rules.shacl)

@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:      <http://www.w3.org/ns/shacl#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix geojson: <https://purl.org/geojson/vocab#> .
@prefix sosa:    <http://www.w3.org/ns/sosa/> .

@base <http://example.com/rules> .

<#testRequiredInCollectionOrMember>
    a                   sh:NodeShape ;
    sh:targetSubjectsOf sosa:hasResult, sosa:hasSimpleResult ;
    sh:name             "Has Observed Property" ;
    sh:message          "Observations must either declare observed property or be a member of a collection hierarchy that declares it" ;
    sh:ask              """
        PREFIX dct: <http://purl.org/dc/terms/>
        PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
        prefix sosa:    <http://www.w3.org/ns/sosa/>
        {
            { ?this sosa:observedProperty ?prop }
               UNION
            {
              ?coll sosa:observedProperty ?prop .
              ?coll sosa:hasMember ?this
            }
        }

""" ; .

seems to pass in PySHACL (even though no property sosa:observedProperty exists in either object or the collection )

(this fails correctly on TopQuadrant EDG) on this data:

https://raw.githubusercontent.com/opengeospatial/ogcapi-sosa/master/_sources/features/observationCollection/tests/observationCollection-props-fail.ttl

rob-metalinkage commented 9 months ago

Note that the alternative form of this rule I tried also fails in PySHACL and passes in EDG.

https://raw.githubusercontent.com/opengeospatial/ogcapi-sosa/master/_sources/features/observationCollection/rules-inversepaths.shacl

If this is a different error then can submit a new issue.

If error is confirmed can submit a test case.

ashleysommer commented 9 months ago

Thanks @rob-metalinkage, I'm looking into this now.

ashleysommer commented 9 months ago

Okay, @rob-metalinkage two different problems I found here while testing this case:

1) The data graph is incomplete. It appears to be a JSON-LD graph using GEOSJON-LD, but there is no geojson namespace in the @context mapping. Further, there is no sosa namespace in the @context mapping, making the whole graph completely unidentifiable. I don't know if there is some kind of json-ld context innoculation in that test suite that I am missing, but in order to get the graph to work I needed to add all the missing contexts:

{
   "@context": {
    "@version": 1.1,
    "geojson": "https://purl.org/geojson/vocab#",
    "Feature": "geojson:Feature",
    "FeatureCollection": "geojson:FeatureCollection",
    "GeometryCollection": "geojson:GeometryCollection",
    "LineString": "geojson:LineString",
    "MultiLineString": "geojson:MultiLineString",
    "MultiPoint": "geojson:MultiPoint",
    "MultiPolygon": "geojson:MultiPolygon",
    "Point": "geojson:Point",
    "Polygon": "geojson:Polygon",
    "features": {
      "@container": "@set",
      "@id": "geojson:features"
    },
    "geometry": "geojson:geometry",
    "id": "@id",
    "properties": "geojson:properties",
    "type": "@type",
    "sosa": "http://www.w3.org/ns/sosa/",
    "Observation": "sosa:Observation",
    "Sample": "sosa:Sample",
    "hasFeatureOfInterest": "sosa:hasFeatureOfInterest",
    "hasResult": "sosa:hasResult",
    "hasSimpleResult": "sosa:hasSimpleResult",
    "observedProperty": "sosa:observedProperty"
  },
  "@id": "c1",
  "type": "FeatureCollection",
  "featureType": "sosa:ObservationCollection",
  "properties": {
    "resultTime": "1999"
  },
  "features": [
    {
      "@id": "pop1999",
      "type": "Feature",
      "geometry": null,
      "properties": {
        "comment": "Simple result case",
        "hasFeatureOfInterest": "https://demo.pygeoapi.io/master/collections/utah_city_locations/items/Spanish%20Fork",
        "hasSimpleResult": 15555.0
      }
    }
  ]
}

2) The Shapes Graph is also malformed. The sh:ask rule is used directly on the NodeShape, that is not a supported pattern in the SHACL spec. sh:ask can only be used in SPARQL Constraint-components, they are an advanced use case where you define your own SHACL Constraint Components, and define Validator implementations that can be used to action your Constraint Components, where the validator can use either sh:ask or sh:select to perform the lookup.

It appears the author of this shape has wanting to implement something similar to a simple SPARQL Constraint, but they need to use sh:sparql for that, and it only supports sh:select on it.

Please take a look at examples from the W3C Spec document for examples of simple SPARQL Constraint and the alternative SPARQL Constraint-components to see further details of where the problems lie.

Note, running the SHACL Shape Graph through the meta-shacl SHACL-SHACL validator doesn't throw any issues, I think this is because SHACL-SHACL doesn't go as far as SHACL-SPARQL nodes and constraints.

ashleysommer commented 9 months ago

I took a look at the alternate implementation. Again, two issues found:

1) There is a formatting issue in the Shape Graph, the sh:message property is accidentally duplicated without a turtle punctuation in between. This breaks the RDFLib Turtle Parser.

2) After fixing the above, the Shape passes (fails to throw a validation report) in PySHACL because of the incorrect use of the sh:or constraint. The sh:or constraint will pass if either arm (either PropertyShape) passes. In this case, the PropertyShape looking at observedProperty fails as it should, but the second PropertyShape passes. The second PropertyShape looks at (inverse) path sosa:hasMember of which there are no nodes in the graph, so that Shape does not apply to any nodes, there is no minCount constraint on this property, so because there are no focus nodes to check it does not throw any reports, this is the same as a Pass in SHACL world. So the second arm of the sh:or passes, causing the sh:or constraint to pass validation, so the Data Graph is valid.

rob-metalinkage commented 9 months ago

Ok, wrong link to resource, should have been the TTL not json ld, but the issue is ask not supported and no suitable error thrown..

Will rebuild with a select but suggest you fix that behaviour.

Will double checknif Pyshacl behaves differently to TQ on no nodes matching on sh:or

ashleysommer commented 9 months ago

issue is ask not supported and no suitable error thrown suggest you fix that behaviour.

PySHACL is operating in accordance with the W3C spec. There is no error occuring here, no error message to be shown. There is nothing to fix. What I explained in the previous post is not a limitation of PySHACL, please read the sections of the spec that I linked above, see the examples given for how to properly use sh:sparql, sh:select, and sh:ask.

rob-metalinkage commented 9 months ago

OK thanks - I see that indeed ASK is only supported in custom validators. The spec is rather opaque - there is no obvious example of how a constraint component is attached to a nodeshape.

OTOH https://www.w3.org/TR/shacl/#sparql-constraints-example seems to point to an example with a sh:sparql Constraint - which is what I tried (see below) before trying using the ASK (incorrectly I see).

I fixed the link to the data to be the TTL (the JSON-LD uses some nested contexts driven by schema references - so wasnt complete - sorry about that)

I reverted to select and seems to behave as expected now - I'm not sure but I must have has some other error.