weso / rdfshape-api

API for validating and transforming RDF, ShEx, SHACL and more.
https://www.weso.es/rdfshape-api/
MIT License

Negated path not leading to failure #14

Closed AlasdairGray closed 2 years ago

AlasdairGray commented 10 years ago

Simple shape with one required type and one type that is negated.

prefix dctypes: <http://purl.org/dc/dcmitype/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix void: <http://rdfs.org/ns/void#>

<SummaryShape> {
    rdf:type (dctypes:Dataset),
    !rdf:type (void:Dataset)
}

When tested with a document containing a dctypes:Dataset and void:Dataset, the result is still matching a summary shape.

prefix dctypes: <http://purl.org/dc/dcmitype/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix : <http://example.org/>
prefix void: <http://rdfs.org/ns/void#>

:dataset a dctypes:Dataset, void:Dataset .

I believe that it should not match the summary shape. You can try it out at http://goo.gl/jnHafO

labra commented 10 years ago

The problem is the interaction between negation and open shapes. If you use a closed shape it fails as expected (http://goo.gl/buwmvb).

However, when using open shapes, the current implementation accepts "any" remaining triple. It validates SummaryShape because it considers that :dataset has type dctypes:Dataset and treats :dataset a void:Dataset as a remaining triple. That is, the current implementation of open shapes is:

  <shape> {            <shape> [
      expr        =     expr ,
  }                     . . *
                       ] 

so your definition was equivalent to:

 <SummaryShape> = [ 
     rdf:type (dctypes:Dataset),
     !rdf:type (void:Dataset), 
     . . *
    ]
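The equivalence above can be illustrated with a toy model in Python. This is only a sketch and not the validator's actual code: it assumes that matching means handing every triple to exactly one component of the shape, that a negated arc consumes no triples, and that `. . *` consumes anything. The names `required_dctypes`, `negated_void`, `wildcard` and `matches` are made up for the illustration.

```python
from itertools import product

RDF_TYPE = "rdf:type"

def required_dctypes(bucket):
    # rdf:type (dctypes:Dataset): consumes exactly one matching triple
    return len(bucket) == 1 and bucket[0][1:] == (RDF_TYPE, "dctypes:Dataset")

def negated_void(bucket):
    # !rdf:type (void:Dataset): modelled as an arc that consumes nothing,
    # so it only fails if a triple has nowhere else to go
    return len(bucket) == 0

def wildcard(bucket):
    # . . *: accepts any number of triples with any predicate and value
    return True

def matches(triples, components):
    # Try every way of handing each triple to exactly one component.
    for choice in product(range(len(components)), repeat=len(triples)):
        buckets = [[] for _ in components]
        for triple, index in zip(triples, choice):
            buckets[index].append(triple)
        if all(check(b) for check, b in zip(components, buckets)):
            return True
    return False

graph = [
    (":dataset", RDF_TYPE, "dctypes:Dataset"),
    (":dataset", RDF_TYPE, "void:Dataset"),
]

closed_shape = [required_dctypes, negated_void]
open_shape = closed_shape + [wildcard]

print(matches(graph, closed_shape))  # the void:Dataset triple has nowhere to go
print(matches(graph, open_shape))    # the wildcard absorbs the void:Dataset triple
```

Under these assumptions the closed shape rejects the graph, while the open one accepts it because the wildcard takes the triple the negation was meant to forbid.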

A possible solution to have "almost" open shapes in your case would be:

 <SummaryShape> [
     a (dctypes:Dataset),
     a (- void:Dataset)*,
     - rdf:type . *
 ]

You can see it here: http://goo.gl/w3jiL7
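As a rough Python sketch of what this "almost open" shape checks (an illustration only, with the hypothetical function name `matches_almost_open`, and assuming the required arc needs at least one dctypes:Dataset type): rdf:type triples are handled by the first two arcs, and only triples with other predicates can be absorbed as extras.

```python
RDF_TYPE = "rdf:type"

def matches_almost_open(triples):
    type_values = [o for (_, p, o) in triples if p == RDF_TYPE]

    # a (dctypes:Dataset): the required type must be present
    if "dctypes:Dataset" not in type_values:
        return False

    # a (- void:Dataset)*: any other rdf:type value is fine, except void:Dataset
    if "void:Dataset" in type_values:
        return False

    # - rdf:type . *: every triple whose predicate is not rdf:type is an
    # allowed extra, so nothing further needs to be checked
    return True

both_types = [
    (":dataset", RDF_TYPE, "dctypes:Dataset"),
    (":dataset", RDF_TYPE, "void:Dataset"),
]
with_extras = [
    (":dataset", RDF_TYPE, "dctypes:Dataset"),
    (":dataset", RDF_TYPE, "foaf:Document"),
    (":dataset", "dct:title", "Example"),
]

print(matches_almost_open(both_types))   # False
print(matches_almost_open(with_extras))  # True
```

The point of the design is that the wildcard never sees an rdf:type triple, so it cannot hide the one that the shape is supposed to exclude.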

Another solution is to change the implementation of open shapes to something like:

 <shape> {     <shape> [
    expr    =   expr
  }             (!(expr) . . *)
               ]

but I am not sure how to implement the negation of expressions right now...

AlasdairGray commented 10 years ago

I will try with closed shapes, but I'm not entirely sure I understand their semantics.

I see from your example that the graphs they match can have additional properties that are not in the ShEx. I find this surprising as I'd have expected in closed world that you needed to match exactly.

For my use case I do need to allow graphs with additional properties; the full set of additional properties a user may supply is unknown to me. So the implementation of closed shapes seems to match my requirements.

labra commented 10 years ago

A closed shape matches exactly the triples defined by the shape, without allowing extra remaining triples. For example, if you have a shape like:

 <person> [ foaf:name xsd:string ]

and some values:

 :john foaf:name "John" .

 :anna foaf:name "Anna"; 
           foaf:mbox <mailto:anna@example.com> .

it will only match :john and not :anna, because she has a remaining triple which does not match the shape.

It is simple to convert a closed shape into an open shape just by saying:

 <person> [
    foaf:name xsd:string
  , . . *    # this means that it allows any extra remaining triple with any value
  ]

which would mean that the shape person has property foaf:name with a value in xsd:string and any extra remaining triple. That's how it is implemented right now. The system converts every open shape to a closed shape extended with ". . *".
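The :john and :anna example can be mimicked with a small Python sketch. It is a simplification under stated assumptions, not the real implementation: literals are plain Python strings, IRIs are `("iri", ...)` tuples, and `matches_person` is a hypothetical helper whose `closed` flag switches the ". . *" extension off or on.

```python
def matches_person(triples, closed):
    # foaf:name xsd:string: exactly one foaf:name triple with a string value
    names = [t for t in triples if t[1] == "foaf:name" and isinstance(t[2], str)]
    if len(names) != 1:
        return False

    # Whatever was not consumed by the foaf:name arc is a remaining triple.
    remaining = [t for t in triples if t not in names]

    if closed:
        # Closed shape: no remaining triples are allowed.
        return len(remaining) == 0
    # Open shape: behaves like the closed shape extended with ". . *".
    return True

john = [(":john", "foaf:name", "John")]
anna = [
    (":anna", "foaf:name", "Anna"),
    (":anna", "foaf:mbox", ("iri", "mailto:anna@example.com")),
]

print(matches_person(john, closed=True))   # True
print(matches_person(anna, closed=True))   # False
print(matches_person(anna, closed=False))  # True
```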

Extending closed shapes with ". . *" in this way does not work when you combine it with negations, because the negated triples are accepted as remaining triples.

That's why in your example, I extended the shape not with any remaining triples, but only with any remaining triple that didn't have the property rdf:type.

<SummaryShape> [
      a (dctypes:Dataset),
      a (- void:Dataset)*,
      - rdf:type . *     # this extends the shape to any triple that does not have the property rdf:type
  ]

However, these days, I have been thinking that it may be better to modify the way that open shapes are implemented, and validate only when the remaining triples are ones that have not been consumed by the validator. I will try to modify the implementation and then your example will also work as you expected... so for now, I will keep your issue open :)
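One possible reading of that idea, sketched in Python, is that negated arcs are checked against all the triples of the node, so a triple they forbid can never be passed off as a harmless leftover. This is an assumption about the intended behaviour, not the planned implementation, and `validate_open`, `required_classes` and `forbidden_classes` are hypothetical names.

```python
RDF_TYPE = "rdf:type"

def has_type(triple, cls):
    _, predicate, obj = triple
    return predicate == RDF_TYPE and obj == cls

def validate_open(triples, required_classes, forbidden_classes):
    remaining = list(triples)

    # Required arcs each consume one matching triple.
    for cls in required_classes:
        hit = next((t for t in remaining if has_type(t, cls)), None)
        if hit is None:
            return False
        remaining.remove(hit)

    # Negated arcs look at every triple of the node, not just at what they
    # were handed, so forbidden triples cannot end up among the leftovers.
    for cls in forbidden_classes:
        if any(has_type(t, cls) for t in triples):
            return False

    # Everything still in `remaining` is an unconstrained extra triple.
    return True

graph = [
    (":dataset", RDF_TYPE, "dctypes:Dataset"),
    (":dataset", RDF_TYPE, "void:Dataset"),
]
print(validate_open(graph, ["dctypes:Dataset"], ["void:Dataset"]))  # False
```

With this reading, the graph from the original report no longer matches SummaryShape, while a node with extra, unrelated properties still does.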

AlasdairGray commented 10 years ago

I think the behaviour of matching any triples, even those that have already been checked by the negated paths, is an error.

I look forward to the updated version.

(Sent by email in reply to the comment above from Jose Emilio Labra Gayo, 29 Aug 2014, 06:56.)