shexjs / shex.js

shex.js javascript package
MIT License

Nested EachOf results have extra expressions #70

Open joeltg opened 4 years ago

joeltg commented 4 years ago

Suppose I have a schema made of two EachOf expressions:

PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

start={
  rdf:type [ schema:Person ],
  (
    schema:givenName xsd:string ,
    schema:familyName xsd:string
  ) *
}

... this parses into

{
  "type": "Schema",
  "start": {
    "type": "Shape",
    "expression": {
      "type": "EachOf",
      "expressions": [
        {
          "type": "TripleConstraint",
          "predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
          ...
        },
        {
          "type": "EachOf",
          "expressions": [
            {
              "type": "TripleConstraint",
              "predicate": "http://schema.org/givenName",
              ...
            },
            {
              "type": "TripleConstraint",
              "predicate": "http://schema.org/familyName",
              ...
            }
          ],
          "min": 0,
          "max": -1
        }
      ]
    }
  }
}

But if I validate against

<http://example.com/john> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
<http://example.com/john> <http://schema.org/givenName> "John" .
<http://example.com/john> <http://schema.org/familyName> "Doe" .

shex.js produces

{
  "type": "ShapeTest",
  "node": "http://example.com/john",
  "shape": {
    "term": "START"
  },
  "solution": {
    "type": "EachOfSolutions",
    "solutions": [
      {
        "type": "EachOfSolution",
        "expressions": [
          {
            "type": "TripleConstraintSolutions",
            "predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
            "solutions": [ { ... } ],
            "valueExpr": { ... }
          },
          {
            "type": "EachOfSolutions",
            "solutions": [
              {
                "type": "EachOfSolution",
                "expressions": [
                  {
                    "type": "TripleConstraintSolutions",
                    "predicate": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
                    "solutions": [ { ... } ],
                    "valueExpr": { ... }
                  },
                  {
                    "type": "TripleConstraintSolutions",
                    "predicate": "http://schema.org/givenName",
                    "solutions": [ { ... } ],
                    "valueExpr": { ... }
                  },
                  {
                    "type": "TripleConstraintSolutions",
                    "predicate": "http://schema.org/familyName",
                    "solutions": [ { ... } ],
                    "valueExpr": { ... }
                  }
                ]
              }
            ],
            "min": 0,
            "max": -1
          }
        ]
      }
    ]
  }
}

The second EachOfSolution (the nested one) has three expressions instead of the expected two. The extra one is a TripleConstraintSolutions with predicate http://www.w3.org/1999/02/22-rdf-syntax-ns#type, which is duplicated from the first EachOf expression (the outer, non-nested one).

In fact the referenced objects are the same: result.solution.solutions[0].expressions[0] === result.solution.solutions[0].expressions[1].solutions[0].expressions[0].
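A minimal sketch of how such shared references could be located automatically. `findSharedReferences` is a hypothetical helper, not part of shex.js, and the `result` object is a hand-built stand-in that only mimics the shape of the output above:

```javascript
// Hypothetical helper (not part of shex.js): walk a validation result and
// report every object that is reachable through more than one path.
function findSharedReferences(root) {
  const firstPath = new Map(); // object -> first path at which it was seen
  const shared = [];

  function visit(value, path) {
    if (value === null || typeof value !== "object") return;
    if (firstPath.has(value)) {
      shared.push({ first: firstPath.get(value), again: path });
      return; // do not descend into the same object twice
    }
    firstPath.set(value, path);
    for (const key of Object.keys(value)) {
      const childPath = Array.isArray(value) ? `${path}[${key}]` : `${path}.${key}`;
      visit(value[key], childPath);
    }
  }

  visit(root, "result");
  return shared;
}

// Hand-built stand-in for the result above: the rdf:type entry is one object
// referenced from both the outer and the nested EachOfSolution.
const typeSolutions = {
  type: "TripleConstraintSolutions",
  predicate: "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
};
const result = {
  solution: {
    type: "EachOfSolutions",
    solutions: [{
      type: "EachOfSolution",
      expressions: [
        typeSolutions,
        {
          type: "EachOfSolutions",
          solutions: [{
            type: "EachOfSolution",
            expressions: [
              typeSolutions,
              { type: "TripleConstraintSolutions", predicate: "http://schema.org/givenName" },
              { type: "TripleConstraintSolutions", predicate: "http://schema.org/familyName" },
            ],
          }],
          min: 0,
          max: -1,
        },
      ],
    }],
  },
};

const shared = findSharedReferences(result);
console.log(shared);
```

For this stand-in, the helper reports one shared object, first seen at `result.solution.solutions[0].expressions[0]` and seen again at `result.solution.solutions[0].expressions[1].solutions[0].expressions[0]`.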

If another solution to the nested EachOf is added, for example by validating against

<http://example.com/john> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
<http://example.com/john> <http://schema.org/givenName> "John" .
<http://example.com/john> <http://schema.org/familyName> "Doe" .
<http://example.com/john> <http://schema.org/givenName> "JOHN" .
<http://example.com/john> <http://schema.org/familyName> "DOE" .

then only the first nested EachOfSolution has three expressions, and the second nested EachOfSolution has just the two expected expressions.

Is this correct? I would have expected the expression tree in results to match the expression tree of the schema.
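To make that expectation concrete, here is a rough sketch of a check that walks the schema and the result side by side. It assumes that an EachOf in the schema corresponds to an EachOfSolutions in the result and that a TripleConstraint corresponds to a TripleConstraintSolutions with the same predicate. `findExtraExpressions` and `matches` are hypothetical names, not shex.js functions, and the sample objects are trimmed-down copies of the JSON above:

```javascript
// Hypothetical checker (not part of shex.js): report result expressions that
// have no counterpart in the schema expression they were produced from.
function matches(schemaChild, resultChild) {
  if (schemaChild.type === "TripleConstraint") {
    return resultChild.type === "TripleConstraintSolutions" &&
      resultChild.predicate === schemaChild.predicate;
  }
  return schemaChild.type === "EachOf" && resultChild.type === "EachOfSolutions";
}

function findExtraExpressions(schemaExpr, resultExpr, path = "solution") {
  const problems = [];
  if (schemaExpr.type !== "EachOf" || resultExpr.type !== "EachOfSolutions") {
    return problems; // only EachOf groups are compared in this sketch
  }
  resultExpr.solutions.forEach((solution, i) => {
    const here = `${path}.solutions[${i}]`;
    const unmatched = [...schemaExpr.expressions]; // each schema child may match once
    solution.expressions.forEach((child, j) => {
      const k = unmatched.findIndex((s) => matches(s, child));
      if (k === -1) {
        problems.push(`${here}.expressions[${j}] (${child.predicate || child.type}) has no counterpart in the schema`);
        return;
      }
      const [schemaChild] = unmatched.splice(k, 1);
      problems.push(...findExtraExpressions(schemaChild, child, `${here}.expressions[${j}]`));
    });
  });
  return problems;
}

const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const GIVEN = "http://schema.org/givenName";
const FAMILY = "http://schema.org/familyName";

// Trimmed-down schema expression from the parsed schema above.
const schemaExpression = {
  type: "EachOf",
  expressions: [
    { type: "TripleConstraint", predicate: RDF_TYPE },
    {
      type: "EachOf",
      expressions: [
        { type: "TripleConstraint", predicate: GIVEN },
        { type: "TripleConstraint", predicate: FAMILY },
      ],
      min: 0,
      max: -1,
    },
  ],
};

// Trimmed-down result with the extra rdf:type entry in the nested solution.
const resultSolution = {
  type: "EachOfSolutions",
  solutions: [{
    type: "EachOfSolution",
    expressions: [
      { type: "TripleConstraintSolutions", predicate: RDF_TYPE },
      {
        type: "EachOfSolutions",
        solutions: [{
          type: "EachOfSolution",
          expressions: [
            { type: "TripleConstraintSolutions", predicate: RDF_TYPE },
            { type: "TripleConstraintSolutions", predicate: GIVEN },
            { type: "TripleConstraintSolutions", predicate: FAMILY },
          ],
        }],
        min: 0,
        max: -1,
      },
    ],
  }],
};

const problems = findExtraExpressions(schemaExpression, resultSolution);
console.log(problems);
```

On this sample the only entry flagged is the nested rdf:type one, at `solution.solutions[0].expressions[1].solutions[0].expressions[0]`. The sketch ignores other expression types (OneOf, shape references) and does not check cardinalities.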

Complete demo code is here. Thanks again for all the work on ShEx! This isn't a critical bug for us or anything, just something I thought I'd point out.

ericprud commented 4 years ago

Sounds to me like I've failed to initialize something and it's absorbing some earlier state (permalink).

Many thanks for the bug report and especially the thorough analysis.