zazuko / rdf-validate-shacl

Validate RDF data purely in JavaScript. An implementation of the W3C SHACL specification on top of the RDFJS stack.
MIT License

Validation error when using correct lexical value for xsd:gYear #100

Closed mightymax closed 1 year ago

mightymax commented 1 year ago

When instance data contains an xsd:gYear literal whose year value has more than 4 digits, and a SHACL PropertyShape uses a DatatypeConstraintComponent, the SHACL validation report incorrectly marks the value as an error, with this message:

Value does not have datatype <http://www.w3.org/2001/XMLSchema#gYear>.

The W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes defines a gYear with this lexical space, so literals like '10000' are valid gYear literal values:

yearFrag ::= '-'? (([1-9] digit digit digit+)) | ('0' digit digit digit))
gYearLexicalRep ::= yearFrag timezoneFrag?

I assume other XSD date-like datatypes have similar issues; gYearMonth certainly does (see the example code).

The issue is caused by a RegExp in the 'rdf-validate-datatype' module that allows only 4 digits for a year, see here. I've created a pull request to implement the correct RegExp.
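To illustrate the difference, here is a minimal sketch of patterns that follow the yearFrag and timezoneFrag productions from XSD 1.1. It is not the code from the module or from the pull request; the names yearFrag, timezoneFrag, isGYear and isGYearMonth are hypothetical and chosen only for this example.

```javascript
// Hypothetical patterns for illustration; NOT taken from rdf-validate-datatype.
// yearFrag: optional minus sign, then either a non-zero digit followed by
// three or more digits, or a leading zero followed by exactly three digits.
const yearFrag = '-?(?:[1-9][0-9]{3,}|0[0-9]{3})'
// timezoneFrag: 'Z', or a signed offset from 00:00 up to 14:00.
const timezoneFrag = '(?:Z|[+-](?:(?:0[0-9]|1[0-3]):[0-5][0-9]|14:00))'

const gYearPattern = new RegExp(`^${yearFrag}${timezoneFrag}?$`)
const gYearMonthPattern = new RegExp(`^${yearFrag}-(?:0[1-9]|1[0-2])${timezoneFrag}?$`)

// Return true when the lexical form matches the gYear grammar.
function isGYear (lexical) {
  return gYearPattern.test(lexical)
}

// Return true when the lexical form matches the gYearMonth grammar.
function isGYearMonth (lexical) {
  return gYearMonthPattern.test(lexical)
}
```

With these patterns, '10000' and '10000-10' are accepted, while '999' and '02024' are rejected because the grammar requires at least four digits and forbids a leading zero on years longer than four digits.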

How to reproduce

import fs from 'fs'
import factory from 'rdf-ext'
import ParserN3 from '@rdfjs/parser-n3'
import SHACLValidator from 'rdf-validate-shacl'
import assert from 'assert'

async function loadDataset (filePath) {
  const stream = fs.createReadStream(filePath)
  const parser = new ParserN3({ factory })
  return factory.dataset().import(parser.import(stream))
}

const shapes = await loadDataset('shapes.ttl')
const data = await loadDataset('data.ttl')
const validator = new SHACLValidator(shapes, { factory })
const report = await validator.validate(data)
if (report.conforms === false) {
  console.error('Expected report to validate, it did not:')
  for (const result of report.results) {
    console.error(`${result.message} on path ${result.path}`)
  }
}

Content of file data.ttl

prefix ex: <http://ex.com/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 
ex:Thing a ex:Thing; 
    ex:year "10000"^^xsd:gYear ;
    ex:yearMonth "10000-10"^^xsd:gYearMonth .

Content of file shapes.ttl

prefix sh: <http://www.w3.org/ns/shacl#>
prefix ex: <http://ex.com/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#> 

ex:Shape a sh:NodeShape ;
    sh:targetClass ex:Thing ;
    sh:property ex:gYearProperty, ex:gYearMonthProperty .

ex:gYearProperty a sh:PropertyShape ;
    sh:path ex:year ;
    sh:datatype xsd:gYear .

ex:gYearMonthProperty a sh:PropertyShape ;
    sh:path ex:yearMonth ;
    sh:datatype xsd:gYearMonth .

tpluscode commented 1 year ago

Published with v0.1.5 of rdf-validate-datatype

mightymax commented 1 year ago

Thanks for fixing this on such short notice! Any insight on when a new package will be published?

tpluscode commented 1 year ago

Of course. 0.4.5, out now