TopQuadrant / shacl

SHACL API in Java based on Apache Jena
Apache License 2.0

tosh, dash etc. validation results without explicit shapes #135

Closed rmfranken closed 2 years ago

rmfranken commented 2 years ago

Hi there,

I noticed that I'm getting validation results from shapes that I have not defined in my shapes or data graph. The prefixes dash, tosh, swa and graphql show up, even though these are not defined in my shapes or data graph. I'm guessing they are somehow imported, but since my data and shapes graph only import skos and sh:, I don't see how. Just to double check, I also removed sh: from my imports in the shapes and data, but this did not make a difference. Plus, the SHACL validation shouldn't run inference over the imports before doing the validation, so it shouldn't matter anyway.

Is this a bug or a feature? If it's a feature, is there a way to turn it off - to prevent getting results in my validation report from shapes I have not explicitly defined in my shapes or data graph myself? I'm running the SHACL CLI 1.4.1 (somehow, I did not get these results in version 1.3.2. So I'm not sure if this is new, as I also updated my whole Java environment at the same time... Lesson learned: Never update 2 things at once...)

Cheers, Robin

afs commented 2 years ago

Commit 25a2fb9 - dash.ttl and tosh.ttl have changed a lot.

It happens when there are shapes in the data being validated, which is the case when using -datafile without -shapesfile (which is the same as passing the same file for both -datafile and -shapesfile).

@rmfranken Is that how you are using the CLI tool?

A consequence is that the WG tests don't produce the expected validation report when invoked from the command line.

HolgerKnublauch commented 2 years ago

Yeah I'll look into this on my Monday, but I believe there is a setting to filter out the system shapes in the validation engine. The tosh namespace defines many such shapes that we find useful in the tool but which may cause unexpected violations. I may need to switch them off by default, as other people have raised the same issue.
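Independently of any engine setting, unwanted results could also be dropped after validation by checking the namespace of each result's sh:sourceShape. The following is only a minimal sketch in plain Java, assuming the source-shape IRIs have already been extracted from the report as strings. The class and method names are hypothetical and not part of the TopBraid API, only the dash and tosh namespaces are listed, and results whose source shape is a blank node would not be caught by this check.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the TopBraid SHACL API.
class SystemShapeFilter {

    // Namespaces of the dash and tosh vocabularies; extend the list as needed.
    static final List<String> SYSTEM_NAMESPACES = List.of(
            "http://datashapes.org/dash#",
            "http://topbraid.org/tosh#");

    // True if the shape IRI starts with one of the system namespaces.
    public static boolean isSystemShape(String shapeIri) {
        return SYSTEM_NAMESPACES.stream().anyMatch(shapeIri::startsWith);
    }

    // Keeps only the IRIs that do not belong to a system namespace.
    public static List<String> keepUserShapes(List<String> sourceShapeIris) {
        return sourceShapeIris.stream()
                .filter(iri -> !isSystemShape(iri))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Both IRIs are made up for illustration.
        List<String> sources = List.of(
                "http://example.org/shapes#PersonShape",
                "http://topbraid.org/tosh#SomeShape");
        System.out.println(keepUserShapes(sources));
    }
}
```

A post-filter like this leaves the validation itself untouched, so it only hides the extra results rather than preventing the system shapes from being evaluated.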

HolgerKnublauch commented 2 years ago

I have just added a flag for the command line tool that should help. Until now, validating shapes was always true, but it is now false by default. Those who still want to see those results can invoke the cmd line with -validateShapes as argument.

@rmfranken: Could you confirm that this resolves your issue without any downsides? The tosh shapes should now be excluded. If you're not using the command line tools, see the commit above for the 3rd argument of ValidationUtil.validateModel on how to control this programmatically.

Once this is confirmed or looks right I plan to do another official release, also to address the log4j issue in case that plagues anyone.

rmfranken commented 2 years ago

@afs

Thanks for your response, that is indeed sort of how I'm using the CLI. Although my datafile and shapes file are not exactly the same, my data file does also include the same triples as are in the shapesfile, in addition to the instance data I'm trying to validate.

@HolgerKnublauch

Thank you for the quick response! I'm going to try to get it to work and let you know ASAP.

rmfranken commented 2 years ago

So, I'm a little lost. I followed the README to the best of my abilities and downloaded the latest release ("The binary distribution is: https://repo1.maven.org/maven2/org/topbraid/shacl/*VER*/shacl-*VER*-bin.zip").

But this binary is not the same as what is maintained in this GitHub repository, I guess. When I download the repository (https://github.com/TopQuadrant/shacl/archive/refs/heads/master.zip), there is a src folder in there in which, looking at your commit, most of the changes were made, but I don't understand how this src folder is supposed to interact with the binary I downloaded. Or, put differently, how does the commit above get into my current environment? I only see a lib and a bin folder in my shacl-1.4.1-bin, no src. Conversely, the .zip of this repository doesn't have a lib or bin folder. How are these two supposed to integrate?

Feeling like a bit of an idiot here, but I don't understand how to apply your changes, which is probably no fault of yours! I tried adding the folders and files from the repository to the binary folder, but that's probably (definitely, but I got desperate) not the right way to go. I've included my folder as a zip here. It also includes the test data, shapes graph, results, and a batch script that runs the validator, which is currently still generating 2 errors stemming from tosh: shapes.

shacl-1.4.1-bin_rf.zip

Kind regards, Robin

HolgerKnublauch commented 2 years ago

No problem. It requires running the Maven build etc., so it needs some more infrastructure. Would you be able to send me your example data with instructions on what violations you would expect? I could run them here on my machine and send you the output/validation report so you can confirm that there are no undesired violations. holger@topquadrant.com

HolgerKnublauch commented 2 years ago

I have meanwhile received and reviewed the files from Robin, and they seem to work fine now. With the release of 1.4.2 I think this can now be closed. Please reopen if you disagree.