Java integration test

berezovskyi commented 7 years ago

After #16, I started thinking about how I could make a PR with a test derived from https://gitlab.com/assume/shacl-try-standalone/blob/master/src/test/java/se/kth/md/aide/ShaclJavaValidatorTest.java that would just check that shaclex works well with Java (at least in basic terms).

Do you have any idea where I can start?

labra commented 7 years ago

Thanks, it would be great to include that.

In the case of issue #16, the problem was not Java-Scala interoperability; it was more a matter of dependency management and code duplication.

Anyway, I will keep #17 open for a while until I find a more idiomatic Java interface that is easier for Java programmers to use.

A good first step is to define some kind of Java interface similar to what you wrote and then add the Java code to <module>/src/main/java/... and <module>/src/test/java/... where <module> should be the module whose interface/tests we are adding.

If the interface is general enough, it could go directly under src/main/java or src/test/java.
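As a rough illustration of that first step, a Java-facing interface could look something like the sketch below. Every name here (`SchemaValidator`, `Outcome`, `StubValidator`) is hypothetical and not part of shaclex; the stub only checks that both inputs are non-empty so that the example runs on its own, whereas a real implementation would delegate to the Scala validator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these names come from shaclex.
final class JavaFacadeSketch {

    /** The interface a Java caller would program against (would live under src/main/java). */
    interface SchemaValidator {
        Outcome validate(String rdfData, String schema);
    }

    /** Immutable result: a validity flag plus human-readable messages. */
    static final class Outcome {
        private final boolean valid;
        private final List<String> messages;

        Outcome(boolean valid, List<String> messages) {
            this.valid = valid;
            this.messages = Collections.unmodifiableList(new ArrayList<>(messages));
        }

        boolean isValid() { return valid; }
        List<String> getMessages() { return messages; }
    }

    /** Stand-in implementation: it does NOT validate RDF, it only rejects empty inputs. */
    static final class StubValidator implements SchemaValidator {
        @Override
        public Outcome validate(String rdfData, String schema) {
            List<String> messages = new ArrayList<>();
            if (rdfData == null || rdfData.trim().isEmpty()) {
                messages.add("no RDF data supplied");
            }
            if (schema == null || schema.trim().isEmpty()) {
                messages.add("no schema supplied");
            }
            return new Outcome(messages.isEmpty(), messages);
        }
    }

    // The kind of check a test under src/test/java could make.
    public static void main(String[] args) {
        SchemaValidator validator = new StubValidator();
        Outcome outcome = validator.validate("<a> <b> <c> .", "");
        System.out.println(outcome.isValid() + " " + outcome.getMessages());
    }
}
```

Keeping the interface in terms of plain Java types means a Java test never has to touch Scala collections or implicits directly.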

I was also thinking about Java interoperability and @jadelkhoury's suggestion to use Jena Models and I will probably add an interface to support that. To that end, I need to define an RDF model for validation results and translate the result to that RDF model, which could later be returned as a Jena RDF Model.

In the next few days, I will try to define a first version of that model and generate it, so that Java programmers can supply Jena Models and get Jena Models back.

jadelkhoury commented 7 years ago

"To that end, I need to define an RDF model for validation results" I guess you mean the equivalent to a validation report for SHACL (https://www.w3.org/TR/shacl/#validation-report) but something that applies to ShEx?

labra commented 7 years ago

Yes, that's the idea. I am planning to define a more generic model that wraps the SHACL validation report and also serves for ShEx.

There are some differences between ShEx and SHACL on how to treat results:

SHACL is more focused on errors: when the RDF is valid, it is almost silent. That can be a problem when some nodes were not validated for some reason, because the user will not know whether they were valid or just ignored. Also, the SHACL spec said that the report should contain all errors, which can be difficult to obtain for a very large dataset. My implementation of SHACL was defined so that it stopped at the first error, so it didn't generate the whole report. It also generates more information about which nodes were validated with which shapes.
ShEx is more focused on validation, so it generates more information about valid nodes (especially the shapes of the nodes that have been validated), and it doesn't specify how errors should be returned.
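The difference between stopping at the first error and producing the whole report can be sketched in a few lines of Java. This is only an illustration, not the code of shaclex or of any SHACL processor: the `check` method, the node names, and the "starts with a colon" constraint are all made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of two reporting modes; not taken from shaclex.
final class ReportingModesSketch {

    /** Checks each node against one constraint; stops at the first failure when failFast is true. */
    static List<String> check(List<String> nodes, Predicate<String> constraint, boolean failFast) {
        List<String> errors = new ArrayList<>();
        for (String node : nodes) {
            if (!constraint.test(node)) {
                errors.add(node + " does not conform");
                if (failFast) {
                    break; // the remaining nodes are never looked at
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList(":alice", "bob", ":carol", "dave");
        Predicate<String> startsWithColon = n -> n.startsWith(":");

        System.out.println(check(nodes, startsWithColon, true));  // [bob does not conform]
        System.out.println(check(nodes, startsWithColon, false)); // [bob does not conform, dave does not conform]
    }
}
```

In the fail-fast run, nothing is reported about `:carol` or `dave`, so a caller cannot tell whether they were valid or simply never visited, which is the ambiguity described above.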

So what I am planning to do is to see if I can define an RDF model for validation that includes information about the typing returned after validation (which nodes have which shapes and why) and also which errors have been obtained (which nodes don't have which shapes and why).
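To make the shape of such a result more concrete, here is a minimal in-memory sketch in plain Java, assuming a result is just a list of (node, shape, conforms, reason) entries. The class and method names are hypothetical, the node and shape names are invented, and the eventual RDF vocabulary is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a combined result model; not the model used by shaclex.
final class ValidationResultSketch {

    /** One statement: a node has (or does not have) a shape, and why. */
    static final class Entry {
        final String node;
        final String shape;
        final boolean conforms;
        final String reason;

        Entry(String node, String shape, boolean conforms, String reason) {
            this.node = node;
            this.shape = shape;
            this.conforms = conforms;
            this.reason = reason;
        }
    }

    /** Collects both the positive typing and the errors in a single result. */
    static final class Result {
        private final List<Entry> entries = new ArrayList<>();

        Result conforms(String node, String shape, String reason) {
            entries.add(new Entry(node, shape, true, reason));
            return this;
        }

        Result fails(String node, String shape, String reason) {
            entries.add(new Entry(node, shape, false, reason));
            return this;
        }

        /** Which nodes have which shapes. */
        List<Entry> typing() { return select(true); }

        /** Which nodes do not have which shapes. */
        List<Entry> errors() { return select(false); }

        boolean isValid() { return errors().isEmpty(); }

        private List<Entry> select(boolean conforms) {
            List<Entry> selected = new ArrayList<>();
            for (Entry e : entries) {
                if (e.conforms == conforms) {
                    selected.add(e);
                }
            }
            return selected;
        }
    }

    public static void main(String[] args) {
        Result result = new Result()
            .conforms(":alice", ":Person", "has exactly one name")
            .fails(":bob", ":Person", "has no name");
        System.out.println(result.typing().size() + " conforming, " + result.errors().size() + " failing");
    }
}
```

A node that appears in neither `typing()` nor `errors()` was simply not validated, so this kind of model lets a caller distinguish "valid" from "ignored".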

In ShEx, we thought about defining that in a standard way, but we decided it was better to keep it implementation-dependent. Anyway, I will raise an issue about that to record that decision.

weso / shaclex

Java integration test #31