eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License

Adding a Model to a repository doesn't transfer its namespaces #2749

Closed Dzeri96 closed 1 year ago

Dzeri96 commented 3 years ago

When adding a Model to a Repository with conn.add(model), its namespaces get lost. Writing the repo to a file like

RDFWriter writer = Rio.createWriter(RDFFormat.TURTLE, outputStream);
conn.export(writer);

produces a very unreadable Turtle file with no prefixes.

Only after manually re-adding them like this

for (Namespace ns : ontologyNamespaces) {
    conn.setNamespace(ns.getPrefix(), ns.getName());
}

does the output actually contain prefixes. It is definitely the act of adding the model that loses them, not the RDFWriter's fault.

If this is a bug, I'd be happy to contribute. If this is expected behavior (which is counter-intuitive in my opinion), is there a nicer way to include the namespaces when importing?

VERSION: 3.4.4

barthanssens commented 3 years ago

It's not really a bug IMHO, since the RepositoryConnection will use the add(Iterable<Statement>) method, and Statements themselves don't know about namespaces ... IIRC it's the same behavior when adding a Model to another Model. (Note that prefixes in Model/Repository A may clash with those in Model/Repository B)

Maybe @jeenbroekstra can tell if this was a deliberate design choice, or just something that stuck through the years.

abrokenjester commented 3 years ago

I suggest we call this a bug: it's not necessarily a bug in the sense that we never defined the behavior, but I do agree that it is counter-intuitive (and inconsistent when compared to adding data directly from a File or an InputStream), and I see no harm in fixing this as part of a patch release.

A simple fix would be to check (in the add method) if the Iterable also implements NamespaceAware, and if so, copy over namespace definitions.

To resolve potential clashes, for now we can just stick with the same behavior as when directly adding a file to a Repository (e.g. by supplying an InputStream): check if the repository already has a namespace mapping with the same prefix, and only if it does not, add the namespace from the model.
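
A minimal sketch of the clash rule described above, assuming simplified stand-in types: the PrefixSource interface, the NamespaceMerge class, and the plain Map used in place of the repository's namespace registry are all hypothetical and are not RDF4J's actual API. It only illustrates the "copy a namespace unless the prefix is already taken" logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a namespace-aware collection of statements.
// This is NOT RDF4J's interface; it only exposes prefix -> namespace IRI pairs.
interface PrefixSource {
    Map<String, String> prefixes();
}

final class NamespaceMerge {
    private NamespaceMerge() {
    }

    // Copies prefix mappings from 'added' into 'target' when 'added' carries any.
    // A prefix that 'target' already maps is left untouched, so existing
    // mappings win over the incoming ones.
    // Returns the number of mappings actually added.
    static int copyMissing(Object added, Map<String, String> target) {
        if (!(added instanceof PrefixSource)) {
            return 0; // plain iterable of statements: nothing to copy
        }
        int count = 0;
        for (Map.Entry<String, String> e : ((PrefixSource) added).prefixes().entrySet()) {
            if (target.putIfAbsent(e.getKey(), e.getValue()) == null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed example data: the target already defines "ex".
        Map<String, String> repositoryNamespaces = new LinkedHashMap<>();
        repositoryNamespaces.put("ex", "http://example.org/existing#");

        Map<String, String> modelNamespaces = new LinkedHashMap<>();
        modelNamespaces.put("ex", "http://example.com/ns#");
        modelNamespaces.put("foaf", "http://xmlns.com/foaf/0.1/");

        PrefixSource model = () -> modelNamespaces;
        int added = copyMissing(model, repositoryNamespaces);
        System.out.println(added + " added: " + repositoryNamespaces);
    }
}
```

In this example the clashing ex prefix keeps its existing mapping and only foaf is copied over. In the real code the same check would presumably sit in the add method and go through the connection's own namespace calls rather than a Map.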

@Dzeri96 if you're interested in providing a fix for this, that'd be great! Before you dive in, have a look at our contributor guidelines, in particular the points on how to sign the Eclipse ECA. Let me know if you need any help with any of it.

Dzeri96 commented 3 years ago

If others agree, I'd like to work on this when I have some free time. ETA for a PR is mid-February if the issue is not too complicated.

barthanssens commented 3 years ago

Sure, would be great if you could work on this, thanks.

abrokenjester commented 3 years ago

@Dzeri96 just checking in to see: are you still planning to work on this issue? We have a patch release coming up soon, would be good to have this included.

Dzeri96 commented 3 years ago

@jeenbroekstra I'm sorry I haven't started sooner, but as usual some things came up. I'm planning to start working on this in two days, though I can't estimate how long the actual work will take. I will keep you posted as soon as I have something worthwhile.

Dzeri96 commented 3 years ago

Once again, sorry everyone for taking so long. I've finally started getting to know the codebase and of course I let the tests run with mvn test, before changing anything. Unfortunately I've hit a snag, as one test fails due to a stack overflow:

[ERROR] serializableParallelValidation(org.eclipse.rdf4j.sail.shacl.SerializableTest)  Time elapsed: 64.477 s  <<< ERROR!
java.lang.Exception: Unexpected exception, expected<org.eclipse.rdf4j.sail.shacl.ShaclSailValidationException> but was<java.lang.RuntimeException>
    at org.eclipse.rdf4j.sail.shacl.SerializableTest.serializableParallelValidation(SerializableTest.java:144)
Caused by: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
    at org.eclipse.rdf4j.sail.shacl.SerializableTest.serializableParallelValidation(SerializableTest.java:144)
Caused by: java.lang.StackOverflowError

For reference, I'm using OpenJDK 15.0.1. I also see a lot of missing Maven dependencies, though that doesn't seem to be the problem (yet).

abrokenjester commented 3 years ago

@Dzeri96 apologies for not getting back to you sooner, for some reason I wasn't notified of your comment (or perhaps I just overlooked it in the flood of notifications GitHub occasionally sends...).

I am not sure why you are getting that test failure, but unless you are doing something really exotic I have a hard time believing it's caused by anything you changed in the code. Feel free to just ignore the failing test for now at least - if you put up a (draft) Pull Request we can take a closer look (and also see what the CI has to say about test failures).

Dzeri96 commented 3 years ago

I'm sorry I didn't get back to you sooner. I don't wanna clutter the backlog but currently I can't work on this as I'm pretty busy with my master's thesis. If this issue is still alive when I have some free time again, I'll give it another go. It turns out the tests you set up were a bit too complicated for me 😅

abrokenjester commented 3 years ago

Not a problem @Dzeri96. If you need a second pair of eyes to figure it out don't hesitate to shout!

Dzeri96 commented 1 year ago

Sorry it took so long, but a promise is a promise. I graduated last week and I'm ready to finish this issue up. I updated main to the latest version and ran mvn test, but the failing tests still seem to be there. A snippet of the output looks like this:

[ERROR]   ShaclTest.testRevalidation:39->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testRevalidation$2:39->AbstractShaclTest.runTestCaseRevalidate:1003->AbstractShaclTest.testValidationReport:528 expected: <@prefix ex: <http://example.com/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rsx: <http://rdf4j.org/shacl-extensions#> .
@prefix rdf4j: <http://rdf4j.org/schema/rdf4j#> .
> but was: <@prefix ex: <http://example.com/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rsx: <http://rdf4j.org/shacl-extensions#> .
@prefix rdf4j: <http://rdf4j.org/schema/rdf4j#> .

[] a sh:ValidationReport;
  rdf4j:truncated false;
  sh:conforms false;
  sh:result [ a sh:ValidationResult;
      rsx:shapesGraph rdf4j:SHACLShapeGraph;
      sh:focusNode "1";
      sh:resultPath ex:info2;
      sh:resultSeverity sh:Violation;
      sh:sourceConstraintComponent sh:MinCountConstraintComponent;
      sh:sourceShape _:2086179301db4b8f8de6ffa0220975486049
    ], [ a sh:ValidationResult;
      rsx:shapesGraph rdf4j:SHACLShapeGraph;
      sh:focusNode ex:Person;
      sh:resultPath ex:info2;
      sh:resultSeverity sh:Violation;
      sh:sourceConstraintComponent sh:MinCountConstraintComponent;
      sh:sourceShape _:2086179301db4b8f8de6ffa0220975486049
    ], [ a sh:ValidationResult;
      rsx:shapesGraph rdf4j:SHACLShapeGraph;
      sh:focusNode "purple";
      sh:resultPath ex:info2;
      sh:resultSeverity sh:Violation;
      sh:sourceConstraintComponent sh:MinCountConstraintComponent;
      sh:sourceShape _:2086179301db4b8f8de6ffa0220975486049
    ] .

_:2086179301db4b8f8de6ffa0220975486049 a sh:PropertyShape;
  sh:minCount 1;
  sh:path ex:info2 .
>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[ERROR]   ShaclTest.testSingleTransaction:33->AbstractShaclTest.runWithAutomaticLogging:1152->lambda$testSingleTransaction$1:33->AbstractShaclTest.runTestCaseSingleTransaction:952 Expected validation to succeed ==> expected: <false> but was: <true>
[INFO] 
[ERROR] Tests run: 12315, Failures: 130, Errors: 0, Skipped: 4
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Eclipse RDF4J 4.2.1-SNAPSHOT:
[INFO] 
[INFO] Eclipse RDF4J ...................................... SUCCESS [  0.938 s]
[INFO] RDF4J: Assembly Descriptors ........................ SUCCESS [  0.529 s]
[INFO] RDF4J: Core ........................................ SUCCESS [  0.301 s]
[INFO] RDF4J: common ...................................... SUCCESS [  0.018 s]
[INFO] RDF4J: common annotation ........................... SUCCESS [  1.854 s]
[INFO] RDF4J: Model API ................................... SUCCESS [ 12.020 s]
[INFO] RDF4J: common exception ............................ SUCCESS [  0.658 s]
[INFO] RDF4J: common utilities ............................ SUCCESS [  0.617 s]
[INFO] RDF4J: common IO ................................... SUCCESS [  4.584 s]
[INFO] RDF4J: common iterators ............................ SUCCESS [  4.782 s]
[INFO] RDF4J: common text ................................. SUCCESS [  2.032 s]
[INFO] RDF4J: RDF Vocabularies ............................ SUCCESS [  1.815 s]
[INFO] RDF4J: Model ....................................... SUCCESS [ 12.603 s]
[INFO] RDF4J: common transaction .......................... SUCCESS [  2.157 s]
[INFO] RDF4J: common XML .................................. SUCCESS [  1.172 s]
[INFO] RDF4J: SparqlBuilder ............................... SUCCESS [  5.721 s]
[INFO] RDF4J: Rio ......................................... SUCCESS [  0.030 s]
[INFO] RDF4J: Rio - API ................................... SUCCESS [  6.360 s]
[INFO] RDF4J: Rio - Languages ............................. SUCCESS [  0.807 s]
[INFO] RDF4J: Rio - Datatypes ............................. SUCCESS [  2.576 s]
[INFO] RDF4J: Query ....................................... SUCCESS [  4.757 s]
[INFO] RDF4J: Rio - Binary ................................ SUCCESS [  3.524 s]
[INFO] RDF4J: Rio - N-Triples ............................. SUCCESS [  4.441 s]
[INFO] RDF4J: Rio - HDT ................................... SUCCESS [  3.068 s]
[INFO] RDF4J: Rio - JSON-LD ............................... SUCCESS [  6.834 s]
[INFO] RDF4J: Rio - Turtle ................................ SUCCESS [  7.666 s]
[INFO] RDF4J: Rio - N3 (writer-only) ...................... SUCCESS [  2.383 s]
[INFO] RDF4J: Rio - N-Quads ............................... SUCCESS [  4.606 s]
[INFO] RDF4J: Rio - RDF/JSON .............................. SUCCESS [  4.761 s]
[INFO] RDF4J: Rio - RDF/XML ............................... SUCCESS [  7.578 s]
[INFO] RDF4J: Rio - TriX .................................. SUCCESS [  3.022 s]
[INFO] RDF4J: Rio - TriG .................................. SUCCESS [  6.838 s]
[INFO] RDF4J: Query algebra ............................... SUCCESS [  0.016 s]
[INFO] RDF4J: Query algebra - model ....................... SUCCESS [  5.354 s]
[INFO] RDF4J: Sail ........................................ SUCCESS [  0.087 s]
[INFO] RDF4J: Sail API .................................... SUCCESS [  9.833 s]
[INFO] RDF4J: Query result IO ............................. SUCCESS [  0.027 s]
[INFO] RDF4J: Query result IO - API ....................... SUCCESS [  1.310 s]
[INFO] RDF4J: Test Suites ................................. SUCCESS [  0.057 s]
[INFO] RDF4J: QueryResultIO testsuite ..................... SUCCESS [  1.422 s]
[INFO] RDF4J: Query result IO - binary .................... SUCCESS [  2.907 s]
[INFO] RDF4J: Query result IO - SPARQL/JSON ............... SUCCESS [  4.419 s]
[INFO] RDF4J: Query result IO - SPARQL/XML ................ SUCCESS [  4.412 s]
[INFO] RDF4J: Query result IO - plain text booleans ....... SUCCESS [  4.523 s]
[INFO] RDF4J: Repository .................................. SUCCESS [  0.015 s]
[INFO] RDF4J: Repository - API ............................ SUCCESS [  5.303 s]
[INFO] RDF4J: HTTP ........................................ SUCCESS [  0.021 s]
[INFO] RDF4J: HTTP protocol ............................... SUCCESS [  3.182 s]
[INFO] RDF4J: HTTP client ................................. SUCCESS [ 15.546 s]
[INFO] RDF4J: Query parser ................................ SUCCESS [  0.022 s]
[INFO] RDF4J: Query parser - API .......................... SUCCESS [  2.437 s]
[INFO] RDF4J: Query parser - SPARQL ....................... SUCCESS [  9.040 s]
[INFO] RDF4J: SPARQL Repository ........................... SUCCESS [  4.814 s]
[INFO] RDF4J: Query algebra - evaluation .................. SUCCESS [ 13.734 s]
[INFO] RDF4J: Repository API testsuite .................... SUCCESS [  3.219 s]
[INFO] RDF4J: SailRepository .............................. SUCCESS [  4.309 s]
[INFO] RDF4J: Repository - event (wrapper) ................ SUCCESS [  3.016 s]
[INFO] RDF4J: HTTPRepository .............................. SUCCESS [  1.454 s]
[INFO] RDF4J: Repository manager .......................... SUCCESS [  4.850 s]
[INFO] RDF4J: Sail base implementations ................... SUCCESS [  4.198 s]
[INFO] RDF4J: Sail API testsuite .......................... SUCCESS [  1.901 s]
[INFO] RDF4J: MemoryStore ................................. SUCCESS [ 21.717 s]
[INFO] RDF4J: Query algebra - GeoSPARQL ................... SUCCESS [  5.162 s]
[INFO] RDF4J: Query Rendering ............................. SUCCESS [  4.819 s]
[INFO] RDF4J: DatasetRepository (wrapper) ................. SUCCESS [  1.040 s]
[INFO] RDF4J: Repository - context aware (wrapper) ........ SUCCESS [  3.067 s]
[INFO] RDF4J: Model API testsuite ......................... SUCCESS [  1.364 s]
[INFO] RDF4J: Sail Model .................................. SUCCESS [  2.833 s]
[INFO] RDF4J: NativeStore ................................. SUCCESS [ 46.704 s]
[INFO] RDF4J: Inferencer Sails ............................ SUCCESS [ 30.390 s]
[INFO] RDF4J: SHACL ....................................... FAILURE [03:01 min]
[INFO] RDF4J: LmdbStore ................................... SKIPPED
[INFO] RDF4J: Lucene Sail API ............................. SKIPPED
[INFO] RDF4J: Lucene Sail Index ........................... SKIPPED
[INFO] RDF4J: Solr Sail Index ............................. SKIPPED
[INFO] RDF4J: Elastic Search Sail Index ................... SKIPPED
[INFO] RDF4J: Extensible Store ............................ SKIPPED
[INFO] RDF4J: Elasticsearch Store ......................... SKIPPED
[INFO] RDF4J: SPIN ........................................ SKIPPED
[INFO] RDF4J: Client Libraries ............................ SKIPPED
[INFO] RDF4J: Storage Libraries ........................... SKIPPED
[INFO] RDF4J: Tools ....................................... SKIPPED
[INFO] RDF4J: application configuration ................... SKIPPED
[INFO] RDF4J: HTTP server - core .......................... SKIPPED
[INFO] RDF4J: SPARQL compliance test suite ................ SKIPPED
[INFO] RDF4J: Federation .................................. SKIPPED
[INFO] RDF4J: Console ..................................... SKIPPED
[INFO] RDF4J: HTTP server ................................. SKIPPED
[INFO] RDF4J: Workbench ................................... SKIPPED
[INFO] RDF4J: Runtime ..................................... SKIPPED
[INFO] RDF4J: Runtime - OSGi .............................. SKIPPED
[INFO] RDF4J: Spring components ........................... SKIPPED
[INFO] RDF4J: Spring boot component for a HTTP sparql server SKIPPED
[INFO] RDF4J: Spring ...................................... SKIPPED
[INFO] RDF4J: Spring Demo ................................. SKIPPED
[INFO] RDF4J: Rio compliance test suite ................... SKIPPED
[INFO] RDF4J: SHACL compliance test suite ................. SKIPPED
[INFO] RDF4J: Lucene Sail Tests ........................... SKIPPED
[INFO] RDF4J: GeoSPARQL compliance test suite ............. SKIPPED
[INFO] RDF4J: benchmarks .................................. SKIPPED
[INFO] RDF4J: Compliance tests ............................ SKIPPED
[INFO] RDF4J: Repository compliance tests ................. SKIPPED
[INFO] RDF4J: Rio compliance tests ........................ SKIPPED
[INFO] RDF4J: Model compliance tests ...................... SKIPPED
[INFO] RDF4J: SPARQL query parser compliance tests ........ SKIPPED
[INFO] RDF4J: SHACL compliance tests ...................... SKIPPED
[INFO] RDF4J: Lucene Sail Tests ........................... SKIPPED
[INFO] RDF4J: Solr Sail Tests ............................. SKIPPED
[INFO] RDF4J: Elasticsearch Sail Tests .................... SKIPPED
[INFO] RDF4J: GeoSPARQL compliance tests .................. SKIPPED
[INFO] RDF4J: Code examples ............................... SKIPPED
[INFO] RDF4J: BOM ......................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:52 min
[INFO] Finished at: 2022-10-20T22:23:58+02:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M6:test (default-test) on project rdf4j-shacl: There are test failures.

I'm a bit overwhelmed with the repository structure, so I'd appreciate guidance in this area, @jeenbroekstra.

hmottestad commented 1 year ago

Are you getting this build error on your machine without making any changes to the code?

Dzeri96 commented 1 year ago

Yes, this is on the latest state of this repository with no changes.

Dzeri96 commented 1 year ago

Update: It seems running the root target with mvn test is not stable, as different modules will fail in different runs, without me changing anything. For example, now a test in RDF4J: Federation fails with the message:

[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.201 s <<< FAILURE! - in org.eclipse.rdf4j.federated.MediumConcurrencyTest
[ERROR] org.eclipse.rdf4j.federated.MediumConcurrencyTest.queryMix  Time elapsed: 0.178 s  <<< ERROR!
...
[ERROR]   MediumConcurrencyTest.lambda$queryMix$0:70 » Execution org.eclipse.rdf4j.sail.SailException: Connection closed before all iterations were closed.

As expected, these unpredictable errors are tied to concurrency tests.

abrokenjester commented 1 year ago

Thanks again @Dzeri96 , nice work! PR merged and the fix will be available in the next RDF4J release.

Dzeri96 commented 1 year ago

Thank you for waiting while I finished my thesis. I'm not using RDF4J currently, but when I do again, I'll check out which bugs need fixing. The first PR is always the hardest 😅.