TopQuadrant / shacl

SHACL API in Java based on Apache Jena
Apache License 2.0
217 stars 61 forks source link

Thread safety test. copied validateShapes() to validateShapesThreaded… #138

Closed beaudet closed 2 years ago

beaudet commented 2 years ago

… and threaded the contents inside the shapes loop.

HolgerKnublauch commented 2 years ago

Ok I can see the validation report seems to skip certain triples. and it ends up with dangling blank nodes that are detached from the report instance. Probably a multi-threading issue with writing to the Graph. I am not sure whether Jena supports concurrent writes, but I guess the default memory graph does not.

afs commented 2 years ago

I am not sure whether Jena supports concurrent writes, but I guess the default memory graph does not.

Where is this happening?

Jena supports concurrent writes with locks, critical sections (very old) and transactions on datasets.

If you add multiple triples, without a lock, chaos will result even if the data structures are not corrupted.

HolgerKnublauch commented 2 years ago

Right, there is no locking code or transactions in place, because the code isn't expecting concurrent execution ATM.

beaudet commented 2 years ago

Are transactions needed even for a 100% read only scenario?

beaudet commented 2 years ago

Ah, when the validation report is created new nodes are added to the validation report graph so that's probably where transactions are needed?

afs commented 2 years ago

And any internal caching.

And passing values into the closure for executor submit need checking (less likely a problem but possible under load if there isn't a write-barrier).

Roughly - expect the worse. It will happen under load on a virtual machine where thread pauses of many seconds are quite possible.

HolgerKnublauch commented 2 years ago

The proper long-term refactoring here would probably be to not create the report triples directly but instead create an intermediate Java data structure (that is thread-safe). But that's a larger change. Meanwhile I think this PR can be closed?

beaudet commented 2 years ago

+1 on the suggested approach and to closing the PR.