apache / jena

Apache Jena
https://jena.apache.org/
Apache License 2.0

Fuseki to send `ETag` response headers #2305

Open namedgraph opened 6 months ago

namedgraph commented 6 months ago

Version

4.7.0

Feature

Graphs without bnodes can be safely hashed, and the hash can be used as a strong ETag value. Graphs with bnodes can also be hashed; could that hash be used as a weak ETag value?

Are you interested in contributing a solution yourself?

Perhaps?

namedgraph commented 6 months ago

Example of model hashing: https://github.com/AtomGraph/Core/blob/master/src/main/java/com/atomgraph/core/util/ModelUtils.java
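For illustration only, here is a minimal sketch of order-independent content hashing. It is not taken from the linked class and does not use the Jena API. It assumes the graph has no blank nodes and that its triples are already available as N-Triples lines in a consistent form. The class and method names (`GraphHash`, `strongETag`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.TreeSet;

/** Hypothetical helper, not part of Jena or Fuseki. */
final class GraphHash {
    /**
     * Hashes a bnode-free graph given as N-Triples lines (an assumption of this sketch).
     * Sorting and de-duplicating the lines makes the result independent of triple order.
     */
    static String strongETag(Collection<String> ntriplesLines) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String line : new TreeSet<>(ntriplesLines)) {
                md.update(line.getBytes(StandardCharsets.UTF_8));
                md.update((byte) '\n'); // separator so adjacent lines cannot run together
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return "\"" + hex + "\""; // an ETag is a quoted opaque string
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

Every triple has to be read and sorted on each computation, so the cost grows with the size of the graph.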

afs commented 6 months ago

Hashing as a general solution doesn't extend to large graphs.

Transactions are serializable.

One way to generate a unique fingerprint would be to have a UUID per database version.

Another way, with more information, is to have a fixed UUID per dataset, then create the ETag from that id+version and increment the version on each write-commit.

Blank nodes don't matter. These tags are independent of content but unique to the version per server.

(This does not work for replication with RDF Delta.)
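The id+version scheme described above could be sketched roughly as follows. This is an illustration under assumptions, not Fuseki code: the class `DatasetVersionTag` and its methods are hypothetical, and how a real dataset would persist the UUID or hook into write commits is left open.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not part of Jena or Fuseki. */
final class DatasetVersionTag {
    private final UUID datasetId;                        // fixed for the life of the dataset
    private final AtomicLong version = new AtomicLong(); // starts at 0

    DatasetVersionTag(UUID datasetId) {
        this.datasetId = datasetId;
    }

    /** Intended to be called once after each successful write commit (assumed hook). */
    long onWriteCommit() {
        return version.incrementAndGet();
    }

    /** ETag built from id+version; it does not depend on the graph content. */
    String currentETag() {
        return "\"" + datasetId + "-" + version.get() + "\"";
    }
}
```

Because the tag never inspects the data, its cost is constant regardless of graph size, and blank nodes play no role.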

rvesse commented 6 months ago

Presumably this might require some new interface/API for Dataset implementations to allow them to expose this version information?

With the variety of implementations that users use in the wild, e.g. text indexing and general-purpose datasets, this would also need to carry through wrappers in some form.

namedgraph commented 6 months ago

I was thinking ETag per graph as well. In other words, conditional requests supported by the Graph Store Protocol.
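As a rough sketch of what a conditional GET check could look like on the server side: the helper below compares an `If-None-Match` header value against the current ETag using weak comparison. The class and method names are hypothetical, it assumes the requested graph exists, and it splits the header on commas, which is a simplification of the full header grammar.

```java
/** Hypothetical helper, not part of Jena or Fuseki. */
final class ConditionalGet {
    /**
     * Returns true when the If-None-Match header matches the current ETag,
     * i.e. when a GET could be answered with 304 Not Modified.
     */
    static boolean notModified(String ifNoneMatch, String currentETag) {
        if (ifNoneMatch == null) {
            return false;                 // no condition was sent
        }
        if (ifNoneMatch.trim().equals("*")) {
            return true;                  // "*" matches any existing representation
        }
        String current = stripWeak(currentETag);
        for (String candidate : ifNoneMatch.split(",")) {
            if (stripWeak(candidate.trim()).equals(current)) {
                return true;
            }
        }
        return false;
    }

    /** Weak comparison ignores the "W/" prefix and compares the quoted part only. */
    private static String stripWeak(String etag) {
        return etag.startsWith("W/") ? etag.substring(2) : etag;
    }
}
```

Write requests with `If-Match` would need strong comparison instead, where a weak tag never matches.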

afs commented 6 months ago

> Presumably this might require some new interface/API for Dataset implementations to allow them to expose this version information?

ETags would be applied by Fuseki, possibly via the data service at the endpoint.

A UUID per (storage) dataset would be useful generally, as it would be globally and temporally unique.