w3c / rch-wg-charter

Charter proposal for an “RDF Dataset Canonicalization and Hash Working Group”
https://w3c.github.io/rch-wg-charter/

Skolemisation as a use-case? #27

Closed (aidhog closed this issue 3 years ago)

aidhog commented 3 years ago

The RDF 1.1 spec describes the option of replacing blank nodes with (Skolem) IRIs. It seems like it could be a pretty direct use-case of canonicalisation: to compute deterministic Skolem IRIs to replace blank nodes with. Given the same RDF graph/dataset (or an isomorphic copy), different "Skolemisers" based on a canonical form could produce the same Skolem IRIs without coordination. This would be a practical way to remove blank nodes entirely from an eco-system, mint dereferenceable IRIs automatically, etc. (The fact that Skolem IRIs were included in the RDF 1.1 spec would also seem to suggest that they were deemed important.)
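To make the idea concrete, here is a minimal Python sketch of two independent "Skolemisers" arriving at the same IRIs. Everything in it is an assumption for illustration: `canonical_form` is a brute-force toy that tries every relabelling (exponential, fine only for tiny graphs) and stands in for whatever algorithm the Working Group would standardise; the `https://example.org/.well-known/genid/` prefix and the function names are hypothetical. A short hash of the canonical graph is put into each IRI so that Skolem IRIs minted for different graphs do not collide.

```python
import hashlib
from itertools import permutations


def is_bnode(term):
    """Terms are plain strings; blank nodes start with '_:'."""
    return term.startswith("_:")


def canonical_form(triples):
    """Toy canonical labelling (NOT a standard algorithm): try every
    assignment of _:c0, _:c1, ... to the blank nodes and keep the one
    whose sorted triple list is lexicographically smallest."""
    bnodes = sorted({t for triple in triples for t in triple if is_bnode(t)})
    labels = [f"_:c{i}" for i in range(len(bnodes))]
    best = None
    for perm in permutations(labels):
        mapping = dict(zip(bnodes, perm))
        candidate = sorted(
            {tuple(mapping.get(t, t) for t in triple) for triple in triples}
        )
        if best is None or candidate < best:
            best = candidate
    return best


def skolemise(triples, base="https://example.org/.well-known/genid/"):
    """Replace each blank node by an IRI built from a hash of the
    canonical graph plus the node's canonical label (hypothetical scheme)."""
    canon = canonical_form(triples)
    digest = hashlib.sha256(repr(canon).encode("utf-8")).hexdigest()[:16]

    def skolem(term):
        return f"<{base}{digest}-{term[2:]}>" if is_bnode(term) else term

    return [tuple(skolem(t) for t in triple) for triple in canon]


# Two isomorphic copies of the same graph with different blank node labels.
copy_1 = [("_:a", "<http://example.org/p>", "_:b"),
          ("_:b", "<http://example.org/q>", '"x"')]
copy_2 = [("_:n7", "<http://example.org/q>", '"x"'),
          ("_:n3", "<http://example.org/p>", "_:n7")]

# Both copies yield identical Skolemised triples, without coordination.
print(skolemise(copy_1) == skolemise(copy_2))
```

The determinism comes entirely from the canonical labelling step; the IRI pattern on top of it is a free design choice.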

iherman commented 3 years ago

> This would be a practical way to remove blank nodes entirely from an eco-system, mint dereferenceable IRIs automatically, etc.

Yes, using canonicalization does lead to deterministic Skolem IRIs, and that is a good thing, but I am a bit skeptical about this statement. Any change on the original graph may lead to dramatically different Skolem IRIs, which means that it would not necessarily help in many of the issues related to blank nodes (e.g., to see if graph A is a subset of graph B). It may also create problems if graph A changes, for example.
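The subset concern can be illustrated with a small Python sketch. It assumes a toy scheme, not any standardised algorithm: blank nodes get canonical labels by brute force (the relabelling with the smallest sorted triple list wins), and each label is appended to a hypothetical `https://example.org/.well-known/genid/` prefix. Graph B is graph A plus one triple, yet the extra triple shifts the canonical labels, so the Skolemised A is no longer contained in the Skolemised B.

```python
from itertools import permutations

BASE = "https://example.org/.well-known/genid/"  # hypothetical Skolem prefix


def skolemise(triples):
    """Toy scheme: brute-force canonical labels, then label -> Skolem IRI.
    Exponential in the number of blank nodes; for illustration only."""
    bnodes = sorted({t for tr in triples for t in tr if t.startswith("_:")})
    best = None
    for perm in permutations(range(len(bnodes))):
        mapping = {b: f"<{BASE}c{i}>" for b, i in zip(bnodes, perm)}
        candidate = sorted({tuple(mapping.get(t, t) for t in tr) for tr in triples})
        if best is None or candidate < best:
            best = candidate
    return set(best)


P = "<http://example.org/value>"
graph_a = [("_:x", P, '"1"'), ("_:y", P, '"2"')]
graph_b = graph_a + [("_:z", P, '"0"')]  # graph A plus one extra triple

skolem_a = skolemise(graph_a)
skolem_b = skolemise(graph_b)

# A is a subset of B before Skolemisation, but not after:
# the node with value "1" is c0 in A and c1 in B.
print(set(graph_a) <= set(graph_b))  # True
print(skolem_a <= skolem_b)          # False
```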

> (The fact that Skolem IRIs were included in the RDF 1.1 spec would also seem to suggest that they were deemed important.)

I must admit I do not remember all the details of the discussion back then, but I vaguely remember long discussions about whether Skolem IDs would indeed eliminate the need for blank nodes. The fact that both Skolem IDs and blank nodes are part of the standard seems to indicate that blank nodes are still here to stay...

However, for this charter I think it would indeed be great to have a use case that is based on canonicalization paired with Skolem IDs. Do you think you can come up with a use case in a PR?

aidhog commented 3 years ago

> Any change on the original graph may lead to dramatically different Skolem IRIs, which means that it would not necessarily help in many of the issues related to blank nodes (e.g., to see if graph A is a subset of graph B). It may also create problems if graph A changes, for example.

This is true (but I guess it would also be true of ad hoc methods for Skolemising blank nodes).

> Do you think you can come up with a use case in a PR?

Sure thing! I can try to do this later today.

aidhog commented 3 years ago

Added proposed text as PR #28.