w3c / rch-wg-charter

Charter proposal for an “RDF Dataset Canonicalization and Hash Working Group”
https://w3c.github.io/rch-wg-charter/

Multi-serialization use case #84

Closed philarcher closed 3 years ago

philarcher commented 3 years ago

This use case came out of an email exchange with @msporny that was prompted by Danbri and subsequently shared with him. Danbri encouraged me to turn it into a use case, which is what you see here. It's all about answering the "why can't you just sign the file you send?" question.

iherman commented 3 years ago

There is a lot of overlap between this new section and the separate note that was added on the same subject. Does this new entry make that note obsolete?

msporny commented 3 years ago

> There are lots of overlaps between this new section and the separate note that has been added considering the same subject. Does this new entry make that note obsolete?

The use case bears pointing out, and thank you to @philarcher for doing so. This is an important nuance that "my serialization is the right one" maximalists tend to keep missing. When every system you've ever built is built around one serialization, you don't tend to see the issue. It's only when you have to cross serializations (like from JSON -> CBOR), and have to do so across multiple use cases, that it becomes obvious that preserving the original serialization just for the purposes of digital signature verification is a non-starter.

There are two primary approaches to solve this problem:

  1. Create a bespoke solution that provides a round-trippable mapping from serializationA to serializationB for useCaseZ, or
  2. Create a general solution that provides a round-trippable mapping from serializationA to serializationB for all use cases.

The work of this WG is going to enable solution 2 above, while "my serialization is the right one" maximalists tend to reach for solution 1 for their use case.
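To make the "why can't you just sign the file you send?" point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two payloads, the toy line-based format, and the `canonical_bytes` helper are hypothetical stand-ins, and sorting triples is not the canonicalization algorithm this WG would specify. It only shows that hashing the bytes of a file ties the digest to one serialization, while hashing a canonical form of the parsed data does not.

```python
import hashlib
import json

# The same two statements written in two different serializations.
# Both payloads and both tiny parsers are hypothetical illustrations.
AS_JSON = '[["ex:alice", "ex:knows", "ex:bob"], ["ex:bob", "ex:name", "Bob"]]'
AS_LINES = "ex:bob ex:name Bob .\nex:alice ex:knows ex:bob .\n"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def parse_json(text: str) -> set:
    """Parse the JSON form into a set of (subject, predicate, object) tuples."""
    return {tuple(triple) for triple in json.loads(text)}


def parse_lines(text: str) -> set:
    """Parse the toy line form; assumes terms contain no whitespace."""
    triples = set()
    for line in text.splitlines():
        if not line.strip():
            continue
        subject, predicate, obj, _dot = line.split()
        triples.add((subject, predicate, obj))
    return triples


def canonical_bytes(triples: set) -> bytes:
    """Toy canonical form: one statement per line, lines sorted."""
    lines = sorted(" ".join(triple) + " ." for triple in triples)
    return ("\n".join(lines) + "\n").encode("utf-8")


# Hashing the files as sent: the digests differ although the data is the same.
file_hashes_match = (
    sha256_hex(AS_JSON.encode("utf-8")) == sha256_hex(AS_LINES.encode("utf-8"))
)

# Hashing a canonical form of the parsed data: the digests agree.
data_hashes_match = (
    sha256_hex(canonical_bytes(parse_json(AS_JSON)))
    == sha256_hex(canonical_bytes(parse_lines(AS_LINES)))
)

print(file_hashes_match, data_hashes_match)  # prints: False True
```

In terms of the two approaches above, the parsers play the role of the per-serialization mappings, while the canonical form is the shared piece that does not need to be reinvented for every use case.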

philarcher commented 3 years ago

You are correct, of course, @iherman - it's the same point. However, it's sometimes necessary to say the same thing more than once in different places, especially when trying to satisfy real concerns from different people. Having it in a list of use cases helps to emphasise the point - but I grant you, it doesn't actually add anything that wasn't already in the doc.

iherman commented 3 years ago

This applies less in the VC domain, but in the "traditional" Semantic Web domain the role of triple stores makes the point even more obvious. Most triple stores keep the RDF data in some internal or binary format, in which case a serialization is "just" a way to get data in or out. And because they usually have an internal (and possibly random) way of generating BNode (blank node) labels, canonicalization is the only way of handling data extracted from such databases.

(Not sure this is worth adding to the text. Maybe.)
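A rough Python sketch of the blank node problem, under loudly stated assumptions: the two exports are made-up data, and `canonicalize` is a heavily simplified relabeling loosely modeled on the idea of hashing each blank node by the statements it appears in. It gives up when two blank nodes get the same hash, which a real canonicalization algorithm has to resolve, so treat it as an illustration only.

```python
import hashlib

# Two hypothetical exports of the same graph from stores that generate
# their own blank node labels and emit statements in different orders.
EXPORT_A = [
    ("_:b0", "ex:name", '"Alice"'),
    ("_:b0", "ex:knows", "_:b1"),
    ("_:b1", "ex:name", '"Bob"'),
]
EXPORT_B = [
    ("_:x9", "ex:name", '"Bob"'),
    ("_:x7", "ex:knows", "_:x9"),
    ("_:x7", "ex:name", '"Alice"'),
]


def is_bnode(term: str) -> bool:
    return term.startswith("_:")


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def naive_serialize(triples) -> str:
    """Sort the statements but keep whatever labels the store produced."""
    return "\n".join(sorted(" ".join(t) + " ." for t in triples)) + "\n"


def node_hash(node: str, triples) -> str:
    """Hash the statements mentioning `node`, with labels masked out."""
    lines = []
    for triple in triples:
        if node in (triple[0], triple[2]):
            masked = [
                "_:a" if term == node else "_:z" if is_bnode(term) else term
                for term in triple
            ]
            lines.append(" ".join(masked) + " .")
    return digest("\n".join(sorted(lines)))


def canonicalize(triples) -> str:
    """Relabel blank nodes in hash order (simplified: no tie handling)."""
    bnodes = {term for t in triples for term in t if is_bnode(term)}
    hashes = {node: node_hash(node, triples) for node in bnodes}
    if len(set(hashes.values())) != len(bnodes):
        raise ValueError("blank nodes with equal hashes are not handled here")
    ordered = sorted(bnodes, key=hashes.get)
    labels = {node: f"_:c14n{i}" for i, node in enumerate(ordered)}
    relabeled = [tuple(labels.get(term, term) for term in t) for t in triples]
    return naive_serialize(relabeled)


naive_match = digest(naive_serialize(EXPORT_A)) == digest(naive_serialize(EXPORT_B))
canonical_match = digest(canonicalize(EXPORT_A)) == digest(canonicalize(EXPORT_B))

print(naive_match, canonical_match)  # prints: False True
```

Sorting alone is not enough here because the store-assigned labels end up in the hashed bytes; only after relabeling do the two exports produce the same digest.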

pchampin commented 3 years ago

@philarcher I propose to close this, as it was superseded by #85.