kid0003 Serialization Algorithms

SmithSamuelM commented 4 years ago

Please comment. These are the proposed implementation details for serializing both events and data extracted from events. This supersedes the Ordered Mapping issue.

See https://github.com/decentralized-identity/keri/blob/master/kids/kid0003.md

SmithSamuelM commented 4 years ago

With this nailed down there are no blockers to implementing the KERI events and validating events.

chunningham commented 4 years ago

I like this proposal; it provides a simple normalisation algorithm for event signatures. The extracted data set serialisation remains easy to implement for signing/verifying without concerns about wire transport/message encoding, while the events can be serialised to one's desired encoding without concerns for signature operations and still be verifiable.
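As a rough illustration of that separation, here is a minimal Python sketch. It assumes, hypothetically, that the extracted data set is simply the event's field values concatenated in a fixed label order; `EXTRACT_ORDER`, `flatten`, and `extract_bytes` are illustrative names, not the definitions in kid0003, and the field values are placeholders.

```python
import json

# Hypothetical fixed label order for the extracted data set (an assumption,
# not the order kid0003 specifies).
EXTRACT_ORDER = ("id", "sn", "ilk", "sith", "keys", "next", "toad", "wits")


def flatten(value):
    """Yield the string leaves of a value, walking lists in order."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield str(value)


def extract_bytes(event):
    """Concatenate field values in EXTRACT_ORDER, ignoring wire layout."""
    parts = []
    for label in EXTRACT_ORDER:
        parts.extend(flatten(event[label]))
    return "".join(parts).encode("utf-8")


event = {
    "id": "AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
    "sn": "0",
    "ilk": "icp",
    "sith": "1",
    "keys": ["AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],
    "next": "DZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",
    "toad": "1",
    "wits": [],
}

# Two different wire encodings of the same event: compact, and
# pretty-printed with the keys in reverse order.
compact = json.dumps(event, separators=(",", ":"))
reordered = json.dumps(dict(reversed(list(event.items()))), indent=2)

# The wire bytes differ, but the bytes that would be signed do not.
same = extract_bytes(json.loads(compact)) == extract_bytes(json.loads(reordered))
print(compact != reordered, same)
```

The point of the sketch is only that the signed bytes depend on the extraction rule, not on how the event travelled over the wire.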

OR13 commented 4 years ago

{
  "vs"   : "KERI10JSON00011c_",
  "id"   : "AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",  // qualified Base64
  "sn"   : "0",  // lowercase hex string no leading zeros
  "ilk"  : "icp",
  "sith" : "1",  // lowercase hex string no leading zeros or list
  "keys" : ["AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],  // list of qual Base64
  "next" : "DZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",  // qualified Base64
  "toad" : "1",  // lowercase hex string no leading zeros
  "wits" : [],  // list of qualified Base64
  "data" : [],  // list of config ordered mappings
  "sigs" : []  // optional list of or single lowercase hex string(s) no leading zeros
}

I'd prefer not to see hex encoded values next to base64 ones...

Prefer distinct types (optional members lead to different types).

Prefer not to use ordered arrays (why not use objects?).

Structurally, I think we should probably design the event schema in something like JSON Schema... because otherwise it's hard to manage the concept of "JSON" types... in many languages... For pretty much every string value... its type is probably not really "string"... its type is more likely some restricted subset of string (regex / restricted character set and length).
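To make the "restricted subset of string" idea concrete, here is a minimal Python sketch of two such restricted types, based on the comments in the sample event above. The names `HEX_RE`, `QB64_RE`, `is_hex_number`, and `is_qualified_b64` are hypothetical, and the fixed length of 44 characters is an assumption read off the sample values, not a rule stated in this thread.

```python
import re

# Lowercase hex string with no leading zeros; "0" on its own is allowed.
HEX_RE = re.compile(r"0|[1-9a-f][0-9a-f]*")

# URL-safe Base64 alphabet. The length of 44 is an assumption taken from
# the sample "id" and "next" values.
QB64_RE = re.compile(r"[A-Za-z0-9_-]{44}")


def is_hex_number(text):
    """True if text is lowercase hex with no leading zeros."""
    return HEX_RE.fullmatch(text) is not None


def is_qualified_b64(text):
    """True if text is 44 characters of URL-safe Base64."""
    return QB64_RE.fullmatch(text) is not None


print(is_hex_number("0"), is_hex_number("11c"), is_hex_number("01"))
print(is_qualified_b64("AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"))
```

A JSON Schema `pattern` keyword could carry the same regular expressions; the Python form is used here only to keep the example runnable.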

SmithSamuelM commented 4 years ago

KERI events are meant to be compact and are rigid by definition. This is not meant to be easily extensible, so the schema is rigid per version. This was an intentional protocol design tradeoff. It's not JSON first. It supports JSON, Msgpack, and CBOR.

SmithSamuelM commented 4 years ago

Serialization requires ordering. So objects must be ordered, and if there is no semantic content to the object other than ordering, then it's not merely less compact but less semantically pure.
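A minimal Python sketch of why ordering matters: two mappings that compare equal as objects serialize to different bytes when their keys are inserted in a different order, so any digest or signature over those bytes differs too. The `serialize` helper and the use of SHA-256 are illustrative assumptions, not the algorithm in kid0003.

```python
import hashlib
import json


def serialize(mapping):
    """Compact JSON that preserves the mapping's insertion order."""
    return json.dumps(mapping, separators=(",", ":")).encode("utf-8")


a = {"id": "AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM", "sn": "0", "ilk": "icp"}
b = {"ilk": "icp", "sn": "0", "id": "AaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"}

# Equal as unordered mappings...
equal_as_objects = a == b

# ...but not equal as byte strings, so the digests differ.
digest_a = hashlib.sha256(serialize(a)).hexdigest()
digest_b = hashlib.sha256(serialize(b)).hexdigest()

print(equal_as_objects, digest_a == digest_b)
```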

Cryptographic material is large and unordered, so Base64 is best.

For small numbers, Base64 is not small, because it is a minimum of 4 bytes with padding. It's also hard to read. Small numbers are easily recognized in hex, which is more compact and more universal than integers, given that the representation of arbitrary-precision integers is not universal.
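The size tradeoff can be checked with a short Python sketch. The helpers `as_hex` and `as_base64` are hypothetical names, and encoding the integer as big-endian bytes before Base64 is an assumption made only for the comparison.

```python
import base64


def as_hex(number):
    """Lowercase hex with no leading zeros."""
    return format(number, "x")


def as_base64(number):
    """URL-safe Base64 of the number's minimal big-endian bytes."""
    length = max(1, (number.bit_length() + 7) // 8)
    raw = number.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).decode("ascii")


# Small numbers: hex is shorter and readable at a glance.
for n in (0, 1, 284):
    print(n, as_hex(n), as_base64(n))

# A 32-byte value, such as a key or digest: Base64 is shorter than hex.
key = bytes(32)
print(len(key.hex()), len(base64.urlsafe_b64encode(key)))
```

For the one-byte and two-byte cases the padded Base64 form is always 4 characters, while the hex form is 1 to 3 characters; for 32 bytes the comparison reverses, at 44 characters against 64.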

These are all careful design tradeoffs.

I get that they are not your design aesthetic. But given the amount of work already in place, it's not very helpful to merely argue for a different design aesthetic.

OR13 commented 4 years ago

Sorry if the feedback came across harsh :)

I sat down to try and implement KERI yesterday, and found myself reading the original paper to find definitions for things.

I've struggled with a lot of these same design issues in sidetree, and so I would like to back up and state the opportunity as I see it, and provide more context for my suggestions.

There are a couple of layers that KERI separates well, and that I am interested in:

Abstract Data Model for Event Sourced DIDs
Concrete Serialization of Events (including event types)
Abstract Ledger Interface (KEL)
Concrete implementation of ordered serialized events on a ledger.

In sidetree, we have 1 spec that covers all these things... and we also had to make aesthetic tradeoffs I'm not happy with :)

What I would like to do, is understand how to get "KERI-ness" without the particular aesthetics you have chosen.

As a thought experiment, what would KERI look like if all we were allowed to use was JOSE / JCS / GIT?... What would it look like if all we were allowed to use was IPLD and JOSE?

It should be possible to get the same properties, regardless of the aesthetic... In sidetree, we have this same tension, and we usually address it by distinguishing between "DID Method Specific" and "Sidetree Protocol"... but I'm starting to think we made a mistake with those buckets, and I worry we are making the same one here...

Instead of having KERI be a term that applies to so many things, including a paper and a concrete serialization, what if we split it up into layers like:

1. KERI Abstract Data Model
2. KERI Abstract Event System
3. KERI Abstract Ledger System
4. KERI Abstract Information Theoretic Properties Spec (why KERI is great, regardless of how you implement it).
5. KERI CORE data model v1
6. KERI CORE event system v1
7. KERI CORE ledger system v1

(5, 6, 7) would be 0 changes to the paper as written today, and the data model as documented in KIDs.

(1,2,3,4) would show why KERI is valuable, and how it can be implemented for various choices of key representation, hashing algorithm, canonicalization, serialization and ledger interfaces....

What I have seen in Sidetree is that we lost control of (1,2,3,4) when we started focusing on (5,6,7)... As a contributor to KERI, I am way more interested in making sure (1,2,3,4) are clearly documented / visualized, and as you noted, I have my own design aesthetic, which I am eager to apply to KERI.... I want to make sure I can do that while citing the authoritative sources for the abstract bits...

In other words, if we can enable direct contribution to (1,2,3,4) without needing to talk about (5,6,7)... I can see a way to align things like did:peer and sidetree more easily... if we can't... then... that's going to be much harder to do.... because there is no separation between the abstract value that KERI provides, and the specific implementation choices we are making in DIF... I have seen this cause problems :)

SmithSamuelM commented 4 years ago

@OR13 I agree we need to document by splitting up into an organization as you have suggested. As we discussed in the meeting today, the priority for now is getting the reference implementation done.

SmithSamuelM commented 4 years ago

I think this is better applied to an issue on documentation, not specifically the serialization algorithm.

decentralized-identity / keri

kid0003 Serialization Algorithms #14