DID Document, Representation, and Representation-Entries data model?

msporny commented 3 years ago

At present, the DID Core specification only has one data model -- the one that's used to express DID Documents and their sub data structures (verification methods, services, identifiers, etc.)

This has always given a subset of the group heartburn -- @peacekeeper has wanted to separate representation-specific entries from the DID Document entries. There is a valid argument to do so, in order to not mix one set of map entries with another set of map entries.

There is also an open question related to: "What data model holds the data structures that you have after you process the input bytes during consumption?" For example, our current CBOR rules require you to preserve CBOR Tags, but we do a giant handwave over exactly how that's accomplished. It would typically be accomplished by working on the map data structures that you get after you do CBOR.decode(...) in most libraries.
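To make the tag-preservation question concrete, here is a minimal TypeScript sketch. It assumes a decoder that surfaces tagged values as wrapper objects. The `Tagged` class and the `stripTags` helper are hypothetical and are not taken from DID Core or from any particular CBOR library, since real libraries expose tags in different ways.

```typescript
// Hypothetical wrapper for a CBOR tagged value, as a decoder might surface it.
class Tagged {
  constructor(public readonly tag: number, public readonly value: unknown) {}
}

type DecodedMap = Record<string, unknown>;

// Stand-in for the result of decoding CBOR bytes. Tag 0 marks a
// standard date/time string in CBOR (RFC 8949).
const decoded: DecodedMap = {
  id: "did:example:123",
  created: new Tagged(0, "2020-12-01T00:00:00Z"),
};

// Naive conversion that unwraps tagged values. The tag number is lost,
// so it cannot be re-emitted when CBOR is produced again.
function stripTags(input: DecodedMap): DecodedMap {
  const out: DecodedMap = {};
  Object.keys(input).forEach((key) => {
    const value = input[key];
    out[key] = value instanceof Tagged ? value.value : value;
  });
  return out;
}

const stripped = stripTags(decoded);
const keptTag = decoded["created"] instanceof Tagged;
const lostTag = stripped["created"] instanceof Tagged;
```

In this sketch the tag only survives while the decoded map itself is kept around, which is the data structure the question above is asking about.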

So, this raises two questions:

1. Do we want to introduce the concept of a representation data model, which will be vaguely defined with no normative statements associated with it, but can answer the question of "What data model holds the data structures that you have after you process the input bytes during consumption?"
2. Do we want to introduce the concept of a representation-entries data model, which would hold representation-specific entries that could be used by any production process and would aid in cross-representation conversion w/o dumping representation-specific garbage into the DID Document data model?

OR13 commented 3 years ago

1 is probably DOA... I think we will not be able to agree to 1 (some of us think JSON-LD and JSON Schema are JSON, others don't).... I think we have beaten this topic to death, multiple special topic calls, many resolutions, including preserving unregistered properties, etc... I really don't want to re-litigate this issue... especially at the last minute... it feels like seeking consensus through exhaustion.
I can get behind 2 but only as long as it stays limited to production.... and it's still dangerous / harmful complexity...

Keeping the conversation in 1 place, I am copying my proposal from the other issue below:

instead of thinking about production as a function of the ADM, let us think of it as a function of the ADM and the representation....

So for example:

const produce = (adm: infra<map>, representation: infra<map>) => {
  let didDocument = {...adm, ...representation};
  return serialize(didDocument);
}

the production rules can then apply normative requirements to both the adm and the representation, and their combination....

consumption remains the same, no need for "special buckets" or dangerous complexity.... if you consume JSON-LD and produce JSON, @context is preserved... if you consume JSON and produce JSON-LD you add the context during production and it never gets added to the ADM...

The same would apply to any representation specific properties... they are ALWAYS preserved on consumption, and ALWAYS required as a second argument of production.
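The two conversion cases described above can be sketched in runnable TypeScript. This is only an illustration under stated assumptions: INFRA maps are approximated as plain string-keyed objects (`InfraMap` is a name introduced here), JSON stands in for the `serialize` step, and the `encode` helper exists only to build input bytes for the example.

```typescript
// An INFRA-style map, approximated here as a plain string-keyed object.
type InfraMap = Record<string, unknown>;

// Consumption keeps every entry, including representation-specific ones.
function consume(bytes: Uint8Array): InfraMap {
  return JSON.parse(new TextDecoder().decode(bytes)) as InfraMap;
}

// Production merges caller-supplied representation entries over the ADM.
function produce(adm: InfraMap, representation: InfraMap): Uint8Array {
  const didDocument = { ...adm, ...representation };
  return new TextEncoder().encode(JSON.stringify(didDocument));
}

// Helper used only to build example input bytes.
const encode = (value: InfraMap): Uint8Array =>
  new TextEncoder().encode(JSON.stringify(value));

// Case 1: consume JSON-LD, produce JSON.
// "@context" is already in the ADM, so it survives production.
const fromJsonLd = consume(encode({
  "@context": "https://www.w3.org/ns/did/v1",
  id: "did:example:123",
}));
const asJson = consume(produce(fromJsonLd, {}));

// Case 2: consume JSON, produce JSON-LD.
// The context is supplied at production time and never enters the ADM.
const fromJson = consume(encode({ id: "did:example:123" }));
const asJsonLd = consume(
  produce(fromJson, { "@context": "https://www.w3.org/ns/did/v1" })
);
```

Passing an empty map in case 1 keeps the second argument mandatory in the signature while adding nothing to the output.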

I don't see a need for more than 1 bucket of entries in the ADM for a number of reasons...

The most compelling one is simplicity... as @selfissued would say, this spec should be easy for developers to implement... if we start making 3 buckets that all get used it's worse than 2 which is worse than 1.

The spec as written today has only 1 bucket, and while that makes representation specific data models awkward... it also makes them very simple.... our default stance should be no change.

If we can gain consensus on a change, it should remove awkwardness without increasing complexity... in my mind the only way to do that is by altering the signature of production.... but let's look at the alternatives:

Option 1 (no change)

const produce = (adm) => Buffer
const consume = (Buffer) => adm
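As a rough illustration of Option 1, the following TypeScript sketch round-trips a document through a single bucket. It assumes a JSON representation, and the `Adm` type alias is a name introduced here for the example.

```typescript
// The single bucket: every entry lives in one map.
type Adm = Record<string, unknown>;

const produce = (adm: Adm): Uint8Array =>
  new TextEncoder().encode(JSON.stringify(adm));

const consume = (bytes: Uint8Array): Adm =>
  JSON.parse(new TextDecoder().decode(bytes)) as Adm;

const original: Adm = {
  "@context": "https://www.w3.org/ns/did/v1",
  id: "did:example:123",
  unregisteredProperty: "kept",
};

// Nothing is split out or dropped, so a round trip returns the same entries.
const roundTripped = consume(produce(original));
```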

Option 2 (my proposal, originally from #662)

const produce = (adm, representation) => Buffer
const consume = (Buffer) => adm

Option 3 (loosely matches @msporny 1)

const produce = (adm, representation) => Buffer
const consume = (Buffer) => adm, representation

produce: we agreed representation specific properties would be preserved, this will open the door to dropping properties again, by saying they need to be passed as arguments by callers... which will lead to implementation differences.... because callers will pass different things.... such as redefining terms, or adding junk that is not required....

consume: here we will fail to agree on what is representation and what is adm.... and so will all future implementers... this will lead to differences in what ends in the ADM... and undermine its usefulness.
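The concern about consumption under Option 3 can be illustrated with a small TypeScript sketch. The `makeConsume` factory and the two key lists are hypothetical. They stand in for two implementers who disagree about which entries count as representation-specific.

```typescript
type Entries = Record<string, unknown>;

// Build an Option 3 style consume() for a given list of keys that an
// implementer treats as representation-specific (the lists are hypothetical).
function makeConsume(representationKeys: string[]) {
  return (bytes: Uint8Array): { adm: Entries; representation: Entries } => {
    const all = JSON.parse(new TextDecoder().decode(bytes)) as Entries;
    const adm: Entries = {};
    const representation: Entries = {};
    Object.keys(all).forEach((key) => {
      const bucket = representationKeys.indexOf(key) >= 0 ? representation : adm;
      bucket[key] = all[key];
    });
    return { adm, representation };
  };
}

const inputBytes = new TextEncoder().encode(JSON.stringify({
  "@context": "https://www.w3.org/ns/did/v1",
  id: "did:example:123",
}));

// Implementer A treats "@context" as representation-specific, B does not.
const implementerA = makeConsume(["@context"]);
const implementerB = makeConsume([]);

// Same input bytes, different ADMs.
const admA = implementerA(inputBytes).adm;
const admB = implementerB(inputBytes).adm;
```

Under these assumptions the two ADMs differ for identical input, which is the kind of divergence described above.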

Option 3.1

representation-entries is a property of some data structure.

const produce = (adm, representation-entries) => Buffer
const consume = (Buffer) => adm, representation-entries

Option 3.3

representation-options are a function argument

const produce = (adm, representation-entries, representation-options) => Buffer
const consume = (Buffer) => adm, representation-entries
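A hedged TypeScript sketch of the difference between entries and options follows. The thread does not define what representation-options would contain, so the `indent` option here is purely an assumption for illustration. Identifiers are camel-cased because `representation-entries` is not a valid TypeScript name.

```typescript
type Entries = Record<string, unknown>;

// Hypothetical options: they steer serialization but contribute no entries.
interface RepresentationOptions {
  indent?: number;
}

function produce(
  adm: Entries,
  representationEntries: Entries,
  representationOptions: RepresentationOptions = {}
): Uint8Array {
  // Entries are data and end up in the output document.
  const merged = { ...adm, ...representationEntries };
  // Options only change how the document is written out.
  const text = JSON.stringify(merged, null, representationOptions.indent);
  return new TextEncoder().encode(text);
}

const adm: Entries = { id: "did:example:123" };
const entries: Entries = { "@context": "https://www.w3.org/ns/did/v1" };

const compact = new TextDecoder().decode(produce(adm, entries));
const pretty = new TextDecoder().decode(produce(adm, entries, { indent: 2 }));
```

Both outputs parse to the same document. Only the formatting differs, which is the distinction this sketch draws between an entry and an option.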

Option 4 (for completeness)

const produce = (adm) => Buffer // this doesn't address objections
const consume = (Buffer) => adm, representation // this is the worst of both worlds imo.

My preference is for option 1, and maybe option 2.... I think 3 is a really bad idea and undermines the entire point of a "big tent" / shared abstract data model.... I would need to see actual code to be convinced otherwise... we are past the point of text being convincing.

TallTed commented 3 years ago

@OR13 - The very long inline comments on Option 3 are hard to digest in this form. Would you please break those comments into multiple shorter lines?

peacekeeper commented 3 years ago

From my perspective, Option 3.1 and Option 3.3 would both work. In my mind it has always been like that anyway. There is an abstract data model that contains services, verification methods, etc. And then there are representation-specific things such as CBOR tags or @context that should not be mixed with the abstract (representation-independent) things.

Both Options would also allow @OR13 and @msporny to put @context into an application/did+json DID document (even though personally I think that's a bad idea that contradicts why we introduced the abstract data model concept in the first place).

msporny commented 3 years ago

PR #679 has been opened to address this issue. This issue will be closed once PR #679 is merged.

peacekeeper commented 3 years ago

Can this be closed now that https://github.com/w3c/did-core/pull/679 has been merged?

msporny commented 3 years ago

> Can this be closed now that #679 has been merged?

Yes. :)

PR #679 has been merged. Closing.

w3c / did-core