w3c / did-core

W3C Decentralized Identifier Specification v1.0
https://www.w3.org/TR/did-core/

How should resolvers handle the accept header? #417

Closed OR13 closed 3 years ago

OR13 commented 4 years ago

https://github.com/decentralized-identity/universal-resolver/issues/150

^ The universal resolver is one example of a resolver. I would like to establish working group consensus on what should happen in the following cases:

  1. application/did+json is requested but did method only supports application/did+ld+json.

  2. application/did+ld+json is requested but did method only supports application/did+json.

  3. */* is requested but did method only supports application/did+ld+json.

  4. */* is requested and did method supports application/did+ld+json and application/did+json.

  5. application/did+ld+json, application/did+json is requested and did method supports application/did+ld+json.

  6. application/did+json, application/did+ld+json is requested and did method supports application/did+ld+json.
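For concreteness, the decision logic across these six cases can be sketched as follows. This is a hypothetical illustration (the function name and the choice to error in cases 1 and 2 are assumptions, since that fallback behavior is exactly what this issue asks the group to decide):

```python
# Hypothetical sketch of resolver content negotiation for the six cases
# above. Erroring when nothing matches mirrors HTTP 406 semantics, but
# that fallback behavior is precisely the open question in this issue.

def negotiate(accept: list[str], supported: list[str]) -> str:
    """Pick a content type from `supported` given an ordered `accept` list."""
    for requested in accept:
        if requested == "*/*":
            return supported[0]  # method's preferred representation (cases 3, 4)
        if requested in supported:
            return requested     # first acceptable match wins (cases 5, 6)
    raise ValueError("representationNotSupported")  # cases 1, 2

# Case 6: JSON is listed first, but only JSON-LD is supported.
print(negotiate(["application/did+json", "application/did+ld+json"],
                ["application/did+ld+json"]))  # application/did+ld+json
```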

iherman commented 4 years ago

Shouldn't we have closure on the media type issue (#208) first? application/did+ld+json may in fact end up looking very different...

OR13 commented 4 years ago

@iherman I completely agree, but since I'm reviewing PRs in the universal resolver that are assuming application/did+ld+json... I feel we need to start having this conversation now.

In many ways, this might be a better way to approach the questions about the ADM / Producer / Consumer issues for representations. It's possible that if I understood how we plan to handle this issue, I might not object as much to the ambiguity in the ADM.

@jricher can you add your thoughts on this? I know you worked on the DID Resolution spec; I think most of my concerns with representations are really just variations of not understanding how you, @talltree, and @peacekeeper see this issue being addressed.

peacekeeper commented 4 years ago

I'd like to make one subtle meta-comment about this topic: DID resolvers do not have a dependency on HTTP. Even though the Universal Resolver and some other implementations expose the DID resolution function at an HTTP endpoint, this isn't a requirement for resolvers.

The DID Resolution spec defines an HTTP(S) Binding, which maps the inputs and outputs of the abstract resolveStream() function to an HTTP request/response. This binding and other implementation details are out of scope for DID Core.

For this reason, we shouldn't be talking about an "Accept header" here in DID Core, but about the "accept" input metadata property.

Having said that, all your questions about the different cases are still valid and very interesting, and I agree this could be a helpful approach to the representation/producer/consumer topics!

peacekeeper commented 4 years ago

@OR13 I feel uneasy about the language "did method only supports XXXXX". We shouldn't be thinking that representations such as JSON-LD, JSON, CBOR have to be explicitly supported by each DID method.

Can you give an example of a DID method that only supports a single representation?

OR13 commented 4 years ago

@peacekeeper every single did method that I have worked on supports JSON and JSON-LD.

I assume that everyone agrees that modifying a did document after resolution is not the same as producing a did document...

did:github, did:elem, did:key, did:meme, they all produce JSON.

The JSON currently contains an @context, which according to today's spec means they all only support JSON-LD.

My point about representations is that if the did core wg accepts my changes in:

https://github.com/w3c/did-core/pull/396

All those did methods would support both JSON and JSON-LD.

To be clear, if the changes in #396 or similar are not accepted, I believe the spec conformant approach for a request for:

application/did+json for all those did methods would be to throw an HTTP 406, OR delete the context before returning the resolution result and return a 200... that's assuming the JSON representation does not introduce additional transformations which might be needed for it to be acceptable... if that happens, I'm even less likely to use application/did+json.

I'm not going to implement @context deletion to support plain JSON; if the spec stays the way it is, I will just not support it.

No DID Method author can be forced to support all representations.

@peacekeeper since I have provided multiple methods that only support JSON-LD, can you provide a single example of a DID Method that only supports JSON?

peacekeeper commented 4 years ago

If a DID method only uses Core Properties (which I think is the case for most of the methods we have today), then those DID methods support both JSON-LD and plain JSON.

If a resolver implementation is asked for an application/did+json DID document and it returns a JSON document with an @context, then that resolver didn't follow the JSON production rules correctly.

Your did:github, did:elem, did:key, did:meme methods also support both application/did+ld+json and plain application/did+json. Whether or not certain resolver implementations for those methods have correctly implemented the plain JSON representation is another question.

OR13 commented 4 years ago

@peacekeeper consider the resolution function defined in did core today: https://w3c.github.io/did-core/#did-resolution

who implements the resolveStream method... ?

The functions are abstract, the did method implements them.

reading the spec today, I assume that if I ask a concrete did method to resolveStream for application/did+json and I get back JSON with an @context, I can conclude the did method does not conform to did core (under today's normative language, which is imo unnecessarily hostile to common-sense JSON development patterns).

If I get back an error saying "representation not supported" I can conclude the did method does not support that representation.

The did method implements the resolveStream function, and the did method is not required to support all representations...

for resolve, it's less clear, since there is no accept header / it is ignored... so the did method gets to return whatever it wants...

if the did method returns JSON with an @context and content type application/did+json, that's not valid according to the did core spec.

if the did method returns JSON with an @context and content type application/did+ld+json, that's valid according to the did core spec.

If the did method only ever returns application/did+ld+json, the did method does not support application/did+json... a resolver (like the universal resolver) can hack this by just deleting the @context and returning the result... but that's a separate issue, and it raises questions about whether resolvers that mutate the result of resolve or resolveStream should be trusted... IMO, they clearly should not be.

peacekeeper commented 4 years ago

if I ask a concrete did method to resolveStream for application/did+json and I get back JSON with an @context, I can conclude the did method does not conform to did core (under today's normative language

I would argue that even if we merge your PR https://github.com/w3c/did-core/pull/396, such an implementation of resolveStream would still not conform to DID Core, since for some reason it added a @context even though that's not part of the application/did+json production process.

a resolver (like the universal resolver), can hack this, by just deleting the @context and returning the result.... but thats a separate issue, and raises questions about whether resolvers that mutate the result of resolve or resolveStream should be trusted.... IMO, they clearly should not be trusted.

I see this differently. You consider it a hack to delete the @context after calling resolveStream().

But in my mind, what's happening here is not mutating the result of resolveStream(). Instead you are talking about a purely internal implementation detail inside the resolveStream() implementation. Yes, if a resolver receives a request for application/did+json, then one way of satisfying the request could be to obtain application/did+ld+json first, then run the JSON-LD consumption followed by the JSON production processes, which results in application/did+json (which now doesn't have a @context); and then this is the result that resolveStream() returns.

In other words, the resolver went through an intermediary processing step (what you call a "hack") that involved an application/did+ld+json DID document. Of course it would be preferable if resolvers supported all representations "natively" without having to convert between them.
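A minimal sketch of that intermediary step (helper names are hypothetical; real consumption/production rules cover much more than the @context member):

```python
import json

# Hypothetical sketch: satisfy an application/did+json request from a
# method whose internal result is application/did+ld+json, by consuming
# the JSON-LD and then running the plain-JSON production step, which
# emits no @context.

def consume_jsonld(doc_bytes: bytes) -> dict:
    """JSON-LD consumption: parse into an abstract (dict-based) model."""
    return json.loads(doc_bytes)

def produce_json(adm: dict) -> bytes:
    """Plain-JSON production: @context is not part of this representation."""
    plain = {k: v for k, v in adm.items() if k != "@context"}
    return json.dumps(plain).encode()

ld = b'{"@context": ["https://www.w3.org/ns/did/v1"], "id": "did:example:123"}'
print(produce_json(consume_jsonld(ld)))  # b'{"id": "did:example:123"}'
```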

OR13 commented 4 years ago

resolveStream is an abstract function implemented by did methods; it is not actually referenced anywhere but did resolution... resolveStream is not in any way related to DID Document production or consumption...

IMO, it goes like this:

ADM -> DID Method Internal Representation -> Supported DID Core Representations -> resolveStream.

ADM -> DID Method Internal Representation -> Supported DID Core Representations -> resolve.

Note that the did method gets to decide how to handle representations, if the did method wants to return JWK, it can decide to do that for JSON, and base58 for JSON-LD... and CWK for CBOR....

or the DID Method might decide, I only return JSON, never JSON-LD, and never CBOR.

You ask a DID Method for representations... sometimes, you ask an http service (like the universal resolver) to ask a did method for representations... whatever the did method produces is what you get.

DID Methods cannot be forced to produce DID Documents that conform to a representation's production rules, but whether they fail to meet the normative requirements for a representation is testable.

When I ask a did method to "produce" application/did+json... I can test what I get back, if it contains forbidden properties, or does not contain required properties, I know the producer is not conformant... I can ONLY ask for application/did+json by using resolveStream.

Perhaps the problem lies in the fact that accept is not required in resolve, and it's not clear if "resolvers" are using resolve or resolveStream....

the short answer to this question is that IF a resolver is using resolve, accept is ignored and no content-type is returned (according to today's spec text).

I feel that's likely to result in nobody actually using resolve and everyone using resolveStream.

TL;DR: I think the resolution section of the spec is pseudo-broken, and I am in favor of removing resolve from the spec.

dlongley commented 4 years ago

There will be consumers of DID Documents that do not pass data through the ADM but instead work with the data in its native format or in some other abstract format (the result of JSON.parse, for example), without invoking additional code to parse it into the ADM. These consumers are also able to become producers without going through the ADM, returning valid DID Document results via resolve. We need to be careful not to be so prescriptive that we tell implementers they must always go through the ADM -- because they don't necessarily have to, and all that matters is that the output of resolve is valid.

OR13 commented 4 years ago

We need to be careful not to be so prescriptive that we tell implementers they must always go through the ADM -- because they don't necessarily have to and all that matters is that the output of resolve be valid, etc.

I think this is the area of highest confusion.

I think of "what other people do with the result of resolve / resolveStream" as an orthogonal issue that is 100% not related to "producer" / "consumer" statements, but I look forward to getting clearer on this.

peacekeeper commented 4 years ago

I think resolveStream should be renamed to resolveRepresentation. (see https://github.com/w3c-ccg/did-resolution/issues/57)

dlongley commented 4 years ago

@OR13,

I think this is the area of highest confusion.

I think of "what other people do with the result of resolve / resolveStream" as an orthogonal issue that is 100% not related to "producer" / "consumer" statements, but I look forward to getting clearer on this.

Yes, I think the "producer"/"consumer" language is being interpreted differently by different people -- and it's creating some problems. It's becoming clearer to me that it should only be understood in the context of transformations.

OR13 commented 4 years ago

On the call, I heard that "production" happens in resolveRepresentation / resolveStream and is implemented by the did method, and that anything that happens after that, like conversion to yaml or "other representations", is not "production", it's "translation"... I think this is a helpful distinction to draw...

I wonder, where does consumption happen? I assume that any software library can implement consumption (including 3rd party libraries other than the did method itself).

Such libraries are then capable of "producing again", via "resolveStream" even though they are not the DID Method.

examples of such libraries are: "wallets", "agents", "http-resolvers"....

However, when they produce, it is only because they have consumed... whereas the original did method production may have had no "consumption", or if it did, it doesn't look like "consuming a known representation".

concrete example from sidetree:

sidetree json files -> ipfs -> hash on ledger.

"resolveStream" -> get files -> convert to operations -> reduce operations to "internal representation" -> externalize "internal representation" -> JSON / JSON-LD DID Document.

When an http resolver consumes the result of the above, it is not handling any "internals", yet it is still capable of producing "translations" of a representation.

Since this working group is not about defining "wallets, agents or http resolvers", I suggest we limit "production and consumption" to what a did method produces and how a consumer library consumes... and NOT what "wallets, agents, http resolvers, or other consumer libraries" "produce"... we should probably stop saying anyone but the did method "produces", and use the word "translate" for when a library consumes a known representation and "translates" it to another known representation.

This still allows us to do the following:

  1. Define the DID Document model using an abstract data model and a set of rules / constraints.
  2. Define DID Method Production for each representation
  3. Define DID Library Consumption for each representation

What would be out of scope is subsequent "DID Library Production", which includes things like agents and wallets that transform or translate representations as part of working with other APIs, or resolvers that translate representations the did method itself did not support when resolveStream was called with a given accept value.

This makes the concept of "round tripping / direct conversion between representations" a function of "consumption" + "translation"... because anyone can do it, not just the did method.

Only a DID Method can "produce" a did document from an ADM. Any software can "consume" a representation and obtain an instance of a did document represented via the ADM. Any software holding an instance of a did document represented via the ADM can "translate" to any known representation.

If you want a representation, you must ask for it by name from the did method using resolveStream... that's the only scenario where "production" is happening; anything else is "translation".

OR13 commented 4 years ago

we can define:

(only did method): production = (adm, content-type) -> representation

(any software): consumption = (representation, accept) -> adm

(any software): translation = (representation, fromContentType, toContentType) -> consumption -> production

note that "translation" operates on "representations", whereas "production" operates on "adm instances".
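Rendered as (hypothetical) function signatures, with the ADM sketched as a plain dict and only the JSON case filled in, the three definitions look like:

```python
import json

ADM = dict  # the abstract data model, sketched here as a plain dict

def production(adm: ADM, content_type: str) -> bytes:
    """Only a did method: production = (adm, content-type) -> representation."""
    assert content_type == "application/did+json"  # sketch covers JSON only
    return json.dumps(adm).encode()

def consumption(representation: bytes, accept: str) -> ADM:
    """Any software: consumption = (representation, accept) -> adm."""
    assert accept == "application/did+json"
    return json.loads(representation)

def translation(representation: bytes, from_ct: str, to_ct: str) -> bytes:
    """Any software: translation = consumption -> production, via the ADM."""
    return production(consumption(representation, from_ct), to_ct)
```

Note that translation never touches a VDR; it only ever sees an existing representation.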

peacekeeper commented 4 years ago

and that anything that happens after that, like conversion to yaml or "other representations", is not "production" its "translation"... I think this is a helpful distinction to draw...

I disagree with this view. Any time you see a DID document in one of the representations, "production" has happened, either by a DID resolver or some other software component. I agree that wallets, agents, etc. are out-of-scope for us, but I don't really get what would be the difference between "DID Method Production" and "DID Library Production"?

translate did method representations which are not supported by the did method

I think we should get rid of the idea that DID methods have to explicitly support each representation. The way Sidetree handles JSON/JSON-LD internally feels to me like an implementation detail; this doesn't mean that the DID method itself doesn't support other representations.

Only a DID Method can "produce" a did document from an ADM.

I disagree with this, see above.

(any software): translation = (representation, fromContentType, toContentType) -> consumption -> production

I don't understand, doesn't this contradict what you just wrote above? If you define "translation" as consuming and then producing (going via the ADM), then I agree with that.

OR13 commented 4 years ago

I don't really get what would be the difference between "DID Method Production" and "DID Library Production"?

A DID Method produces representations by directly leveraging the VDR / did method code.

A DID Library "produces" representations by consuming a serialization and then producing another one.

DID Libraries are not the "authoritative source" for what a DID Document for a given did method is... In KERI, Sidetree, DID Peer, Hyperledger Indy... all these systems have code that currently processes internal data and constructs an ADM from it...

The Universal Resolver, Cloud Agents, Wallets, are not looking at Ethereum transactions, or IPFS hashes... they are not operating on the VDR.... they are operating on concrete representations, produced by a DID Method.

It's possible you don't agree with the semantics, or with naming these things separately, but you must agree that there is a difference between what a did method does when it constructs a did document for the first time ever, and what an http resolver does when it consumes and produces a representation of that document for the first time?

I propose naming the difference to avoid ambiguity.

(any software): translation = (representation, fromContentType, toContentType) -> consumption -> production

this is clearly not precise enough, but note that (consumption -> production) is really:

(consumption -> ADM -> production)

You cannot consume an ADM, or VDR event data, according to the spec today... so these things are not equivalent, in my reading of the spec:

(proposed production definition): VDR -> ADM -> production -> representation

(proposed consumption definition): representation -> ADM

(proposed translation definition): representation -> ADM -> representation

Conflating "translation" and "production" leads to problems understanding "cryptographic verifiability of the identifier"... you are NOT trusting software to do that when you are "translating".... you ARE trusting software to do that when you are "producing".

There is a difference between processing a KERI event log and producing a did document, and copying a did document and converting it to yaml.... one is production, the other is translation... calling them the same name is confusing.

peacekeeper commented 4 years ago

you must agree that there is a difference between what a did method does when it constructs a did document for the first time ever, and what an http resolver does, when it consumes and produces a representation of that document

Yes there's a difference, but to me that difference has to do with trust and verifiability, not with the ADM, representations, production, and consumption.

  1. If you process a KERI event log and produce a JSON DID document -> that's "production".
  2. If you run curl -H 'Accept: application/did+ld+json' 'https://dev.uniresolver.io/1.0/identifiers/did:key:z6Mkfriq1MqLBoPWecGoDLjguo1sB9brj6wT3qZ5BxkKpuP6' and then use a local library to "translate" the JSON-LD document to CBOR or YAML -> that's also "production".

Of course 1. can be trusted, and 2. cannot be trusted. We tried to cover some of this in DID Resolution Architectures.

But in terms of the representation consumption and production processes I see no real difference between 1. and 2.

OR13 commented 4 years ago

@peacekeeper I don't like using the same term for two processes with varying security / trust implications; that feels unsafe to me.

we could augment the word production to be "did-method-production" and "did-library-production", and then define production as either....

But I still don't like the idea of putting "arbitrary third party software" on the same level as "did method software".

Can you propose a change to the language that doesn't obscure the trust / security difference between 1 and 2?

peacekeeper commented 4 years ago

I think "DID library (something)" won't work very well because it sounds very implementation-specific. What if it's not a library, but a piece of hardware, or an operating system function, etc.

Maybe somehow use the word "Read" to describe the difference? "Read" is one of the four CRUD operations, and it's only ever used to obtain the DID document as described by the DID method, but it's never used again later when the DID document is consumed/produced/translated/sent-over-HTTP/etc.

OR13 commented 4 years ago

Hmm yes, I agree. I think READ may help...

let me try with it:

Let "DID Document Production" be the process whereby a "Read" is used to obtain an ADM DID Document instance, and then resolveRepresentation is applied to the ADM DID Document to produce a concrete representation such as application/did+json or application/did+cbor.

Let "DID Document Consumption" be the process of converting a DID Document instance in a supported representation to an Abstract Data Model DID Document instance.

Let "DID Document Translation" be the process whereby a DID Document instance is first consumed, converting the DID Document into the abstract data model, and then resolveRepresentation is applied to the ADM DID Document to produce a representation such as application/did+json or application/did+cbor.

The first thing I notice is that we seem to have confusion at a number of layers....

We need separate names for all these things:

  1. (did:string) -> ADM (today I think we call this READ).
  2. (did:string, options) -> did document in a representation (today we call this resolveStream).
  3. (didDocument: ADM, options) -> did document in a representation (today we call this produce).
  4. (didDocument: representation) -> didDocument: ADM (today we call this consume).

I think this pain is caused by a failure to fully integrate the VDR, DID Method, DID Resolution and Representations sections of the spec...

I suggest we do a better job of unifying these concepts.

Let READ be the abstract function where (did, VDR) -> didDocument:ADM

Let produceRepresentation be the abstract function where (didDocument:ADM, opts) -> didDocument:ConcreteRepresentation

Let consumeRepresentation be the abstract function where (didDocument:ConcreteRepresentation) -> didDocument:ADM

Let resolveRepresentation be the abstract function where (did:string, opts) -> didDocument:ConcreteRepresentation

So we can now see that the only way resolveRepresentation can work is if at least one READ and some number of consumeRepresentation / produceRepresentation calls have happened... in the safest, most direct path:

resolveRepresentation -> READ -> produceRepresentation

In the case of the universal resolver:

resolveRepresentation -> READ -> produceRepresentation -> consumeRepresentation -> produceRepresentation

We can now ask why (consumeRepresentation -> produceRepresentation) should take a "preserve by default" approach... because otherwise we are handing the keys to "DID Document structure" from methods to "any other software system".
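The two call chains can be sketched with hypothetical stand-ins (READ is faked with a static lookup; a real method would consult its verifiable data registry), which makes the extra consume/produce hop in the universal resolver path visible:

```python
import json

FAKE_VDR = {"did:example:123": {"id": "did:example:123"}}  # stand-in VDR

def read(did: str) -> dict:
    """READ: (did, VDR) -> didDocument:ADM"""
    return FAKE_VDR[did]

def produce_representation(adm: dict, ct: str) -> bytes:
    """(didDocument:ADM, opts) -> didDocument:ConcreteRepresentation"""
    return json.dumps(adm).encode()

def consume_representation(rep: bytes) -> dict:
    """(didDocument:ConcreteRepresentation) -> didDocument:ADM"""
    return json.loads(rep)

def resolve_representation(did: str, ct: str) -> bytes:
    """Direct path: resolveRepresentation -> READ -> produceRepresentation"""
    return produce_representation(read(did), ct)

def universal_resolver(did: str, ct: str) -> bytes:
    """Indirect path: adds consumeRepresentation -> produceRepresentation."""
    rep = resolve_representation(did, ct)  # what the remote method returned
    return produce_representation(consume_representation(rep), ct)

# With "preserve by default" semantics, the extra hop changes nothing:
assert (universal_resolver("did:example:123", "application/did+json")
        == resolve_representation("did:example:123", "application/did+json"))
```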

peacekeeper commented 4 years ago

I agree we should improve the alignment between the spec sections on the verifiable data registry, DID method, DID resolution and representations.

Maybe it helps to work on a few more diagrams..

OR13 commented 4 years ago

@peacekeeper a picture really is worth a thousand words! thanks for this.

If you can add one more layer to show "translation" I think we can get rid of the confusion regarding translation...

For example, the client when holding "JSON-LD" is capable of "translating" to JSON / CBOR.

Of course translation by a third party is different than production by the first party, even if they use the same abstract functions / rules.

This picture is awesome.

TomCJones commented 4 years ago

Not sure I understand the discussion. MIME types are handled by HTTP in the request/response headers. If not using HTTP, where would the application/*** go? Is there a spec for DID absent HTTP for handling MIME types?

OR13 commented 4 years ago

@TomCJones HTTP is NOT assumed, yet a lot of the interfaces we have defined look a lot like HTTP, and we know that these interfaces WILL be exposed over HTTP, so we are trying to line things up as best as possible.

https://w3c.github.io/did-core/#did-resolution-metadata-properties https://w3c.github.io/did-core/#did-url-dereferencing-input-metadata-properties

peacekeeper commented 4 years ago

@OR13 I now tried to draw what you said, i.e. the client "translates" between representations (although I think a more realistic scenario would be "translation" by a hosted resolver service, rather than a client, but that shouldn't really matter for the discussion).

Screenshot from 2020-10-08 22-11-10

peacekeeper commented 4 years ago

Also, I've been thinking about your point that production during the initial "read" operation is different from translation. I'm not sure if I agree with that, but it makes me wonder if the initial "read" operation really works like this:

Screenshot from 2020-10-08 22-14-31

Or like this:

Screenshot from 2020-10-08 22-14-33

I.e. does "read" return the ADM, or a representation? It's clear that some DID methods have an internal preference for a certain representation (e.g. did:v1 -> JSON-LD), whereas other DID methods are completely neutral about representations (e.g. did:key).

OR13 commented 4 years ago

I think the first one is more accurate, or at least helpful, in the sense that it shows the difference between READ, resolve and resolveRepresentation.

In the case that a software system is simply "assembling json" it might look like the second scenario, but I think having the ADM and the language we have, precludes the second scenario from being true....

now regarding your picture which shows "translation"... what it really shows is an attack on the integrity of the did document which was first produced.

for any serialization that can be hashed, we should strive to ensure it hashes to the same value (because of content addressing / caching / lookups, and general developer sanity).

In this case, consuming JSON-LD and producing JSON, breaks the hash... and for what? all other properties are preserved.

There is 0 reason that consuming JSON-LD and producing JSON should drop ANY json members.

dropping JSON members harms interoperability, increases developer burden, and goes against "the spirit of json" which is "if you don't understand it, ignore it".

A better example would be to include XML, which is not capable of representing @context because @ is not a valid key. In that case, it's pretty clear the property should be dropped.

Here is a simple algorithm which I believe should be used in "translation":

For each property of REP_1:

  1. Consume the property.
  2. Produce the property in REP_2; if there is an error (the property cannot be produced because of the parser rules of the representation, not did core rules), drop the property and move to the next property.

This would apply equally to CBOR integer indexes and @ in property names.

At the end, if the property can be preserved, it is.
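A minimal sketch of this per-property rule, with XML's restriction on @ in element names standing in for a representation-level parser error (names hypothetical):

```python
# Hypothetical sketch of the per-property translation rule above: try to
# produce each property in the target representation, and drop only those
# its parser cannot express.

def translate_preserving(doc: dict, can_produce) -> dict:
    out = {}
    for name, value in doc.items():  # step 1: consume the property
        if can_produce(name):        # step 2: produce it in REP_2 if possible
            out[name] = value        # preserved by default
        # else: parser-level error in REP_2 -> drop and move on
    return out

# XML element names may not start with "@", so @context cannot survive:
xml_can_produce = lambda name: not name.startswith("@")

doc = {"@context": ["https://www.w3.org/ns/did/v1"], "id": "did:example:123"}
print(translate_preserving(doc, xml_can_produce))  # {'id': 'did:example:123'}
```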

If the did documents can be equivalent (JSON vs JSON-LD), they are.

Obviously if JSON added a bunch of objects which are not valid RDF, the resulting JSON-LD might contain unknown properties / other issues.... handling the rules of JSON-LD production is related to the rules of JSON production. IMO, if you cannot produce JSON with @context you cannot produce JSON-LD... because JSON-LD is JSON.

The current language is trying to define JSON as somehow not a superset of JSON-LD, yet simultaneously allow it to represent things that are not RDF... it makes no sense.

JSON is the more flexible superset of JSON-LD... to say that a superset is incapable of representing a subset is confusing and logically, not correct.

peacekeeper commented 4 years ago

A better example would be to include XML, which is not capable of representing @context because @ is not a valid key.

Even if it were a valid key, or if you escaped the @ character somehow, attempting to include a JSON-LD-specific construct in XML makes no sense. Just like this YAML example makes no sense:

https://github.com/transmute-industries/did-core/blob/master/packages/did-representation-examples/jsonld/didDocument.yml

Why would someone want to add a JSON-LD specific construct to a YAML file? What does this mean, how would it be processed? You also don't add CBOR-specific constructs to JSON, or YAML-specific constructs to XML, etc.

Representation-specific constructs such as @context, which are not properties of the DID subject, should be encapsulated by the production/consumption rules of that specific representation and not show up in other representations.

OR13 commented 4 years ago

@peacekeeper the answer is that "translation" is not equal to "production"... I can translate a book from English to Latin... but it's not the same as writing a book in Latin.

I agree that it doesn't make sense to include $schema in XML... unless you intend to later translate it back to JSON and use it... which is why it's relevant to translation and not production.

peacekeeper commented 4 years ago

@OR13 I think the book analogy supports my perspective more than yours :)

  1. If I translate a book from English to Latin, then the translated version doesn't contain any English words. All the English words got "dropped".
  2. A reader won't be able to tell if the book was translated from English to Latin, or if it was written in Latin in the first place.

OR13 commented 4 years ago

@peacekeeper interesting... have you tried using:

Can you imagine how absolutely terrible all "data model translation tooling" would be if it took your approach?

I agree with you, software is not spoken language... dropping data fields is something the library implementer should decide, because they are the ones who know whether the fields are important or not... that's why all those libraries I just listed preserve everything by default as part of "translation / conversion"...

I am not opposed to going XML -> JSON, then allowing a developer to delete some properties from JSON.....

I am opposed to saying "XML->JSON" => certain properties are dropped, read the spec to find out which ones.

peacekeeper commented 4 years ago

@OR13 the above tools implement 1-to-1 mappings between two formats. They don't have the concept of an abstract data model that supports multiple representations equally.

Let's assume you have a DID document like this (using a future text/did+yml representation; this is actually your example):

---
  id: "did:example:123"
  verificationMethod: 
    - 
      id: "#key-0"
      type: "JsonWebKey2020"
      controller: "did:example:123"
      publicKeyJwk: 
        kty: "EC"
        crv: "P-256"
        x: "iCsUt8CcUFRnm-5TAVLw6XxmTmUXwVLY_300nxguIPM"
        y: "9ZfGdDmvXvTXpBpbKGw_Rt86whDN9y3TMfgtJAOlV38"

What would this DID document look like if you translate it via the ADM to application/did+ld+json ?

And then what would it look like if you translate it back to text/did+yml ?
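This round trip can be made concrete with a small sketch, treating each representation as an already-parsed map (so no YAML or JSON parser is needed). The injection rule shown is the contested behavior under discussion in this thread, and all names and values here are illustrative only:

```python
# Round-trip sketch: YML -> JSON-LD -> YML, with each representation modeled
# as a parsed map. The @context injection rule is the contested behavior.
yml_doc = {"id": "did:example:123"}  # parsed form of a minimal didDocument.yml

# YML -> JSON-LD: inject @context when absent
json_ld_doc = {"@context": ["https://www.w3.org/ns/did/v1"], **yml_doc}

# JSON-LD -> YML: if @context is preserved like any other property,
# the result is NOT the original YML document...
round_tripped = dict(json_ld_doc)

print("@context" in round_tripped)  # True: a JSON-LD construct now lives in YML
```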

OR13 commented 4 years ago

@peacekeeper assuming todays spec language:

Assuming the did document original representation contained only registered properties:

  1. Call YAML_TO_JSON on didDocument.yml.

  2. If @context is present, you are done (as shown in the related example).

Remember: when consuming JSON, @context MUST be ignored... I am assuming you would apply the same logic to YAML.

  3. If no @context is present, inject

 "@context": [
    "https://www.w3.org/ns/did/v1"
  ],

  4. Optionally call expand/compact and drop any "unregistered properties" (not required, and MAY destroy the ability to losslessly go back to YAML).

In the case that all properties were registered you would get:

didDocument.jsonld
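The trivial-case steps above can be sketched as follows. This is a minimal sketch, assuming the YAML has already been parsed into a plain map; `DID_CONTEXT` and `to_json_ld` are hypothetical names, and the only point is that every property is preserved while @context is injected when absent:

```python
# Sketch of the translation rules described above. The representation syntax is
# irrelevant to the rule, so a plain dict stands in for the parsed YAML.
import json

DID_CONTEXT = "https://www.w3.org/ns/did/v1"

def to_json_ld(doc: dict) -> dict:
    """Translate a parsed DID document map into application/did+ld+json."""
    out = dict(doc)  # preserve every property, registered or not
    if "@context" not in out:
        # Rule: if no @context is present, inject the DID core context.
        out["@context"] = [DID_CONTEXT]
    # Rule: if @context was already present, it is preserved as-is; nothing
    # is dropped, so translation back to the source representation is lossless.
    return out

yaml_doc = {  # the parsed form of didDocument.yml above (abbreviated)
    "id": "did:example:123",
    "verificationMethod": [{
        "id": "#key-0",
        "type": "JsonWebKey2020",
        "controller": "did:example:123",
    }],
}

json_ld_doc = to_json_ld(yaml_doc)
print(json.dumps(json_ld_doc["@context"]))  # ["https://www.w3.org/ns/did/v1"]
```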

Now consider the non-trivial case:

Assume the DID document contains registered properties as well as extensions (unregistered properties):

  1. Call YAML_TO_JSON on didDocument.yml.

  2. If @context is present, you are done.

  3. If no @context is present, inject

 "@context": [
    "https://www.w3.org/ns/did/v1"
  ],

  4. Optionally call expand/compact to drop any "unregistered properties" (not required, but doing so MUST destroy the ability to losslessly go back to YAML).

Whats the difference?

When @context is treated like every other property.... its value is preserved like every other property, and when it happens to be configured correctly, 0 properties MAY be dropped by "translation" to JSON-LD...

Now let's assume that @context is never preserved through the ADM....

  1. Only properties defined in https://www.w3.org/ns/did/v1 can be preserved through translation for JSON-LD, yet every other representation has all their properties preserved.
  2. Any representation that translates from JSON-LD with extensions to another representation cannot come back to JSON-LD with extensions (defeats the purpose of the did spec registries, particularly the JSON-LD part...).
  3. Instead of preserving an arbitrary map property, when we were capable of doing so, we intentionally broke interoperability.

There is no reason to specifically attack CBOR by forbidding integer mapping or NFC by banning TLV with leading 0's... We don't need to prohibit __proto__ or <script> in JSON... Let the DID Method be responsible for handling the production of the ADM in a concrete representation... at the end of the day, the did method is responsible for assembling the document.

The argument I am hearing from @peacekeeper @jricher is essentially:

All properties but those related to extensibility mechanisms in JSON-LD are preserved when translation through the abstract data model occurs.... yet we have no other extensibility mechanism but JSON-LD in the did spec registries...

Why are we specifically targeting JSON-LD, and attempting to break its extensibility mechanism with normative statements?

Why not just say "all property names and values of types known to the abstract data model are preserved".

If we don't want @context in JSON, YAML, XML, CBOR, CSV, PDF or whatever our DID Method supports... configure our DID Method's resolveRepresentation to NEVER produce a did document (in ANY representation) with an @context.

The spec and implementations are being dragged into complexity and uncertainty for no reason... All we need to say is:

All property names and values of types known to the abstract data model are preserved, DID Methods MAY choose which properties (registered AND unregistered) are present in a did document produced by resolveRepresentation.


peacekeeper commented 4 years ago

@OR13 you didn't quite answer my question. With your rules, if you translate YML->JSON-LD->YML, you get something different from the original YML. You get a purely JSON-LD-specific construct added into the YML, which makes no sense there.

You want to "preserve" the JSON-LD specific @context construct in all other representations.

Would you also "preserve" the XML specific <?xml version="1.0"?> construct when you translate XML to JSON?

Would you also "preserve" the YAML specific %YAML 1.2 construct when you translate YAML to JSON-LD?

Are @context, <?xml version="1.0"?>, and %YAML 1.2 really "properties" in the ADM? Do they describe the DID subject?

No, they are representation-specific things that should be "dropped" when translating to other representations. Just like all English words should be "dropped" when you translate a book to Latin.

OR13 commented 4 years ago

@peacekeeper will you be asking all developers to be aware of every piece of representation-specific metadata, across all did document representations, for all time, and to never include any of it?

How do you know that version isn't used in downstream did-method-related logic... if it was not important to keep, why did the did method include it?

You are arguing that you know better than the did method authors, but they are the ones who constructed the representation, and they are capable of supporting production in the representation directly... they could have chosen to support XML, JSON, YAML, CBOR... they didn't... yet now you want to translate their did documents without their consent, and drop properties from the only representation they chose to support.

HTTP/UDP/QUIC Resolvers are not DID Methods... what they produce should not alter what the did method author intended, or they are attacking the integrity of the did document.

Let me try and clarify:

  1. If you want a DID Document in representation X, ask for it in X from the DID Method (not some proxy resolver software that mutates data and drops properties, and maybe adds them).

  2. If you want to convert a DID Document you got in X to Y, you can do that in a number of ways... you could preserve all properties, you could change some properties, like key formats, and you could choose to drop properties as part of the conversion... you might want to replace public keys when you translate if you are being nasty...

We seem to be arguing only about 2.... I'm not sure 2 is even in scope for this working group, since it's 100% a process that occurs after you have a concrete representation, as is HTTP resolution.

Consider that we don't need to support "translation" if a did method supports production... why convert YAML to CBOR, when you can just ask a did method to produce CBOR?

We MUST define resolveRepresentation as a function which always uses the READ property of a VDR.

We MUST NOT define resolveRepresentation as a function of a DID Document that was "consumed" from some known representation.

Put another way, how can I convey, as a DID Method Author that I do not want my representations "translated" in the way you are proposing?

Should we define a license property in did resolution metadata, where I can put a copyright notice, and advise any potential downstream software that translation of my data model violates the license?

I'm a DID Method author, I don't want anyone to do what you are proposing, how can I make this clearer?

peacekeeper commented 4 years ago

@OR13 I think we have two competing objectives

I think both objectives are important.. Do you agree?

OR13 commented 4 years ago
  • We don't want DID method authors to be required to support each representation explicitly. If someone invents a new representation (e.g. did+yml), then we don't want to update >60 DID method specs and implementations. In this case, there must be clear rules how the new representation can be supported without changing anything about existing DID methods.

I am not sure I agree with this, I am concerned that the trust model of DIDs may be eroded if this is attempted incorrectly.

In the same way that "trusting http resolvers" is not the same as "running the did method yourself"... I see us potentially weakening DIDs by allowing resolvers to pretend to be DID Document producers.... and I am concerned that we are subverting the authority and value of did methods that care to support multiple representations by doing so....

In short, I am not sure I want to get a "translated" did document from an HTTP resolver... I think I would prefer to get an HTTP 406 instead.... especially given the current spec language regarding the ADM and production and consumption rules....

It also feels like we would be defining something unrelated to DID Document production, and stretching that definition to apply to HTTP resolvers, as a way of sneaking around the fact that did resolution was first considered out of scope by the WG...

peacekeeper commented 4 years ago

Take did:key, for example: it doesn't have a built-in preference for any representation, and doesn't want to use any representation-specific features. It only uses a few of the Core Properties.

If someone invented a new representation such as did+yml, then that should just work out of the box, i.e. there doesn't have to be any new spec text or code that is method-specific. All that's needed is the generic READ operation for did:key -> ADM, and the new production rules for did+yml.
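A minimal sketch of that two-step pipeline, under the assumption that the ADM is modeled as a plain map. All names here (`read_did_key`, `PRODUCERS`, `resolve_representation`) are hypothetical, not from any spec; the point is only that adding a representation touches the producer table, not the DID method:

```python
# Sketch: method-specific READ returns the abstract data model (a plain map),
# and per-representation producers turn the ADM into concrete syntax.
import json

def read_did_key(did: str) -> dict:
    """Method-specific READ: did -> ADM (stubbed with a few core properties)."""
    return {"id": did, "verificationMethod": []}

def produce_json(adm: dict) -> str:
    return json.dumps(adm)

def produce_json_ld(adm: dict) -> str:
    doc = {"@context": ["https://www.w3.org/ns/did/v1"], **adm}
    return json.dumps(doc)

# Supporting a new representation (e.g. did+yml) would mean registering one
# new producer here; no DID method code changes.
PRODUCERS = {
    "application/did+json": produce_json,
    "application/did+ld+json": produce_json_ld,
}

def resolve_representation(did: str, accept: str) -> str:
    adm = read_did_key(did)
    return PRODUCERS[accept](adm)

print(resolve_representation("did:example:123", "application/did+json"))
```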

OR13 commented 4 years ago

https://w3c-ccg.github.io/did-method-key/

It only supports JSON, it always has an @context, and it's always both valid JSON and JSON-LD.

All that's needed is the generic READ operation for did:key -> ADM, and the new production rules for did+yml.

I would not pretend that calling YAML_to_JSON (or ADM magical translation) on a json object was the same as asking the did method for yaml representation.

It's easy to support YAML conversion for data formats that support conversion... but that's not the same as getting a YAML DID Document as it was intended from the did method.... my problem with your proposal is that it puts the Abstract Data Model in charge of how a DID Method's YAML representation looks... instead of the DID Method.... it's, IMO, not our job to do that.

msporny commented 4 years ago

Part of this issue will get resolved when we resolve the Abstract Data Model produce/consume issue. We have F2F time allocated to do this.

The original issue is a DID Resolver issue, not a DID Core issue, and the suggestion there is that it's out of scope for DID Core.

msporny commented 4 years ago

PROPOSAL: Close this issue once the data model/conversion issues are resolved with the Abstract Data Model since the original issue is out of scope for this WG.

OR13 commented 4 years ago

@msporny technically, resolveRepresentation takes the "accept" parameter... so although http headers are not in scope... figuring out how to handle a request for application/did+yaml when that representation is not supported IS in scope.

OR13 commented 4 years ago

But I agree that we are making progress on this.

peacekeeper commented 4 years ago

I agree with @msporny that this is out of scope. We defined resolve() and resolveRepresentation() as abstract functions with inputs and outputs, but we don't need to specify how exactly those inputs are processed (this can go into the DID Resolution spec).

OR13 commented 4 years ago

https://w3c.github.io/did-core/#did-resolution-input-metadata-properties

This property is OPTIONAL. It is only used if the resolveRepresentation function is called and MUST be ignored if the resolve function is called.

I'm not sure we should define properties and then not provide even basic answers to questions like: what happens if the DID Method resolveRepresentation does not support the requested accept parameter.

Implementers won't produce consistent concrete implementations of resolveRepresentation based on the spec text that exists today.

peacekeeper commented 4 years ago

When we added the sections on resolution and dereferencing, there was already a lot of discussion whether even that was in scope (several people thought it wasn't). The understanding was that we would only define abstract functions with their inputs and outputs, but we would not define how to implement those functions, or how the inputs are processed.

If we now start adding precise rules for processing the "accept" input metadata property, then we will also have to do that for other parts of the spec that affect resolution, such as the DID URL parameters, path component, etc. We would also have to spend a lot more time on defining error codes, and defining when certain error codes and other metadata are returned.

I think all of this should be out of scope and go into DID Resolution instead.

OR13 commented 4 years ago

@jricher @peacekeeper https://w3c.github.io/did-core/#did-resolution-input-metadata-properties

We are already defining error codes, so the question is:

If I request application/did+json from resolveRepresentation, and the DID Method was built to only support application/did+cbor what happens?

In the world of HTTP, when a client requests a format that is not supported by the server we have https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/406

In the world of resolveRepresentation we have ... ? undefined / do whatever you want?
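One way to pin this down, sketched under the assumption that errors are signaled through didResolutionMetadata. The function and constant names are illustrative; "representationNotSupported" is the error string DID Core eventually defined for exactly this case:

```python
# Sketch of an abstract-function analog of HTTP 406 Not Acceptable.
# Assumes errors are returned in didResolutionMetadata; names are illustrative.
SUPPORTED_REPRESENTATIONS = {"application/did+cbor"}  # this method's only format

def resolve_representation(did: str, accept: str) -> dict:
    if accept not in SUPPORTED_REPRESENTATIONS:
        # Counterpart of HTTP 406: refuse rather than silently translate
        return {
            "didResolutionMetadata": {"error": "representationNotSupported"},
            "didDocumentStream": b"",
            "didDocumentMetadata": {},
        }
    # ... method-specific production of the requested representation ...
    return {
        "didResolutionMetadata": {"contentType": accept},
        "didDocumentStream": b"\xa1...",  # CBOR bytes would go here
        "didDocumentMetadata": {},
    }

result = resolve_representation("did:example:123", "application/did+json")
print(result["didResolutionMetadata"]["error"])  # representationNotSupported
```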

kdenhartog commented 4 years ago

This gets very close to a rabbit hole that we ended up trying to avoid. I tend to agree with @peacekeeper that this should fall outside the scope of the abstract function and be defined concretely in the did-resolution specification. What if we move this issue over to that repo and discuss it further there? It's a legitimate question that we should be addressing somewhere.