hyperledger / aries-rfcs

Hyperledger Aries is infrastructure for blockchain-rooted, peer-to-peer interactions
https://hyperledger.github.io/aries-rfcs/
Apache License 2.0

0067-didcomm-diddoc-conventions DID inline representation using DID Query #130

Closed SmithSamuelM closed 4 years ago

SmithSamuelM commented 5 years ago

Using DID Query for Key Material

Introduction

During the Aries WG call on 2019/07/10, a suggestion was made to use DID query parameters as a mechanism for attaching key material, namely the crypto suite type used to create the signature, to inline or ephemeral expressions of DIDs used in DIDComm protocol fields. This is useful for peer DIDs as well as other applications.

One use case is in the service block as follows:

{
  "service": [{
    "id": "did:example:123456789abcdefghi#did-communication",
    "type": "did-communication",
    "priority" : 0,
    "recipientKeys" : [ "did:example:123456789abcdefghi#1" ],
    "routingKeys" : [ "did:example:123456789abcdefghi#1" ],
    "serviceEndpoint": "https://agent.example.com/"
  }]
}

In the block above, the "id" field and the recipientKeys and routingKeys array values could all benefit in some cases from an inline representation of the key material (crypto suite type) to allow participants to verify signatures made with the private keys underlying the associated DIDs without having to look up the associated DID Documents. The inline representation could also be treated as a preload for a key material cache that expires according to another DID query parameter. This provides a compact way of managing the expiration of stale ephemeral DIDs.

Potential Resolution Issues

The relevant documents on DID syntax are the DID spec itself, [W3C DID Spec](https://w3c-ccg.github.io/did-spec/), and the draft DID resolver specification, [W3C DID Resolution Spec](https://w3c-ccg.github.io/did-resolution/).

By way of background, the original design of the DID syntax and semantics included the idea that a DID could include many of the features of a URL (URI, URN). These features included path, query, and fragment components. Initially only the fragment had a specific use case, where a special type of fragment component was identified called a DID Fragment (not to be confused with a generic URL fragment). The query and path were overlooked. Later discussion and pull requests repaired that by including the query and path components as formal parts of the DID ABNF. However, the spec has since evolved, and now the only place that query and path show up in the ABNF is the new DID URL representation. It appears this was done to simplify the DID service endpoint resolution algorithm, but some important semantics may have been lost in the process.

The only use case the specification describes for a DID that includes path, query, or fragment components is a DID URL that includes the matrix parameter "service". The DID path and query on the DID URL would then be appended to the service endpoint URL. This is both good and bad: good because the spec does not prevent us from adding additional semantics to the query, and bad because the spec may not provide sufficient guidance to implementers. Importantly, this may be problematic with respect to the current DID Resolution algorithm spec. My reading of the algorithm is that it is either ambiguous or incomplete and leaves undefined some edge cases. It may simply be that I am misreading the algorithm.

The relevant clauses from the DID spec follow:

did                = "did:" method-name ":" method-specific-id
method-name        = 1*method-char
method-char        = %x61-7A / DIGIT
method-specific-id = *idchar *( ":" *idchar )
idchar             = ALPHA / DIGIT / "." / "-" / "_"
did-url            = did *( ";" param ) path-abempty [ "?" query ]
                     [ "#" fragment ]
param              = param-name [ "=" param-value ]
param-name         = 1*param-char
param-value        = *param-char
param-char         = ALPHA / DIGIT / "." / "-" / "_" / ":" /
                     pct-encoded
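As a rough illustration of the did-url rule above, the grammar can be approximated with a regular expression. This is only a sketch: the names `DID_URL_RE`, `DidUrl`, and `parse_did_url` are made up for this example, pct-encoded characters are not validated, and the path, query, and fragment character sets from RFC 3986 are simplified.

```python
import re
from typing import Dict, NamedTuple, Optional

# Approximate transcription of the did-url ABNF (assumption: simplified
# character classes for params, path, query, and fragment).
DID_URL_RE = re.compile(
    r"^(?P<did>did:[a-z0-9]+:[A-Za-z0-9._-]*(?::[A-Za-z0-9._-]*)*)"
    r"(?P<params>(?:;[^/?#;]+)*)"   # matrix parameters: *( ";" param )
    r"(?P<path>(?:/[^?#]*)?)"       # path-abempty
    r"(?:\?(?P<query>[^#]*))?"      # [ "?" query ]
    r"(?:#(?P<fragment>.*))?$"      # [ "#" fragment ]
)


class DidUrl(NamedTuple):
    did: str
    params: Dict[str, str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def parse_did_url(text: str) -> DidUrl:
    """Split a DID URL into its ABNF components (hypothetical helper)."""
    match = DID_URL_RE.match(text)
    if match is None:
        raise ValueError(f"not a DID URL: {text!r}")
    params = {}
    for item in filter(None, match.group("params").split(";")):
        name, _, value = item.partition("=")
        params[name] = value
    return DidUrl(
        match.group("did"),
        params,
        match.group("path"),
        match.group("query"),
        match.group("fragment"),
    )


if __name__ == "__main__":
    print(parse_did_url("did:example:123456789abcdefghi;service=agent/inbox?x=1#frag"))
```

Note that a plain DID with only a query, such as the ones proposed in this issue, still parses as a valid did-url under this grammar; whether it resolves is a separate question.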

" 4.5 Path A generic DID path is identical to a URI path and MUST conform to the path-abempty ABNF rule in [RFC3986]. A DID path SHOULD be used to address resources available via a DID service endpoint. See Section § 5.6 Service Endpoints . A specific DID scheme MAY specify ABNF rules for DID paths that are more restrictive than the generic rules in this section.

4.6 Query A generic DID query is identical to a URI query and MUST conform to the query ABNF rule in [RFC3986]. A DID query SHOULD be used to address resources available via a DID service endpoint. See Section § 5.6 Service Endpoints . A specific DID scheme MAY specify ABNF rules for DID queries that are more restrictive than the generic rules in this section.

4.7 Fragment A generic DID fragment is identical to a URI fragment and MUST conform to the fragment ABNF rule in [RFC3986]. A DID fragment MUST be used only as a method-independent reference into the DID Document to identify a component of a DID Document (e.g. a unique key description). To resolve this reference, the complete DID URL including the DID fragment MUST be used as the value of the key for the target component in the DID Document object. A specific DID scheme MAY specify ABNF rules for DID fragments that are more restrictive than the generic rules in this section. It is desirable that we enable tree-based processing of DIDs that include DID fragments (which resolve directly within the DID document) to locate metadata contained directly in the DID document or the service resource given by the target URL without needing to rely on graph-based processing. Implementations SHOULD NOT prevent the use of JSON pointers ([RFC6901]). "

Note that for the DID path and query, only SHOULD is applied to usage in a DID URL with a service endpoint, so other usages are allowed. The DID fragment uses MUST, but the use case is not sufficiently well defined and should allow other uses outside the defined use case.

To be more specific, in the DID resolution spec the DID URL Resolution algorithm has a switch with 3 cases. These cases are summarized below (intermediate processing steps elided) with their input conditions and results. I assume that the rules are processed in order. Because the resultant actions specify a return, I am assuming that the rule processing is aborted once a valid antecedent is reached. In other words, if the antecedent is true then the consequent is evaluated and no further rules are processed; if the antecedent is false then processing skips to the next rule. If instead the rules are cumulative, in that "return" does not mean return but means "include in the result", and all the antecedents are always checked, then the rules may be valid.

A) IF the input DID URL is equal to the input DID itself: THEN Return the resolved DID Document.

Not sure how to interpret this. If the DID URL is just a valid expression for a DID then this rule will always be true even with a query, path, or fragment component. That would mean the next two rules are never evaluated. If instead a DID URL with a path, query, or fragment is not equal to the DID itself, then the resolver will discard the DID unless it satisfies B or C.

B) IF the input DID URL contains the matrix parameter service and optionally a DID Path, DID Query, and/or DID Fragment: THEN Return the output service endpoint URL.

C) IF the input DID URL contains a DID Fragment: THEN Return the output resource. (JSON-LD object whose id property matches the input DID URL)

This seems to be too greedy as it does not allow for JSON Pointer or other forms of resolution into the DID document. Not all fragments correspond to an id property. This seems to be problematic no matter the definition.

Semantics

In a conventional URL the path component may be empty. The path component consists of a sequence of path segments separated by a slash (/). In a conventional URL (not a DID URL) a path is ALWAYS defined for a URI, though the defined path may be empty (zero length). When the path is empty there is some default resource that is ALWAYS provided. This is not the case for a DID URL; that is, when a path component is missing there is no defined default resource unless the DID URL includes a service matrix parameter. Then the default resource is provided by the service endpoint.

I suggest that there is a valid default resource that could be defined for a DID URL for all other cases except when the DID URL includes a service matrix parameter. That default would be the did document itself. This is consistent with the precedent set by the DID Fragment specification where the fragment resolves to an object within the DID document. To be consistent with the conventional URL syntax, a default path component in a DID URL would also ALWAYS be defined. The semantic is as follows: WHEN the DID includes a service matrix parameter THEN the default path is given by the service endpoint resolution algorithm, OTHERWISE the default path resolves to the DID itself and its metadata from the DID Document. The resolution of a non-empty path (i.e. no default) would be method dependent.

Likewise the semantics of DID query could also be defined: WHEN the DID includes a service matrix parameter THEN the DID query is applied to the service endpoint. OTHERWISE the DID query is applied to the resource specified by the path. When the path is empty, that resource is the default path, which would be the DID itself and its metadata from the DID document.
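The proposed default semantics for path and query could be sketched roughly as follows. This is an illustration under assumptions, not spec text: `split_did_url` and `describe_target` are hypothetical helpers, the returned labels are invented for the example, and the actual lookup of a DID document or service endpoint is left out.

```python
from urllib.parse import parse_qs


def split_did_url(did_url):
    """Split a DID URL into (did, params, path, query, fragment)."""
    rest, _, fragment = did_url.partition("#")
    rest, _, query = rest.partition("?")
    slash = rest.find("/")
    head, path = (rest, "") if slash < 0 else (rest[:slash], rest[slash:])
    did, *raw_params = head.split(";")
    params = dict(p.partition("=")[::2] for p in raw_params)
    return did, params, path, query, fragment


def describe_target(did_url):
    """Say which resource the path and query apply to under the proposal."""
    did, params, path, query, _fragment = split_did_url(did_url)
    if "service" in params:
        # Path and query are handed to the service endpoint.
        return ("service-endpoint", did, parse_qs(query))
    if path:
        # A non-empty path without a service parameter is method dependent.
        return ("method-specific", did, parse_qs(query))
    # Empty path: default resource is the DID and its DID document metadata.
    return ("did-document", did, parse_qs(query))


if __name__ == "__main__":
    print(describe_target("did:example:123?auth_type=Ed25519VerificationKey2018"))
    print(describe_target("did:example:123;service=agent/inbox?x=1"))
```

The point of the sketch is that every DID URL lands in exactly one branch, so a query on a plain DID always has a defined resource to modify.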

In the conventional URL usage, a query string modifies the resource specified by the URL. Originally the main use case was to provide the data values for the fields belonging to a form resource. This usage is evocative of the proposed use herein of providing the field values for a key material authentication block associated with the DID as a resource.

What remains then is to specify what the query parameters should be.

I suggest mimicking the field names from the Authentication block.

{
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:example:123456789abcdefghi",
      "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV"
    }

Maybe put auth in front, such as:

did:example:12345678abcees?auth_type=Ed25519VerificationKey2018

Another would be an expiration date, which would force a DID document resolution to get the most recent authentication block:

did:example:12345678abcees?auth_type=Ed25519VerificationKey2018&auth_expires=20190712
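A consumer of such a DID might read the inline key material roughly like this. The sketch assumes the `auth_type` and `auth_expires` names from the examples above, a `YYYYMMDD` date format, and that the inline material stays valid through the expiration date itself; none of this is defined by any spec yet, and `inline_key_material` is a hypothetical helper.

```python
from datetime import date, datetime
from urllib.parse import parse_qs


def inline_key_material(did_url, today):
    """Extract inline auth fields and decide whether a lookup is needed."""
    base = did_url.partition("#")[0]          # drop any fragment first
    did, _, query = base.partition("?")
    fields = {name: values[0] for name, values in parse_qs(query).items()}

    auth_type = fields.get("auth_type")
    expires = fields.get("auth_expires")
    expired = False
    if expires is not None:
        # Assumption: valid through the stated day, stale afterwards.
        expired = datetime.strptime(expires, "%Y%m%d").date() < today

    return {
        "did": did,
        "auth_type": auth_type,
        # Fall back to DID document resolution when nothing usable is inline.
        "needs_resolution": auth_type is None or expired,
    }


if __name__ == "__main__":
    url = ("did:example:12345678abcees"
           "?auth_type=Ed25519VerificationKey2018&auth_expires=20190712")
    print(inline_key_material(url, date(2019, 7, 10)))
```

With this shape, the query acts as a cache preload: the verifier only resolves the DID document once the inline entry is missing or stale.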

TODO:

Clarify DID URL Resolution algorithm

Define Fields

Update associated specifications

Another use case of the DID query is for Hierarchical Deterministic (HD) keychains. The query string can include the derivation path for, say, a BIP-44 derived DID. This is described in this paper:

https://github.com/WebOfTrustInfo/rwot6-santabarbara/blob/master/final-documents/DecentralizedAutonomicData.pdf
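To make the idea concrete, here is a minimal sketch of reading a derivation path out of a DID query. The parameter name `hd_path` is purely hypothetical (it is not taken from the paper or any spec), as is the helper `derivation_indices`; the path notation and the hardened-index offset follow BIP-32/BIP-44.

```python
from urllib.parse import parse_qs

HARDENED = 0x80000000  # BIP-32 offset added to hardened child indices


def derivation_indices(did_url):
    """Return BIP-32 child indices from a hypothetical hd_path parameter."""
    query = did_url.partition("#")[0].partition("?")[2]
    values = parse_qs(query).get("hd_path")
    if not values:
        return None  # no derivation path carried inline
    parts = values[0].split("/")
    if parts[0] != "m":
        raise ValueError("derivation path must start at the master key 'm'")
    indices = []
    for part in parts[1:]:
        hardened = part.endswith("'")
        number = int(part.rstrip("'"))
        indices.append(number + HARDENED if hardened else number)
    return indices


if __name__ == "__main__":
    print(derivation_indices("did:example:123?hd_path=m/44'/0'/0'/0/5"))
```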

These are now two cogent use cases that use the DID Query for an important purpose other than modifying a service endpoint.

SmithSamuelM commented 5 years ago

One of the advantages of having a default definition for the path even when the path component is missing is that fragment and query also have default definitions without any extra work. This also closes a semantic hole.

TelegramSam commented 5 years ago

Using query param syntax is, I think, most useful when adding type and format details to an inline key. It feels like an odd way to add a suffix, but it is a common format that doesn't seem to conflict with key encodings.

Example: ?keytype=xxx&keyformat=xxx

peacekeeper commented 5 years ago

@SmithSamuelM

the recipientKeys and routingKeys array values could all benefit in some cases from an inline representation of the key material (crypto suite type) to allow participants to verify signatures made with the private keys underlying the associated DIDs without having to look up the associated DID Documents.

The idea has recently come up that if certain fields of the DID Document reference a key, a DID Resolver could have a function to expand those referenced keys into inline keys: https://github.com/w3c-ccg/did-resolution/issues/39

My reading of the algorithm is that it is either ambiguous or incomplete and leaves undefined some edge cases.

Yes, the algorithm is at a very early stage and needs more work. For now, it only captures the three most common use cases that seem to have broadest support in the community (see your A, B, C below). There are some known ambiguities and edge cases in the algorithm which we'll try to fix asap.

Note that for the DID path and query, only SHOULD is applied to usage in a DID URL with a service endpoint, so other usages are allowed. The DID fragment uses MUST, but the use case is not sufficiently well defined and should allow other uses outside the defined use case.

I agree we probably want to change this MUST. DID URLs with a "service" matrix parameter can also contain a fragment, in which case it is not a reference into the DID Document.

I assume that the rules are processed in order. Because the resultant actions specify a return, I am assuming that the rule processing is aborted once a valid antecedent is reached.

This is correct. We should probably explicitly state this in the spec if it is not clear enough.

A) IF the input DID URL is equal to the input DID itself: THEN Return the resolved DID Document.

Not sure how to interpret this. If the DID URL is just a valid expression for a DID then this rule will always be true even with a query, path, or fragment component.

This is supposed to mean if the DID URL has an empty path, no query, and no fragment. "input DID" means only the did ABNF rule. In that case the "input DID URL" (did-url ABNF rule) is the same as the "input DID" (did ABNF rule).

Actually that's not entirely correct. For example, this rule should still trigger if certain matrix parameters are included as part of the DID URL, even though then the "input DID URL" is not the same as the "input DID".

B) IF the input DID URL contains the matrix parameter service and optionally a DID Path, DID Query, and/or DID Fragment: THEN Return the output service endpoint URL.

Correct.

C) IF the input DID URL contains a DID Fragment: THEN Return the output resource. (JSON-LD object whose id property matches the input DID URL)

This seems to be too greedy as it does not allow for JSON Pointer or other forms of resolution into the DID document. Not all fragments correspond to an id property. This seems to be problematic no matter the definition.

I agree this can probably be improved, although I do consider this particular use of fragment (corresponding to an id field) the "main" use. This is a core feature of RDF and Linked Data web architecture.

Note that section 5.3 lists additional features that haven't been incorporated yet into the algorithm, among them JSON Pointers, see section 5.3.3.

There is at least one more use of fragments in the community; in did:ipid, the fragment of a DID URL contains a secret key used for decrypting (parts of) the DID Document.

When the path is empty there is some default resource that is ALWAYS provided. This is not the case for a DID URL; that is, when a path component is missing there is no defined default resource unless the DID URL includes a service matrix parameter. Then the default resource is provided by the service endpoint.

I suggest that there is a valid default resource that could be defined for a DID URL for all other cases except when the DID URL includes a service matrix parameter. That default would be the did document itself.

+1. The default is the DID Document itself. This is exactly what rule A) (see above) is trying to say. Sorry if the language isn't clear enough, I'd be happy to reword this.

This is consistent with the precedent set by the DID Fragment specification where the fragment resolves to an object within the DID document.

+1

The resolution of a non-empty path (i.e. no default) would be method dependent.

I think I agree with this too. In the algorithm, I was thinking that anything that isn't captured by one of the specified rules would essentially be method-specific. Using a non-empty path is also one option (among others) that has been explored for referencing non-DID-Document resources in the Decentralized Identifier Registry (e.g. schemas, credential definitions in the case of Sovrin).

Likewise the semantics of DID query could also be defined:

We have talked very little so far about the use of DID query, and I'd like to explore this further. I suppose it could be used for a kind of query language as you mention (e.g. selecting service blocks by type). But wouldn't this already be covered by JSON Pointer? Matrix parameters service-type and key-type have also been proposed for this.

@dmitrizagidulin has commented that query parameters are scoped to the URL's authority, i.e. we are not supposed to define universal query parameters that have the same semantics across all DIDs.

@talltree has argued in his Algorithmic Construction doc that the DID Spec should not restrict the semantics of path, query, fragment at all, to leave freedom for developers to use them any way they want, and that the use of matrix parameters is preferable.

SmithSamuelM commented 5 years ago

@peacekeeper

The idea has recently come up that if certain fields of the DID Document reference a key, a DID Resolver could have a function to expand those referenced keys into inline keys: w3c-ccg/did-resolution#39

That suggestion is not the same thing. Expanding keys within the context of a DID Document is orthogonal to having an inline DID that does not need to reference or look up a DID Document, or for which the DID Document is not essential to some uses of the DID.

This may be the core of the issue/misunderstanding. DIDs have use cases where an "ephemeral" DID is used. In these use cases the authentication information aka the crypto suite type may be provided by a query string on the DID.

SmithSamuelM commented 5 years ago

Note that for the DID path and query, only SHOULD is applied to usage in a DID URL with a service endpoint, so other usages are allowed. The DID fragment uses MUST, but the use case is not sufficiently well defined and should allow other uses outside the defined use case.

I agree we probably want to change this MUST. DID URLs with a "service" matrix parameter can also contain a fragment, in which case it is not a reference into the DID Document.

@peacekeeper What is the process for following up on getting this changed, or for resolving any issues that would keep it from being fixed?

SmithSamuelM commented 5 years ago

@peacekeeper Based on your answers to A, B, and C above, there is a serious problem with the DID resolver logic. It will discard (not resolve) any DIDs that have query strings but are not service-endpoint-specific DID URLs. This is wrong =) The logic should not discard DIDs in these cases. The DID URL service endpoint is a special case, not the other way around.
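The gap can be seen in a small sketch of the three rules as an ordered switch. This is my reading of the algorithm, not the spec's own pseudocode: rule A is taken strictly as "the DID URL equals the bare DID" (no matrix parameters, path, query, or fragment), and `resolve_current` with its string results is hypothetical.

```python
def resolve_current(did_url):
    """Apply rules A, B, C in order; return None when no rule matches."""
    rest, has_fragment, _ = did_url.partition("#")
    rest, has_query, _ = rest.partition("?")
    slash = rest.find("/")
    head, path = (rest, "") if slash < 0 else (rest[:slash], rest[slash:])
    _did, *raw_params = head.split(";")
    param_names = {p.partition("=")[0] for p in raw_params}

    # Rule A: the input DID URL is equal to the input DID itself.
    if not raw_params and not path and not has_query and not has_fragment:
        return "DID document"
    # Rule B: the "service" matrix parameter is present.
    if "service" in param_names:
        return "service endpoint URL"
    # Rule C: a DID fragment is present.
    if has_fragment:
        return "resource whose id matches the DID URL"
    # Nothing matched: the DID URL is left unresolved.
    return None


if __name__ == "__main__":
    for url in (
        "did:example:123",
        "did:example:123;service=agent/inbox?x=1",
        "did:example:123#keys-2",
        "did:example:123?auth_type=Ed25519VerificationKey2018",
    ):
        print(url, "->", resolve_current(url))
```

Under this reading, the last input, a plain DID carrying only a query, falls through all three rules.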

SmithSamuelM commented 5 years ago

We have talked very little so far about the use of DID query, and I'd like to explore this further. I suppose it could be used for a kind of query language as you mention (e.g. selecting service blocks by type). But wouldn't this already be covered by JSON Pointer? Matrix parameters service-type and key-type have also been proposed for this.

The JSON pointer references into a DID Document, so it is not useful for an ephemeral DID that does not look up a DID Document. The Query makes the DID self-contained. This is an important use case. It seems that the focus of DID resolvers has narrowed so much that it assumes too much. A DID resolver should still resolve a DID and return it even if the DID has a query string but is not part of a DID URL that resolves to a service endpoint. This should be obvious; otherwise the DID spec would have to limit the use of DID query to only the service endpoint case.

@dmitrizagidulin has commented that query parameters are scoped to the URL's authority, i.e. we are not supposed to define universal query parameters that have the same semantics across all DIDs.

I am not suggesting universal query parameters. I am stating that any DID with query parameters that is not part of a service endpoint matrix parameter DID URL is not resolvable with the current DID resolver logic. This prevents any use of the query except as part of a service endpoint.

@talltree has argued in his Algorithmic Construction doc that the DID Spec should not restrict the semantics of path, query, fragment at all, to leave freedom for developers to use them any way they want, and that the use of matrix parameters is preferable.

I agree, except that the default semantics should be defined so that the default path resolves to the DID document. A service endpoint is a different use case; in that case the path, query, and fragment should be left up to the service endpoint to decide. But when no service endpoint is provided, it makes perfect sense, and follows the current precedent of the fragment, for the path to resolve to the DID Document. How the query is interpreted in that case is method dependent. But the current resolver logic discards any DID with a Query that is not a service endpoint. That is the point of my comment and questions. It seemed that that was the case but I could not be sure. Based on your answers (as I understand them) that indeed is the case and IMHO needs to be fixed.

SmithSamuelM commented 5 years ago

The majority of this discussion may be more appropriately considered an issue for the DID resolver specification.

I have created an issue, https://github.com/w3c-ccg/did-resolution/issues/42, to continue the discussion.

TelegramSam commented 4 years ago

Discussed on the Aries WG Call 2019-12-11, and decided that it was complete.