Closed dmitrizagidulin closed 2 years ago
@msporny - we went with the colon syntax so that we could enable DID relative fragments, as described in DID Core in Example 6. (Although I personally don't see the usefulness of relative DID dereferenced URLs, I do want to respect the spec.)
@dmitrizagidulin,
Looks like you found a typo in that example (and I saw at least one more). The value MUST be percent encoded, so there would be no slashes there. The normative definition for relativeRef is correct though:
If present, the associated value MUST be an ASCII string and MUST use percent-encoding for certain characters as specified in RFC3986 Section 2.1.
https://www.w3.org/TR/did-core/#did-parameters
In other words, using / in your DID should have nothing to do with whether or not you can use relativeRef; the value of relativeRef MUST be percent encoded.
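To make the point concrete, here is a minimal Python sketch of percent-encoding a relativeRef value so that no raw slash survives. The DID, the service name, and the path are made up for illustration; only the encoding step reflects the requirement quoted above.

```python
from urllib.parse import quote, unquote

# Hypothetical relative reference to carry in the relativeRef parameter.
relative_ref = "/path/to/resource"

# safe="" forces "/" to be percent-encoded too (quote() leaves it alone by default).
encoded = quote(relative_ref, safe="")

# Hypothetical DID URL; the DID and the service name are illustrative only.
did_url = f"did:example:123?service=files&relativeRef={encoded}"

# Decoding the parameter value recovers the original reference.
decoded = unquote(encoded)
```

With the value encoded this way, the only delimiters left in the DID URL are the ones the DID URL syntax itself defines.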
Ping for reviews / feedback; otherwise this will close in a week or 6 months.
@OR13 - any objections to this PR?
This PR results in asymmetric handling of URL paths with percent-encoded characters.
To avoid breaking those URLs, those would need double-encoding.
I think we're better off scrapping the colon-delimited scheme.
This syntax breaks interop with the PATH parameter in a DID URL and needlessly introduces breaking changes. Unless a substantial number of existing implementers come forward planning to support this, I suggest we close.
Similarly, mesur.io does not intend to implement this or other breaking changes
Cross posted to did core, https://github.com/w3c/did-core/issues/821
removing pending close label.
This syntax breaks interop with the PATH parameter in a DID URL
the value of relativeRef MUST be percent encoded.
I don't understand what this PR has to do with either the path in a DID URL, or with relativeRef.
This only seems to be about the method-specific-id of the DID, not anything else in a DID URL.
I think the PR is fine.
I don't think addressing this issue requires a breaking change to the current did:web method. IMO it could be addressed by simply adding some language saying something like "Colon characters MUST NOT be used in path elements for the target HTTPS URL".
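A check for the restriction suggested above could look roughly like the following Python sketch. The function name is hypothetical, and it only looks for literal colons in the path, so a percent-encoded %3A would pass.

```python
from urllib.parse import urlsplit

def path_has_colon(https_url: str) -> bool:
    """Return True if the path of the URL contains a literal colon.

    Hypothetical helper: a colon in the authority (for example a port)
    is not part of the path and is ignored.
    """
    return ":" in urlsplit(https_url).path
```

Under such a rule, https://foo.example/users:jane would be rejected as a did:web target, while https://foo.example:8443/users/jane would be accepted.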
@peacekeeper wrote:
I don't understand what this PR has to do with either the path in a DID URL
DID:WEB rewrites https URL paths to fit them into the method-specific-identifier. This is necessary because the DID core spec reserves DID URL path elements for navigation within a single DID document, and this is the reason there is confusion around percent encoding and percent decoding.
It would be preferable to handle DID URL path elements the same way that RFC 3986 handles generic URI path elements - as information locating the resource (DID document) rather than something inside the resource. URI fragment elements are used to locate information inside a resource.
The same logic applies to query elements.
For the most part, the logic should align with RFC3986, and special behavior does not need to be defined. Instead, you would be better off referencing the relevant parts of that RFC and defining test vectors.
E.g. on Percent-Encoding:
A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component.
Typically a percent-encoded character has no distinct semantic meaning if the encoded characters did not meet this requirement and could have been representable without encoding. However, this determination is typically done as part of the interpretation of the URL, possibly with some language library help.
For example, the use of application/x-www-form-urlencoded data within a query or fragment is not defined by the base URI spec or the definition of the HTTP URI scheme, but is so common that people may assume so.
As such, I would recommend the following behavior for converting a did:web to a https URL.
- [...] (:) into individual components.
- [...] pchar [...] as defined in RFC3986)
- [...] did.json
- [...] https scheme, the determined host, the optional port, and the path, along with any originally specified fragment.
For the most convoluted example I can manufacture, did:web:%F0%9F%92%A9.la%3A8443:foo%3Abar%2Fbaz#x:
Note: This includes punycode, which may very well be something that is explicitly not supported.
%F0%9F%92%A9.la%3A8443
foo%3Abar/baz
💩.la:8443
[💩.la](https://xn--ls8h.la/)
8443
xn--ls8h.la
For our single additional component, foo%3Abar%2Fbaz:
foo:bar/baz
pchar
:
foo:bar%2Fbaz
foo:bar%2Fbaz
did.json
foo:bar%2Fbaz/did.json
https://xn--ls8h.la:8443/foo:bar%2Fbaz/did.json#x
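One possible reading of the worked example above, as a Python sketch. The wording of the steps is only partly legible in this thread, so the order here is inferred from the worked values rather than taken from the original list. The function name is hypothetical, IPv6 literals are not handled, and Python's built-in idna codec (which follows the older IDNA 2003 rules) stands in for whatever IDN handling a real implementation would choose. The current spec's /.well-known case for identifiers without a path is not covered.

```python
from urllib.parse import quote, unquote

# pchar (RFC 3986) = unreserved / pct-encoded / sub-delims / ":" / "@".
# quote() never encodes unreserved characters, so only the rest are listed.
PCHAR_SAFE = "!$&'()*+,;=:@"

def did_web_to_https(did_url: str) -> str:
    """Sketch of the did:web to https conversion suggested above."""
    did, has_fragment, fragment = did_url.partition("#")
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")

    # Split the method-specific identifier on colons.
    components = did[len(prefix):].split(":")

    # First component: percent-decode, then separate host and optional port.
    authority = unquote(components[0])
    host, _, port = authority.partition(":")  # assumption: no IPv6 literal
    # Convert an internationalized host to ASCII; ASCII hosts pass through.
    ascii_host = host.encode("idna").decode("ascii")

    # Other components: percent-decode, then re-encode what is not a pchar.
    segments = [quote(unquote(c), safe=PCHAR_SAFE) for c in components[1:]]
    segments.append("did.json")

    url = "https://" + ascii_host
    if port:
        url += ":" + port
    url += "/" + "/".join(segments)
    if has_fragment:
        url += "#" + fragment
    return url
```

For an ASCII host, did:web:example.com%3A8443:foo%3Abar%2Fbaz#x comes out as https://example.com:8443/foo:bar%2Fbaz/did.json#x, which mirrors the path handling in the worked example: the colon is restored because it is a valid pchar, while the slash stays encoded so it is not mistaken for a segment delimiter.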
@gribneau
This is necessary because the DID core spec reserves DID URL path elements for navigation within a single DID document
No it doesn't. Do you have a reference where you found this information?
It would be preferable to handle DID URL path elements the same way that RFC 3986 handles generic URI path elements - as information locating the resource
That's actually how it is. A DID URL with a path can be used to locate any type of resource, even an image, arbitrary JSON data, PDF, etc. And the fragment is used to reference a secondary resource that is part of, or related to, the primary resource.
did:indy is an interesting method that uses DID URLs with paths to locate resources such as schemas, credential definitions, etc:
https://hyperledger.github.io/indy-did-method/#did-urls-for-indy-object-identifiers
@OR13 wrote:
As of today, Transmute does not intend to implement this breaking change.
@mprorock wrote:
Similarly, mesur.io does not intend to implement this or other breaking changes
path isn't defined in Section 2.3 Method-specific identifier -- you can't make a breaking change to something that was never defined. When are the spec editors going to fix this bug? :)
I also want to make sure everyone understands the consequences of keeping the specification, which is broken when it comes to colon processing (and URL processing, in general), as-is. Here are places where colons can legitimately be placed that will break did:web implementations if the spec doesn't change in a breaking way:
- host of the authority section: Colons are used for IPv6 addresses, which are used extensively for IoT and can be placed in the SAN field of an X509 certificate. While the did:web spec says you're forbidden from using IP addresses in TLS certs, it is a thing as defined in RFC3779 and RFC6487. IoT devices use TLS certificates that contain IPv6 addresses in the SAN field -- we might want to check in with the WoT if this is an issue for them. The last time I checked, they were using did:key instead, but were considering did:web.
- userinfo in the authority section of the scheme. Colon is a valid subdelimiter. The (incomplete) encoding rules in 2.3 Method-specific identifier, coupled with the decoding rules in 2.4.2 Read (Resolve) result in did:web URLs that will be mistransformed before being retrieved.
- [...] path meant path in RFC3986, which means that slashes and colons are allowed characters. I'll include the text from the specific section in RFC3986 below:
As you can see above... path includes the colon character... and slash characters. Colons are used to delimit user accounts on Mediawiki and Wikipedia:
https://en.wikipedia.org/wiki/Wikipedia:Wikipedians
https://commons.wikimedia.org/wiki/User:Ser_Amantio_di_Nicolao
Are we starting to see the problem here? This PR is at least attempting to fix the "colons in segments" issue... but, unless I'm missing something, the problem runs much deeper than that. What am I missing? Where is path defined?
I'm in favor of implementing @dwaite 's suggestion, I would approve a PR that implements it and provides test vectors.
I don't think we should address each encoding issue that deviates from RFC 3986 in a one-off section of text in a CCG draft. That will quickly lead to a painful spec that does not make good use of existing normative references or provide concrete test vectors for proving interop.
I'm in favor of implementing @dwaite 's suggestion, I would approve a PR that implements it and provides test vectors.
Or we could replace that complex algorithm (which, don't get me wrong is impressive, @dwaite) with this:
Convert to did:web:
Convert to https:
... and get rid of all of the unnecessary complexity and deviation from RFC 3986. :)
Or we could replace that complex algorithm
I never really understood why people invented did:web in the first place instead of just using https://
I never really understood why people invented did:web in the first place instead of just using https://
Because we needed:
In theory, we could:
application/did+* -- but then the people that don't like content negotiation might get cranky -- and the people using did:web today will /definitely/ get cranky. :)
I expect trying those things will be messier than just getting did:web right. :)
Perhaps you should register did https?
@gribneau
This is necessary because the DID core spec reserves DID URL path elements for navigation within a single DID document
No it doesn't. Do you have a reference where you found this information?
It would be preferable to handle DID URL path elements the same way that RFC 3986 handles generic URI path elements - as information locating the resource
That's actually how it is. A DID URL with a path can be used to locate any type of resource, even an image, arbitrary JSON data, PDF, etc. And the fragment is used to reference a secondary resource that is part of, or related to, the primary resource.
DID URLs are not supported in the id element of the DID document. Only a DID (and not a DID URL) can represent the entire document.
DID:WEB rewrites https URL paths to fit them into the method-specific-identifier. This is necessary because the DID core spec reserves DID URL path elements for navigation within a single DID document, and this is the reason there is confusion around percent encoding and percent decoding.
As @msporny pointed out, the path part of a DID URL is not a part of the method-specific ID. There's a conflation going on here that appears to be confusing some contributors. I thought Manu was confused at first, but his smiley face convinced me he was trying humor to point out the distinction.
It would be preferable to handle DID URL path elements the same way that RFC 3986 handles generic URI path elements - as information locating the resource (DID document) rather than something inside the resource. URI fragment elements are used to locate information inside a resource.
DID-URL path elements are handled the same way that RFC3986 handles generic path elements. It is just that the DID-URL != DID. And the method-specific-id is part of the DID--and only by inclusion is it part of the DID-URL. By which I mean the path part of the DID-URL is a completely different component than the URL path of did:web that gets encoded into the method-specific identifier. You could have a did:web DID-URL with a path part above and beyond the path encoded in the DID itself. The interpretation of that path part is entirely up to the did:web spec.
What I think is causing some confusion is that the resource involved in a DID URL is NOT the DID Document (as implied by @gribneau).
For example,
- did:ex:abc refers to an arbitrary subject
- did:ex:abc#key1 generally refers to the "key1" node in the DID Document
- did:ex:abc/key1 has no consensus meaning
- did:ex:abc?key1=abc has no consensus meaning.
The meaning of fragment, path, and query parts is up to the DID Method to define, as long as their representation of those parts in the DID-URL itself is consistent with RFC3986.
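The distinction between the DID and the other parts of a DID URL can be sketched in Python as below. The type and function names are hypothetical, and the split follows the RFC 3986 delimiter order only; it does not validate the DID itself or attach any meaning to the parts, since that is left to the DID method.

```python
from typing import NamedTuple

class DidUrlParts(NamedTuple):
    did: str       # the DID proper, e.g. "did:ex:abc"
    path: str      # includes the leading "/", or "" if absent
    query: str     # without the leading "?"
    fragment: str  # without the leading "#"

def split_did_url(did_url: str) -> DidUrlParts:
    """Split a DID URL into the DID and its path, query, and fragment."""
    # The fragment is cut first, then the query, then the path.
    rest, _, fragment = did_url.partition("#")
    rest, _, query = rest.partition("?")
    slash = rest.find("/")
    if slash == -1:
        return DidUrlParts(rest, "", query, fragment)
    return DidUrlParts(rest[:slash], rest[slash:], query, fragment)
```

All four of the examples above yield the same did value, did:ex:abc, which is the only part that resolves to a DID Document under the reading given in this comment.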
To wit, with did:cosmos we use the path part to identify downloadable or interactive resources defined in the linkedResource property of the DID Document (an IID Resource) and use the fragment part to identify addressable entities within the namespace of the DID (called an IID Reference). These are based on the Interchain Identifier Specification at https://github.com/EarthProgram/Identifiers/blob/main/index.md.
did:cosmos DID URLs for IID resources are "locating the resource", which you might think of as "in the DID (namespace)", in the same way that normal web resources are "in the website's namespace".
Note that "normal" URLs point to different resources by having different paths. http://example.com/file1.png and http://example.com/file2.png are different resources, both mediated by the authority part of example.com (the actual resources may be on a server anywhere thanks to redirection).
DID URLs, especially how they are used in did:cosmos (and other IIDs) match this behavior exactly. Each DID-URL with a path part can separately point to different resources, just like regular URLs.
I believe Orie is just fixing a round-trip encoding algorithm problem with did:web in particular, which has nothing to do with did-core.
So let's not conflate the resource of a DID-URL with the Subject of the DID nor the DID Document. The resource of a DID-URL is defined within the context of the DID and likely declared or presented within the DID Document, but they can refer to ANY resource, not just the DID Document.
I am not confused @jandrieu, I simply disagree with the core specification's interpretation of RFC 3986.
@msporny wrote:
Note that there will be a PR to deprecate the colon syntax and prefer just straight HTTP URL translation in time.
This cannot currently happen because the DID Subjects of both of these would be did:web:example.com, which is inappropriate for obvious reasons:
did:web:example.com/alice
did:web:example.com/bob
The limitation is imposed by section 5.1.1.
In the presence of that limitation, the handling of path and query in DID URLs can only be seen as consistent with the fragment in RFC 3986, which is distinguished from path and query by virtue of the secondary resource reference:
The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information.
In contrast, both path and query "serve to identify a resource", while the authority preceding them does not identify a resource at all.
It is unfortunate that the path and query sections of the RFC do not use the "primary resource" terminology. This confusion might have been avoided if they had.
@gribneau wrote:
This cannot currently happen because the DID Subjects of both of these would be did:web:example.com, which is inappropriate for obvious reasons
Yep, @gribneau is correct... several technical issues:
1) DID Core is too restrictive when it comes to the id field in a DID Document. We should've supported a DID URL instead of just a DID in that position -- we can still fix this in DID v2.0 (since that would just be an expansion of the current normative statement).
2) did:web has yet to properly define what it meant by path ... which led some to interpret it as path per RFC 3986, while others interpreted it as "some complex new way of encoding URL paths using colon syntax", which creates multiple incompatibilities with RFC 3986 when it comes to round-tripping these values. All of this would be much easier if we could all just agree that did:web's method-specific-identifier ABNF is just plain incomplete/wrong.
3) This PR tries to fix the latter to make colon-path encoding work in a way that could be more interoperable, but it might be that this whole endeavor is misguided to begin with (and did:web is thoroughly broken in its current state).
In other words, use the rules in the did:web spec to transform these HTTPS URLs into did:web DID URLs and back out again:
https://foo.example/users:jane
https://foo.example/users:jane#keys:1
https://foo.example/users:jane?timestamp=2022-04-22T19:55:27.730Z#keys:1
I suggest that there are no rules in the did:web spec that tell you how to round trip those URLs. If someone knows of any, please point them out to me 'cause I can't find them in the spec.
One interpretation is that method-specific-id was always meant to be just the URL-encoded authority in did:web -- so, for the above, the "method-specific-id" is foo.example ... everything up to the first slash... and the subject identifier in the DID Document was everything up to the first ? or # or the end of the URL -- for example did:web:foo.example/users:jane ... where you could state something like the following and it would make sense:
"verificationMethod": "did:web:foo.example/users:jane#keys:1"
OR
"verificationMethod": "did:web:foo.example/users:jane?timestamp=2022-04-22T19:55:27.730Z#keys:1"
... but it seems like some are saying (without this current PR), no, the proper encoding of those URLs is actually this:
"verificationMethod": "did:web:foo.example:users:jane#keys:1"
OR
"verificationMethod": "did:web:foo.example:users:jane?timestamp=2022-04-22T19:55:27.730Z#keys:1"
which, when going back through round tripping (per the rules in the current did:web specification) would be turned into these HTTPS URLs:
"verificationMethod": "https://foo.example/users/jane#keys/1/did.json"
OR
"verificationMethod": "https://foo.example/users/jane?timestamp=2022-04-22T19/55/27.730Z#keys/1/did.json"
The former approach seems to work (and is undocumented); the latter approach is broken (note all the crazy slashes that exist in the URL that shouldn't be there... this seems to be what some in this thread are suggesting the current spec states). Or multiple variations of different interpretations in between. What am I missing? Can someone round-trip those URLs for me in a way that is consistent?
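The broken outcome described above can be reproduced with a deliberately naive Python sketch. It assumes the reading characterised in this comment: everything after did:web: is treated as the identifier, every colon becomes a slash, and /did.json is appended at the very end. The function name is hypothetical, and this illustrates the failure mode rather than any particular implementation.

```python
from urllib.parse import unquote

def naive_did_web_to_https(did_url: str) -> str:
    """Naive transformation: colons to slashes, decode, append /did.json."""
    prefix = "did:web:"
    if not did_url.startswith(prefix):
        raise ValueError("not a did:web identifier")
    # No attempt is made to separate the query or the fragment first,
    # which is exactly what mangles them.
    identifier = did_url[len(prefix):]
    return "https://" + unquote(identifier.replace(":", "/")) + "/did.json"
```

Feeding it did:web:foo.example:users:jane#keys:1 gives https://foo.example/users/jane#keys/1/did.json: the colon in the original path segment users:jane is gone, the fragment is rewritten, and did.json lands after the fragment instead of in the path.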
DID Core is too restrictive when it comes to the id field in a DID Document. We should've supported a DID URL instead of just a DID in that position -- we can still fix this in DID v2.0 (since that would just be an expansion of the current normative statement).
IMHO, the problem is that the usage of DID path is not restrictive enough - there isn't currently a way to differentiate a DID URL path that conforms to the described behavior of the method on DID "CRUD" operations. The resource might instead be hosted content or a service, and may support additional interactions outside the DID resolution definition.
@msporny wrote:
- DID Core is too restrictive when it comes to the id field in a DID Document. We should've supported a DID URL instead of just a DID in that position -- we can still fix this in DID v2.0 (since that would just be an expansion of the current normative statement).
I agree. This is an easy fix. It was, however, discussed and rejected prompting some of what we have now.
- did:web has yet to properly define what it meant by path ... which led some to interpret it as path per RFC 3986, while others interpreted it as "some complex new way of encoding URL paths using colon syntax", which creates multiple incompatibilities with RFC 3986 when it comes to round-tripping these values.
The did:web method is currently broken for URL paths containing colons. It becomes more broken if IPv6 addresses are allowed, or when one attempts simple authentication.
The create section is clearly inadequate as well. Specific steps should be provided in addition to the handful of examples provided.
All of this would be much easier if we could all just agree that did:web's method-specific-identifier ABNF is just plain incomplete/wrong.
I don't disagree. It is, at best, an 80% solution today.
I agree. This is an easy fix. It was, however, discussed and rejected prompting some of what we have now.
Hrm, rejected by whom? Speaking as the lead DID Core spec editor, I don't recall that we've made any consensus decisions of the sort. That idea is still very much alive and well, IMHO.
The best way to address the problem in DID Core, however, is to get folks to admit it's a problem here and then bring that problem back to DID Core (with a fairly simple errata fix to the spec noting that we plan to expand didDocument.id's range to did-url in the future).
The create section is clearly inadequate as well.
Agreed.
It is, at best, an 80% solution today.
You're being generous. :)
Shipping specs with known bugs, especially ones as big as this, are a standards anti-pattern.
@dwaite wrote:
IMHO, the problem is that the usage of DID path is not restrictive enough - there isn't currently a way to differentiate a DID URL path that conforms to the described behavior of the method on DID "CRUD" operations.
Can you explain this a bit more, @dwaite?
The resource might instead be hosted content or a service, and may support additional interactions outside the DID resolution definition.
... and a bit more detail on this one as well, please?
@msporny wrote:
Shipping specs with known bugs, especially ones as big as this, are a standards anti-pattern.
Is that a reference to did:web or the core?
The only required elements for a URI are path and scheme (and even scheme can be omitted for relative references). The path element always exists, and when it is not explicitly included, it exists as a zero length string.
Given that a URI identifies a resource, and given that scheme does not identify a resource, and given that path is the only other required element, I have no idea how we conclude that path can be interpreted as identifying a secondary resource relative to the primary resource.
The best way to address the problem in DID Core, however, is to get folks to admit it's a problem here and then bring that problem back to DID Core (with a fairly simple errata fix to the spec noting that we plan to expand didDocument.id's range to did-url in the future).
I think it would be okay to change this, i.e. also allow DID URLs for "id", but I wouldn't go as far as calling it "errata". We did have some discussions about this in the WG, and I think there were also legitimate opinions for not allowing it. Found the following older issues that could be relevant:
The best way to address the problem in DID Core, however, is to get folks to admit it's a problem here and then bring that problem back to DID Core (with a fairly simple errata fix to the spec noting that we plan to expand didDocument.id's range to did-url in the future).
I think this notion would fundamentally break the semantic architecture of DIDs.
The DID Document represents the verification relationships (and methods) and service endpoints for an identifier, the DID.
The DID represents the authority part of the RFC3986 standard URL syntax. As such, the DID Document is the metadata that represents assertions by the authority for interacting with that DID, including identifiers within that DID namespace as delineated by DID URLs.
It makes for a straightforward resolution process that parallels DNS quite nicely. You have a DID or DID URL
If you allow a DID Document's ID to be for a particular DID URL, then how do you look up the DID Document for the DID part of that DID URL?
In other words, if did:ex:abc/resource1 somehow resolves to DID Document A with an id of did:ex:abc/resource1 and did:ex:abc/resource2 resolves to a different DID Document B (with did:ex:abc/resource2 as an ID), then how do we change the above algorithm? Because both DIDs will first resolve to a DID Document for did:ex:abc, whose id MUST be did:ex:abc. You haven't really fixed anything with the insistence that the DID Document's id be able to be a DID URL.
It may be that the way you are hoping to use these path parts is an instance of 'turtles all the way down'. The fact is, the turtles have to stop somewhere. That somewhere is the authority part. That's where the buck stops. That authority part presents the necessary metadata as the DID Document for that authority. It is NOT the metadata for the DID URL resource. Just the metadata for the DID.
The way we (with IIDs) use DID URLs to reference IID Resources and IID References, as defined in the DID Document, allows us to add per-resource meta-data as appropriate. This may be a better way for you to think through your solution. http://w3id.org/earth/identifiers
IMO, it would be a colossal error to violate this separation of responsibilities and allow DID Documents to have IDs that are DID URLs.
Can you describe the use case that requires this feature? What's the value-adding interaction a user would get out of this feature?
Can you describe the use case that requires this feature?
The Web. :)
What's the value-adding interaction a user would get out of this feature?
The ability to serve two different resources from the same authority -- which is what the Web does.
If you allow a DID Document's ID to be for a particular DID URL, then how do you look up the DID Document for the DID part of that DID URL?
You use a resolver and use whatever it gives back to you. Remember, DID Methods are what determine what you get back when you resolve something.
Let's take an example:
RESOLVE did:web:subject.example/people/jane
Plug that into a resolver and you might get a DID Document that looks like this:
{
"id": "did:web:subject.example/people/jane"
}
That's one subject... but try this and you might get nothing (jane is missing, it's just now a random directory on the Web):
RESOLVE did:web:subject.example/people
... but try this and you might get the authority (aka DNS domain) DID Document:
RESOLVE did:web:subject.example
{
"id": "did:web:subject.example"
}
DID Core does not allow for that fairly sane thing to happen today... that's the error we made in the DID WG.
Can you describe the use case that requires this feature?
The Web. :)
That's not a use case. That's a platform. Which already works great.
We aren't recreating the web. We are creating something different, or at least expanding the web in new directions.
Again, what's the value-added use case? What user does what to get what value?
did:web:subject.example/people/jane
Why on earth would resolving that give you a DID Document with that as the id? It wouldn't. It would give you a DID Document with did:web:subject.example as the ID.
DID URLs do not and never have resolved to DID Documents. They dereference to resources.
Now, I can, as I described in a different answer, define did:web:subject.example/people/jane so that it dereferences to a DID Document, but that DID Document is not the DID Document for that DID URL, because there is no such thing.
DIDs resolve to DID Documents. DID URLs do not.
Full stop.
That's not a use case. That's a platform. Which already works great.
We aren't recreating the web. We are creating something different, or at least expanding the web in new directions.
If I were to accept your interpretation of DID Core, then we have created something that is incompatible with large portions of the Web. :)
At this point, I expect that you haven't actually read the algorithms in the DID Web Method spec... knowing you (at some level), I expect you'd be just as confused as I am if you were to read the text in the method specific id and the Read section of the spec. You would probably see that the method specific id is defined incorrectly (as it allows two different path-abempty definitions to happen, which conflicts with DID Core in a way that cannot be round-tripped). You would probably also understand that the Read section uses an algorithm that's not round-trippable when coupled with RFC 3986. There are just factual errors there. Can you please at least confirm those two things so I know that we're at least on the same page wrt. the technical issues with the current did:web specification?
Again, what's the value-added use case? What user does what to get what value?
The ability to publish multiple DID Documents on a single DNS domain without using this weird/broken colon-path syntax that did:web uses (that is clearly broken and not round-trippable in the spec, per the comments above).
did:web:subject.example/people/jane
Why on earth would resolving that give you a DID Document with that as the id? It wouldn't. It would give you a DID Document with did:web:subject.example as the ID.
Weird, I have never thought that that's where we were going with DID Core. :P
DID URLs do not and never have resolved to DID Documents.
Where does DID Core state that DID URLs can never resolve to DID Documents?
They dereference to resources.
... and those resources might be DID Documents themselves. :)
Now, I can, as I described in a different answer, define did:web:subject.example/people/jane so that it dereferences to a DID Document, but that DID Document is not the DID Document for that DID URL, because there is no such thing.
DIDs resolve to DID Documents. DID URLs do not.
Full stop.
Citation required. :)
Here are the citations that back up my point, which is that a DID URL can be resolved to a DID Document, like Section 7.2 DID URL Dereferencing:
contentStream ... The contentStream MAY be a resource such as a DID document that is serializable in one of the conformant representations, a Verification Method, a service, or any other resource format that can be identified via a Media Type and obtained through the resolution process.
So, the DID Core spec is either internally inconsistent or tragically limiting -- the "tragically limiting" perspective says that you can use a DID URL to get a DID Document, but when you get that DID Document, the identifier isn't going to be for the resource you fetched!
Interestingly, there is a path forward here without changing anything.
5.1.1 requires the DID Subject to conform with 3.1, which in turn asserts that RFC3986 controls.
RFC3986 3.3 provides that:
A path is always defined for a URI, though the defined path may be empty (zero length).
It seems, then, that these are equivalent:
did:example:123456789abcdefghijk
did:example:123456789abcdefghijk/
That's not a use case. That's a platform. Which already works great. We aren't recreating the web. We are creating something different, or at least expanding the web in new directions.
If I were to accept your interpretation of DID Core, then we have created something that is incompatible with large portions of the Web. :)
Unfortunately, that's a hyperbolic and disingenuous response. On the one hand, of course DIDs are incompatible with large portions of the web: not a single browser supports them. On the other hand, you provide no explanation of what this means. It is an empty attack without foundation. What parts of the web are now broken?
At this point, I expect that you haven't actually read the algorithms in the DID Web Method spec... knowing you (at some level), I expect you'd be just as confused as I am if you were to read the text in the method specific id and the Read section of the spec. You would probably see that the method specific id is defined incorrectly (as it allows two different path-abempty definitions to happen, which conflicts with DID Core in a way that cannot be round-tripped). You would probably also understand that the Read section uses an algorithm that's not round-trippable when coupled with RFC 3986. There are just factual errors there. Can you please at least confirm those two things so I know that we're at least on the same page wrt. the technical issues with the current did:web specification?
That's funny. I would expect the opposite, in that none of the examples you've used use the current syntax for did:web. There are not two different path-abempty definitions. There is a path that is encoded into the method-specific-id and the path part of the DID URL itself. Note that the did:web spec defines NO path-abempty. In fact, it provides no ABNF whatsoever. So, your confusion is understandable, but it's not a problem in did-core, it's just a gap in did:web.
I understand the round-trip problems Orie is attempting to fix and to my initial analysis, he is correct that %encoding colons is the simple fix.
None of that has anything to do with did-core. Yes. We should fix did:web. But did:core has a particular and distinct differentiation between DIDs and DID URLs.
You may recall that I warned you, @talltree, and @peacekeeper that the term DID URL is going to confuse people. People will see DID URLs and expect them to be DIDs. I don't have a better term for DID URLs, but I believe your argument is an excellent example of the problem I raised back then: even an editor of the DID Core Specification is confusing the two.
Again, what's the value-added use case? What user does what to get what value?
The ability to publish multiple DID Documents on a single DNS domain without using this weird/broken colon-path syntax that did:web uses (that is clearly broken and not round-trippable in the spec, per the comments above).
I'm sorry, but that still isn't a use case. It's just a broken round-trip algorithm for a particular DID method. Once did:web fixes it with encoding, we are good to go. You may be frustrated to have to encode your colons in did:web, but that's no more relevant than the frustration I've had debugging web apps and having to figure out when to use URL encoding and when not to, especially when the different parts of URLs have different encoding rules. It's sometimes complicated. But "not doing the rigorous thing you need to do to make it work" is not itself a use case.
If you percent encode your colons, did:web works just fine.
did:web:subject.example/people/jane
Why on earth would resolving that give you a DID Document with that as the id? It wouldn't. It would give you a DID Document with did:web:subject.example as the ID.
Weird, I have never thought that that's where we were going with DID Core. :P
Weird, I would have thought you understood DID Core.
DID URLs do not and never have resolved to DID Documents.
Where does DID Core state that DID URLs can never resolve to DID Documents?
DID Core never states that DID URLs resolve to anything.
They dereference to resources.
... and those resources might be DID Documents themselves. :)
Yes. But they are not the DID documents of the DID in the DID URL. They could refer to anything, any resource. But that basically doesn't mean anything in this context. What we care about is the DID document that is returned from resolution of a DID.
There is no resolution defined for a DID URL.
Now, I can, as I described in a different answer, define did:web:subject.example/people/jane so that it dereferences to a DID Document, but that DID Document is not the DID Document for that DID URL, because there is no such thing. DIDs resolve to DID Documents. DID URLs do not. Full stop.
Citation required. :)
Here you go: In the Section 1.3 Architecture Overview https://www.w3.org/TR/did-core/#architecture-overview
DIDs are resolvable to DID documents. A DID URL extends the syntax of a basic DID to incorporate other standard URI components such as path, query, and fragment in order to locate a particular resource—for example, a cryptographic public key inside a DID document, or a resource external to the DID document.
Note that DIDs are resolvable. In contrast, DID URLs locate particular resources.
Other statements about DID URLs:
DID URL dereferencers and DID URL dereferencing A DID URL dereferencer is a system component that takes a DID URL as input and produces a resource as output. This process is called DID URL dereferencing. The process of DID URL dereferencing is elaborated upon in § 7.2 DID URL Dereferencing.
DID fragment The portion of a DID URL that follows the first hash sign character (#). DID fragment syntax is identical to URI fragment syntax.
DID path The portion of a DID URL that begins with and includes the first forward slash (/) character and ends with either a question mark (?) character, a fragment hash sign (#) character, or the end of the DID URL. DID path syntax is identical to URI path syntax. See § Path.
DID query The portion of a DID URL that follows and includes the first question mark character (?). DID query syntax is identical to URI query syntax. See § Query.
Note that fragment, path, and query are ONLY defined as part of the DID URL. Not part of the DID.
DID URL dereferencing The process that takes as its input a DID URL and a set of input metadata, and returns a resource. This resource might be a DID document plus additional metadata, a secondary resource contained within the DID document, or a resource entirely external to the DID document. The process uses DID resolution to fetch a DID document indicated by the DID contained within the DID URL. The dereferencing process can then perform additional processing on the DID document to return the dereferenced resource indicated by the DID URL. The inputs and outputs of this process are defined in § 7.2 DID URL Dereferencing.
Section 3.2 DID URL syntax https://www.w3.org/TR/did-core/#did-url-syntax
A DID URL is a network location identifier for a specific resource. It can be used to retrieve things like representations of DID subjects, verification methods, services, specific parts of a DID document, or other resources.
Section 3.2.1 DID Parameters
Adding a DID parameter to a DID URL means that the parameter becomes part of the identifier for a resource.
Note that the parameter affects the resource identifier, the DID URL, not the DID.
After an exhaustive search through the spec, only DIDs "resolve". DID URLs are "dereferenced".
Here are the citations that back up my point, which is that a DID URL can be resolved to a DID Document, like Section 7.2 DID URL Dereferencing:
This is not a statement about resolution. This is a statement about dereferencing. I think we are all agreed that a DID URL dereferences to a specific resource. It doesn't resolve to that resource. Rather, the DID part of the DID URL is resolved to a DID Document which can then be used to dereference to the actual resource:
This process depends on DID resolution of the DID contained in the DID URL.
It's the DID that is resolved. Not the DID URL.
contentStream ... The contentStream MAY be a resource such as a DID document that is serializable in one of the conformant representations, a Verification Method, a service, or any other resource format that can be identified via a Media Type and obtained through the resolution process.
So, the DID Core spec is either internally inconsistent or tragically limiting -- the "tragically limiting" perspective says that you can use a DID URL to get a DID Document, but when you get that DID Document, the identifier isn't going to be for the resource you fetched!
It is neither.
The DID Core spec is exceptionally consistent on this issue. The problem is the unfortunate conflation of DIDs & DID URLs on the one hand and resolving & dereferencing on the other. Fortunately, the specification is consistent on this, but it is, understandably, a challenge to keep all of this straight.
To return to your initial example did:web:subject.example/people/jane, I expect that you likely mis-generated the did:web DID because you are conflating the path encoded in the method-specific-id of did:web with the path part in the DID URL.
That DID URL has the following parts
{
scheme : "did",
method : "web",
method-specific-id : "subject.example",
path : "/people/jane"
}
The DID for this DID URL is did:web:subject.example, which will resolve to a DID document at https://subject.example/.well-known/did.json.
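To make the split concrete, here is a rough Python sketch of how a DID URL breaks into its DID and its path, query, and fragment, following the DID Core definitions quoted above. The function name split_did_url and the dictionary keys are my own illustration, not anything defined by DID Core or did:web, and the sketch does no validation beyond checking the scheme.

```python
def split_did_url(did_url: str) -> dict:
    """Split a DID URL into its DID plus optional path, query, and fragment.

    Hypothetical helper for illustration only; it does not validate
    against the DID Core ABNF.
    """
    # The fragment is everything after the first '#'.
    rest, _, fragment = did_url.partition("#")
    # The query is everything after the first '?'.
    rest, _, query = rest.partition("?")
    # The path begins with, and includes, the first '/'.
    slash = rest.find("/")
    if slash == -1:
        did, path = rest, ""
    else:
        did, path = rest[:slash], rest[slash:]
    # What remains is the DID: scheme, method, method-specific-id.
    scheme, method, method_specific_id = did.split(":", 2)
    if scheme != "did":
        raise ValueError("not a DID URL: " + did_url)
    return {
        "did": did,
        "method": method,
        "method_specific_id": method_specific_id,
        "path": path,
        "query": query,
        "fragment": fragment,
    }


parts = split_did_url("did:web:subject.example/people/jane")
print(parts["did"])   # the DID that gets resolved
print(parts["path"])  # the DID URL path, which is not part of the DID
```

Only the "did" value is handed to resolution; the path, query, and fragment only come into play during dereferencing.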
However, what I think you probably wanted to do was to resolve to a DID document at https://subject.example/people/jane/did.json. To achieve that result, the DID would be did:web:subject.example:people:jane.
See Example 4 https://w3c-ccg.github.io/did-method-web/#example-creating-the-did-with-optional-path as well as Section 2.5.4 Optional Path Considerations https://w3c-ccg.github.io/did-method-web/#optional-path-considerations
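For anyone following along, this is roughly how I read the current did:web mapping from a DID to the HTTPS URL of its DID document. It is a minimal sketch, not the spec's normative algorithm: the function name did_web_to_url is made up, and it handles only the colon-to-slash replacement, the percent-decoded port in the host, and the .well-known fallback.

```python
from urllib.parse import unquote


def did_web_to_url(did: str) -> str:
    """Map a did:web DID to the HTTPS URL of its DID document.

    Hypothetical sketch of the mapping described in the did:web spec;
    error handling and validation are intentionally minimal.
    """
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web DID: " + did)
    # Colons in the method-specific-id separate the host from path segments.
    segments = did[len(prefix):].split(":")
    # A port is carried as a percent-encoded colon in the host segment.
    host = unquote(segments[0])
    path_segments = segments[1:]
    if path_segments:
        return "https://" + host + "/" + "/".join(path_segments) + "/did.json"
    # With no path, the document lives under /.well-known.
    return "https://" + host + "/.well-known/did.json"


print(did_web_to_url("did:web:subject.example"))
print(did_web_to_url("did:web:subject.example:people:jane"))
```

Note that the input here is always a DID. A DID URL such as did:web:subject.example/people/jane would first have to be reduced to its DID before this mapping applies.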
The resource referred to by did:web:subject.example/people/jane is ambiguous. Hence my earlier comment that maybe you aren't familiar with the current DID generation algorithm in did:web. The did:web specification is completely silent on how to interpret the path part in a did:web DID URL. It does state clearly how to decode the method-specific-id to get a path for retrieving the DID Document, but is completely silent on how a path in the DID URL should be interpreted.
My advocacy is to use the linkedResource property from IIDs and did:cosmos. Paths in did:cosmos DID URLs refer to resources defined in a linkedResource section. However, this is an exceptionally new property. I've been hoping to get a demonstrable implementation in place before adding it to the DID Spec Registries, but it is in use in the IID spec and did:cosmos and I think the IID Reference and IID Resource approach taken by the IID spec is a superior pattern for avoiding the type of confusion this GitHub issue has illuminated.
It is also worth noting that colons are already escaped in the did:web method-specific-id for port specification in the authority part of the encoded web URL. See Example 5 https://w3c-ccg.github.io/did-method-web/#example-creating-the-did-with-optional-path-and-port
So, Orie's solution is minimal, effective, and in line with existing processes for other restricted characters. It's unfortunate that the percent encoding was restricted to the colon used for port specification, but it's an easy fix, which this PR makes.
No changes needed to did-core. Just some upskilling on the distinctions between DIDs / DID URLs and resolving / dereferencing.
Please move "changes to resolution discussions" to issues, and keep this PR focused on adding percent encoding.
Is this moving forward?
The reason we need to interpolate the path into the method specific identifier with colons, which then requires percent-encoding in some cases, is because core violates RFC3986 by asserting that a URI path cannot be used to identify the primary resource.
We would be better off recognizing that as errata and leaving the handling of path under the control of the method, including the decision of whether to use it to identify the primary resource.
@gribneau This PR has not moved forward... that's why I am trying to help it along.
Let's not conflate rule changes to percent encoding of port with rule changes for percent encoding of path...
Let's continue to discuss the path issue, on a separate issue, so we can reduce the complexity of this PR (and a subsequent PR that might be raised for path).
I suggest you comment on DID Core repo regarding errata for that spec, feel free to cross link here.
Looking at the amount of pushback to this PR (with regards to percent-decoding the path part), and the fact that it's addressing a very niche case (not actually a problem, in other words), I'd like to close it. We can continue the path discussion in issue https://github.com/w3c-ccg/did-method-web/issues/52
Extracted from the discussion in PR https://github.com/w3c-ccg/did-method-web/pull/47.
The current spec contains an edge case that prevents round-trip conversion between did:web URIs and HTTPS URIs.
Currently:
With this PR:
Note that using : characters in your web URL paths is not at all recommended by the authors of this spec. However, since it is allowed by current URL rules, it's important that we address this corner case.
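As an illustration of the corner case (not the examples from the PR itself), here is a small Python sketch comparing a plain colon join with a variant that percent-encodes each path segment before joining. All four function names are hypothetical. The encoded variant uses urllib.parse.quote with safe="", which also encodes a literal % as %25; that choice is my assumption for keeping the mapping reversible and is not something the PR text confirms.

```python
from urllib.parse import quote, unquote


def to_id_naive(host: str, segments: list) -> str:
    # Current behaviour as I read it: path segments joined with ':' unchanged.
    return ":".join([host] + segments)


def from_id_naive(msid: str) -> tuple:
    host, *segments = msid.split(":")
    return host, segments


def to_id_encoded(host: str, segments: list) -> str:
    # Assumed fix: percent-encode each segment so ':' becomes %3A.
    return ":".join([host] + [quote(s, safe="") for s in segments])


def from_id_encoded(msid: str) -> tuple:
    host, *segments = msid.split(":")
    return host, [unquote(s) for s in segments]


# A URL path whose first segment contains a colon: /a:b/c
host, segments = "example.com", ["a:b", "c"]

# The naive join loses the segment boundary on the way back.
print(from_id_naive(to_id_naive(host, segments)))
# The encoded join recovers the original segments.
print(from_id_encoded(to_id_encoded(host, segments)))
```

The naive mapping turns /a:b/c into an identifier that decodes as three segments, while the encoded mapping keeps the colon inside its segment and round-trips cleanly.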