w3c / did-core

W3C Decentralized Identifier Specification v1.0
https://www.w3.org/TR/did-core/

Change the terminology from "DID URL" to "DID Locator" #218

Closed iherman closed 4 years ago

iherman commented 4 years ago

(This is a spin-off of issue #183.)

The current usage of the term "URL" may lead to problems when confronted with the world of browsers. Quoting from https://github.com/w3c/did-core/issues/183#issuecomment-587018974:

I would not shy away from some bike shedding on whether the term “DID URL” is indeed the right term. I know URL has a clear meaning in IETF land but, alas!, it has lost this clear meaning on Web land: by now the standard reference to URLs in W3C specifications is the WhatWG URL Living Standard. (Let us not go into a discussion whether this is a good thing or a bad thing. We should take it as a fact of life.) The URL Living Standard defines parsing rules that, if one looks at them more closely, are in fact parsing rules for URIs and not (only) URLs. I have not checked the latest versions of the WhatWG based libraries (say, in node.js), but I would expect (I would hope!) that they would parse DID URLs as well as DIDs properly. But, if so, the term “URL” has become ambiguous and we still have the opportunity to steer clear of a possible confusion of terms that may bite us later.

and from https://github.com/w3c/did-core/issues/183#issuecomment-587368515:

Well... the discrepancy is bigger than I thought. There is an online viewer for whatwg-url at https://jsdom.github.io/whatwg-url/. Unfortunately, though DIDs and DID URLs are parsed, the result is not exactly what we would expect. See, for example

  • DID example: the protocol (did) is recognized, but the method plus the method-specific identifier is considered as a 'path'.
  • DID with path, query and fragment: query and fragment properly recognized, path is merged with the method-specific identifier.
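The two observations above can be reproduced in rough form with a generic URI splitter. The following is a minimal sketch, assuming Python's standard `urllib.parse.urlsplit`, which is not the WhatWG parser behind the online viewer but splits along similar generic lines; the helper name `describe` and the `https://example.com/...` comparison URL are illustrative assumptions, not part of the thread.

```python
from urllib.parse import urlsplit


def describe(value: str) -> dict:
    """Split a URI into generic components (hypothetical helper for illustration)."""
    parts = urlsplit(value)
    return {
        "scheme": parts.scheme,
        "netloc": parts.netloc,      # the authority; only filled when "//" follows the scheme
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# A DID: the scheme is recognized, everything else lands in "path".
naked = describe("did:example:123456")

# A DID with path, query and fragment: query and fragment are separated,
# but the path is merged with the method and method-specific identifier.
did_url = describe("did:example:123/path?query#fragment")

# An authority-based URL for comparison.
http_url = describe("https://example.com/path?query#fragment")

for label, result in (("naked", naked), ("did_url", did_url), ("http_url", http_url)):
    print(label, result)
```

A generic splitter has no notion of a DID method or a method-specific identifier, so a DID-aware consumer would have to split the `path` value again on its own.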

Looking at this reinforces my feeling that we should not call this a “DID URL”, i.e., we should keep away from the “URL” term. The change can be as simple as calling it a “DID Locator” instead.

See also https://github.com/w3c/did-core/issues/183#issuecomment-593046902, https://github.com/w3c/did-core/issues/183#issuecomment-593415526, https://github.com/w3c/did-core/issues/183#issuecomment-593462004.

@peacekeeper @talltree @msporny

talltree commented 4 years ago

I am in favor of switching to the term DID Locator for the reasons @iherman explains.

peacekeeper commented 4 years ago

I think it would be very unfortunate if we had to abandon the DID URL term. Everybody understands that URLs are URIs that can be dereferenced. I think introducing a new term "DID Locator" would be very confusing.

But as I said before, I'm not familiar with the WhatWG work and W3C politics, so if @iherman and others believe this change is necessary, I'd defer to them.

OR13 commented 4 years ago

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Identifying_resources_on_the_Web

It would seem that we are talking about URNs. Why not use the term DID URN? Isn't it closer to expectations?

talltree commented 4 years ago

@OR13 While a naked DID by itself functionally serves as a URN, as soon as you add other components (other than matrix parameters), the assertion of persistence is no longer assumed. For example, if you add a path to a DID, the resource identified by that path can change, so the full DID + path (that we have been calling a DID URL) no longer has the persistence of a URN.

I am very torn about this issue because I agree with @peacekeeper that the term "DID URL" is very intuitive, even if the WhatWG standard parsing does not produce the same component outputs as conventional authority-based (//) URLs.

@iherman, can we toss this back to you to consider the comments on this thread and weigh in with your recommendation?

OR13 commented 4 years ago

https://en.wikipedia.org/wiki/Uniform_Resource_Identifier

a URL is a type of URI that identifies a resource via a representation of its primary access mechanism (e.g., its network "location")

Does this "primary access mechanism" part make sense?

Feels like maybe it's better to call these things DID URIs, and not assert anything about networks or access.

https://jsdom.github.io/whatwg-url/#url=IGRpZDpleGFtcGxlOjEyMy9wYXRoP3F1ZXJ5I2ZyYWdtZW50&base=YWJvdXQ6Ymxhbms=

^ I'm not sure the parsing of this is bad enough to warrant a name change.

talltree commented 4 years ago

@OR13 The problem is that a naked DID is already a URI. The distinction we're trying to draw is between a naked DID—that serves as a pure URN (which is of course a type of URI)—and a DID that includes anything else that turns it into a locator. So far the term we have used for that distinction is to call the latter a DID URL.

iherman commented 4 years ago

@talltree @peacekeeper @OR13 to make it clear, I am also torn on this. And I can very well understand why we use the term DID URL so far, as @talltree just commented in https://github.com/w3c/did-core/issues/218#issuecomment-598553939. It makes sense, and it is in line with the formal specs. No doubt about that.

But... just as an example, look at the MDN text @OR13 referred to:

The most common form of URI is the Uniform Resource Locator (URL), which is known as the web address.

and the section goes on to essentially equate the term URL with HTTP URL. Yes, this is wrong, but that is how the Web community sees things, I am afraid; that ship has sailed. Our resources, of course, can be Web resources, but they can be (in my understanding) locators to "things" that are not on the "Web" but, say, on a blockchain, right? So we are not in line with the MDN view of the world.

My fear, clearly, is that we would create more confusion than needed.

I would love @philarcher to chime in on this, too. He knows more about UR*-s than I will ever do...

talltree commented 4 years ago

Thanks Ivan. +1 for @philarcher to weigh in on this terminology question.

philarcher commented 4 years ago

Oh goodness, the weight on the shoulders...

TL;DR - I prefer DID Locator for the reasons given by others.

I have more or less forced the use of the term URI in GS1, against significant and well-founded opposition. My reasoning being that, personally, I use the term URI when I want to emphasise that the string of characters, first and foremost, is an identifier and that role is independent of any network or process. In that sense, of course, DIDs as URIs is 100% consistent (as well as being factually correct). The opposition I got was simply that Web devs never use the term URI, they assume it's a typo, and say "look, it's a URL, it's got http:// and all the rest of it. We know how this works, we understand it, don't confuse things with your fancy URI speak."

Whatever the formal truth, whatever the rights, wrongs, justice and injustice in the world, in practical terms, all URLs begin with the letters http. If matrix parameters survive the current debate, then the disjunction between a DID URL and what most people think of as URLs is even greater.

That doesn't mean we can't use URLs as a reference in the text. We can say that DID resolvers use paths, queries and fragment IDs in ways that match the way they're used in URLs - i.e. leverage the familiarity to enhance understanding. But the term URL is tightly bound to HTTP in the psyche of many (including yours truly).

The barrier to entry - by which I mean the barrier to understanding SSI, DIDs and all the rest of it - is high (I'm no more than about a third of the way up that barrier myself). It becomes higher still if we use a familiar term with familiar behaviours to mean something other than people already understand by that term.

Incidentally, if the WG does decide to use the term DID Locator, I expect that it will quickly be abbreviated to DID-Loc. Thus we might look forward to an ad-hoc DID Doc found at a specified DID Loc.

iherman commented 4 years ago

Oh goodness, the weight on the shoulders...

😁

selfissued commented 4 years ago

I would be opposed to changing terminology from DID URL to DID Locator. People know what a URL is - a dereferenceable URI. They won't know what a Locator is.

Let's not invent new terminology when there's already standard terminology that applies.

iherman commented 4 years ago

People know what a URL is - a dereferenceable URI.

I am sorry, but I do not agree with that statement. For many people, URL is an HTTP(S) URL and only that, and they do not even have any idea what a URI is.

selfissued commented 4 years ago

For many people, URL is an HTTP(S) URL and only that, and they do not even have any idea what a URI is.

Even fewer people will know what a Locator is.

OR13 commented 4 years ago

I'd prefer the term DID URI, for most of the reasons @philarcher mentioned... I'm not sufficiently convinced that did:example:123... will be dereferenceable any more than https://www.nytimes.com/ is currently in China....

Consider especially the case where the did method is built on IPFS / Bitcoin / Not Allowed on My Network Software....

I'm not trying to pick on China... My point is that "dereferenceable" is not really universal... and since most of the cases where it works are network related... we should expect that to worsen if we start building on untrusted network software....

We are mostly talking about identifiers that a very limited set of people know what to do with... URI is close enough to URL to encourage people to learn the difference, we won't fix a bad name choice, by encouraging further abuse :)

See also: https://www.w3.org/TR/cooluris/

Of course the concept of support for the semantic web might no longer exist here, and certainly we're sure to see DID Methods that are not represented in Linked Data format now that we have opened that door...

iherman commented 4 years ago

@OR13, the DID, i.e., did:example:123456, is indeed an identifier, so it is a URI, just like a URN or identifiers in other, less well-known (but official) schemes such as info:... or data://. I.e., saying DID URI does not add any information to all this. (And using the term DID URI is actually correct as of today.)

The problem is the "thing" of the form did:example:123456/some/path#and_fragment. That is indeed considered to be dereferenceable, and in this respect it is distinct (it has a very different behavior!) from a pure DID. The naming of this "thing" is the question.

We could go as far as asking ourselves whether all those structures are indeed necessary in practice, but I leave this discussion to those who know more about DIDs than I do.

iherman commented 4 years ago

I have a very early draft for a presentation on DIDs, where I have two slides on URIs and DIDs in that respect:

The second neatly shows how a (pure) DID fits the overall picture of URIs. At the moment I have difficulty fitting the DID Locators/URLs into the very same framework, though.

OR13 commented 4 years ago

I'm not sure I could write a PR based on the current thread... and I would expect it to not reach consensus based on what I am reading.

Which is preferred?

  1. DID URL
  2. DID URI
  3. DID Locator

I prefer them in the order I wrote them.

iherman commented 4 years ago

@OR13

Among the three options I believe "DID URI" is wrong, i.e., not a valid option. The current DID (did:ex:123456) is a URI, and that is great. What we are talking about is not that URI but the "thing" of the form did:ex:123456/some/path?with=query#fragment. That is a different animal than the DID URI.

I believe that, looking back at the thread, there is no real consensus on what the decision should be, i.e., no PR is possible...

talltree commented 4 years ago

@iherman is right that the term we are talking about in this issue is NOT the term for a "naked DID". A naked DID is by definition a URI since our charter requires ANY identifier using the DID scheme to be compatible with the URI spec, RFC 3986. So we MUST use the term "DID URI" as our umbrella term, i.e., the term that refers to BOTH a naked DID and a DID that has additional components allowed by RFC 3986.

To illustrate the difference between these two types of DID URIs, first, here's an example of a naked DID:

did:example:12345678abcd

Here are examples of a naked DID + some additional component(s) allowed under RFC 3986 (all "rooted" in the naked DID above):

did:example:12345678abcd/path
did:example:12345678abcd#fragment
did:example:12345678abcd?query
did:example:12345678abcd/some/path#fragment

The reason that the WhatWG URL parser does not recognize and parse these the same way as ordinary HTTP/S URLs is that they don't have // after the scheme name.
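The distinction between a naked DID and a DID carrying additional RFC 3986 components can be sketched in a few lines. This is only an illustration under stated assumptions: the regular expression is a simplification rather than the DID ABNF from the spec, and the names `split_did_url`, `DidParts` and `is_naked` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern (an assumption, not the spec's ABNF):
# "did:" + method name + ":" + method-specific id, ending at the first "/", "?" or "#",
# followed by an optional path, query and fragment.
_DID_URL = re.compile(
    r"^(?P<did>did:[a-z0-9]+:[^/?#]+)"
    r"(?P<path>/[^?#]*)?"
    r"(?:\?(?P<query>[^#]*))?"
    r"(?:#(?P<fragment>.*))?$"
)


class DidParts(NamedTuple):
    did: str                  # the naked DID the identifier is rooted in
    path: Optional[str]
    query: Optional[str]
    fragment: Optional[str]

    @property
    def is_naked(self) -> bool:
        """True when no path, query or fragment is attached."""
        return self.path is None and self.query is None and self.fragment is None


def split_did_url(value: str) -> DidParts:
    """Separate the naked DID from any additional components."""
    match = _DID_URL.match(value)
    if match is None:
        raise ValueError(f"not a DID or DID URL: {value!r}")
    return DidParts(match["did"], match["path"], match["query"], match["fragment"])


for example in (
    "did:example:12345678abcd",
    "did:example:12345678abcd/path",
    "did:example:12345678abcd#fragment",
    "did:example:12345678abcd?query",
    "did:example:12345678abcd/some/path#fragment",
):
    parts = split_did_url(example)
    print(example, "->", parts, "naked" if parts.is_naked else "has components")
```

All five examples resolve to the same naked DID; only the first one has `is_naked` set to true.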

So it seems we sit on the horns of a dilemma:

  1. We continue to use the term "DID URL" and run the risk that developers will try to treat it (parse it) like a conventional HTTP/S URL.
  2. We introduce a new term like "DID Locator" and run the risk that developers and others will just not understand what we are talking about.

I must admit that, of these two, I actually think the second is the bigger risk. Here's why:

So my gut is that we should:

  1. Stick with the term "DID URL".
  2. Add a full callout note to the spec that explains what @iherman explains in his opening post, i.e., that a DID URL does not parse as a developer might expect.

philarcher commented 4 years ago

Almost as an aside, I've heard TimBL field the question "if you could go back in time, what would you do differently". The answer I heard was that he regrets the inclusion of both : and // after the scheme. Either one would have been sufficient, both is overkill.

iherman commented 4 years ago

At this point, I have the impression that we're repeating the same arguments all along. And I also believe that this is not a technical nor overly important issue; the real issue was a clearer separation of the DID ("naked DID", as @talltree called it) and the DID URL/Locator. This separation is now done in the spec.

I would propose that the chairs (cc: @burnburn @brentzundel) call a vote, e.g., on one of the upcoming calls or through other means (e.g., mail, GitHub, whatever), and we go with the results. I do not believe that this issue is an essential technical one, meaning that I do not think unanimity should be required (put another way, somebody voting -1 should not be seen as a formal objection) and that a simple majority of the votes would be enough to carry it.

talltree commented 4 years ago

Since a thumbs-up on his post doesn't feel like enough, let me second @iherman's suggestion that we just ask the chairs to hold a vote (or conduct a one-person-one-vote poll, as that would make it easier to reach the full WG) to close this issue. That's often the only way to close terminology issues like this.

jonnycrunch commented 4 years ago

@iherman

I have a very early draft for a presentation on DIDs, where I have two slides on URIs and DIDs in that respect:

I would just add that I believe a DID is a subset of a URN and not a URI in your venn diagram.

kdenhartog commented 4 years ago

I have a very early draft for a presentation on DIDs, where I have two slides on URIs and DIDs in that respect:

I would just add that I believe a DID is a subset of a URN and not a URI in your venn diagram.

URI is correct. A URN must begin with the scheme "urn" according to the ABNF in RFC 8141.
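As a rough illustration of that point, the check below tests only the leading `urn:` scheme and the shape of the namespace identifier (NID). It is a sketch, not a full RFC 8141 validator: the namespace-specific string is accepted without further checks, and the function name `looks_like_urn` is an assumption made for this example.

```python
import re

# Partial check modelled on RFC 8141: "urn" ":" NID ":" NSS, where the NID is
# 2 to 32 characters, alphanumeric at both ends, with letters, digits or
# hyphens in between. The NSS is not validated here (simplifying assumption).
_URN = re.compile(r"^urn:[a-z0-9][a-z0-9-]{0,30}[a-z0-9]:.+$", re.IGNORECASE)


def looks_like_urn(value: str) -> bool:
    """Return True when the string starts like a URN."""
    return _URN.match(value) is not None


print(looks_like_urn("urn:isbn:0451450523"))   # a URN in the "isbn" namespace
print(looks_like_urn("did:example:123456"))    # a DID: its scheme is "did", not "urn"
```

A DID can behave like a persistent name, but syntactically it is a URI with its own scheme rather than a member of the `urn:` scheme.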

iherman commented 4 years ago

This issue was discussed in a meeting.