Change the terminology from "DID URL" to "DID Locator"

iherman commented 4 years ago

(This is a spin-off of issue #183.)

The current usage of the term "URL" may lead to problems when confronted with the world of browsers. Quoting from https://github.com/w3c/did-core/issues/183#issuecomment-587018974

I would not shy away from some bike shedding on whether the term “DID URL” is indeed the right term. I know URL has a clear meaning in IETF land but, alas!, it has lost this clear meaning on Web land: by now the standard reference to URLs in W3C specifications is the WhatWG URL Living Standard. (Let us not go into a discussion whether this is a good thing or a bad thing. We should take it as a fact of life.) The URL Living Standard defines parsing rules that, if one looks at them more closely, are in fact parsing rules for URIs and not (only) URLs. I have not checked the latest versions of the WhatWG based libraries (say, in node.js), but I would expect (I would hope!) that they would parse DID URLs as well as DIDs properly. But, if so, the term “URL” has become ambiguous and we still have the opportunity to steer clear of a possible confusion over terms that may bite us later.

and from https://github.com/w3c/did-core/issues/183#issuecomment-587368515:

Well... the discrepancy is bigger than I thought. There is an online viewer for whatwg-url at https://jsdom.github.io/whatwg-url/. Unfortunately, though DIDs and DID URLs are parsed, the result is not exactly what we would expect. See, for example

DID example: the protocol (did) is recognized, but the method plus the method-specific identifier is considered as a 'path'.

DID with path, query and fragment: query and fragment properly recognized, path is merged with the method-specific identifier.
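
The same effect can be reproduced outside the online viewer. The sketch below uses Python's `urllib.parse.urlsplit`, which is a generic URI splitter and not the WhatWG parser, so it only approximates what whatwg-url reports; the DID string is an illustrative placeholder.

```python
from urllib.parse import urlsplit

# Illustrative DID with path, query and fragment; "example" is a placeholder method.
parts = urlsplit("did:example:123/path?query#fragment")

print(parts.scheme)    # the "did" scheme is recognized
print(parts.path)      # method, method-specific identifier and path are merged
print(parts.query)     # query is split off as expected
print(parts.fragment)  # fragment is split off as expected
```

Nothing in the result separates the DID itself from the path that follows it, which is the discrepancy described above.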

Looking at this reinforces my feeling that we should not call this a “DID URL”, i.e., we should keep away from the “URL” term. The change can be as simple as calling it a “DID Locator” instead.

@peacekeeper @talltree @msporny

talltree commented 4 years ago

I am in favor of switching to the term DID Locator for the reasons @iherman explains.

peacekeeper commented 4 years ago

I think it would be very unfortunate if we had to abandon the DID URL term. Everybody understands that URLs are URIs that can be dereferenced. I think introducing a new term "DID Locator" would be very confusing.

But as I said before I'm not familiar with the WhatWG work and W3C politics, so if @iherman and others believe this change is necessary I'd defer to them.

OR13 commented 4 years ago

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Identifying_resources_on_the_Web

It would seem that we are talking about URNs. Why not use the term DID URN? Isn't it more accurate to expectations?

https://tools.ietf.org/id/draft-seantek-certspec-02.html

talltree commented 4 years ago

@OR13 While a naked DID by itself functionally serves as a URN, as soon as you add other components (other than matrix parameters), the assertion of persistence is no longer assumed. For example, if you add a path to a DID, the resource identified by that path can change, so the full DID + path (that we have been calling a DID URL) no longer has the persistence of a URN.

I am very torn about this issue because I agree with @peacekeeper that the term "DID URL" is very intuitive, even if the WhatWG standard parsing does not produce the same component outputs as conventional authority-based (//) URLs.

@iherman, can we toss this back to you to consider the comments on this thread and weigh in with your recommendation?

OR13 commented 4 years ago

https://en.wikipedia.org/wiki/Uniform_Resource_Identifier

a URL is a type of URI that identifies a resource via a representation of its primary access mechanism (e.g., its network "location")

Does this "primary access mechanism" part make sense?

Feels like maybe it's better to call these things DID URIs, and not assert anything about networks or access.

https://jsdom.github.io/whatwg-url/#url=IGRpZDpleGFtcGxlOjEyMy9wYXRoP3F1ZXJ5I2ZyYWdtZW50&base=YWJvdXQ6Ymxhbms=

^ I'm not sure the parsing of this is bad enough to warrant a name change.

talltree commented 4 years ago

@OR13 The problem is that a naked DID is already a URI. The distinction we're trying to draw is between a naked DID—that serves as a pure URN (which is of course a type of URI)—and a DID that includes anything else that turns it into a locator. So far the term we have used for that distinction is to call the latter a DID URL.

iherman commented 4 years ago

@talltree @peacekeeper @OR13 to make it clear, I am also torn on this. And I can very well understand why we use the term DID URL so far, as @talltree just commented in https://github.com/w3c/did-core/issues/218#issuecomment-598553939. It makes sense, and it is in line with the formal specs. No doubt about that.

But... just as an example, look at the MDN text @OR13 referred to:

The most common form of URI is the Uniform Resource Locator (URL), which is known as the web address.

and the section goes on essentially equating the term URL with HTTP URL. Yes, this is wrong, but that is how the Web community sees things, I am afraid; that ship has sailed. Our resources, of course, can be Web resources, but they can be (in my understanding) locators to "things" that are not on the "Web" but, say, on a blockchain, right? So we are not in line with the MDN view of the world.

My fear, clearly, is that we would create more confusion than needed.

I would love @philarcher to chime in on this, too. He knows more about UR*-s than I will ever do...

talltree commented 4 years ago

Thanks Ivan. +1 for @philarcher to weigh in on this terminology question.

philarcher commented 4 years ago

Oh goodness, the weight on the shoulders...

TL;DR - I prefer DID Locator for the reasons given by others.

I have more or less forced the use of the term URI in GS1, against significant and well-founded opposition. My reasoning being that, personally, I use the term URI when I want to emphasise that the string of characters, first and foremost, is an identifier and that role is independent of any network or process. In that sense, of course, DIDs as URIs is 100% consistent (as well as being factually correct). The opposition I got was simply that Web devs never use the term URI, they assume it's a typo, and say "look, it's a URL, it's got http:// and all the rest of it. We know how this works, we understand it, don't confuse things with your fancy URI speak."

Whatever the formal truth, whatever the rights, wrongs, justice and injustice in the world, in practical terms, all URLs begin with the letters http. If matrix parameters survive the current debate, then the disjunction between a DID URL and what most people think of as URLs is even greater.

That doesn't mean we can't use URLs as a reference in the text. We can say that DID resolvers use paths, queries and fragment IDs in ways that match the way they're used in URLs - i.e. leverage the familiarity to enhance understanding. But the term URL is tightly bound to HTTP in the psyche of many (including yours truly).

The barrier to entry - by which I mean the barrier to understanding SSI, DIDs and all the rest of it - is high (I'm no more than about a third of the way up that barrier myself). It becomes higher still if we use a familiar term with familiar behaviours to mean something other than people already understand by that term.

Incidentally, if the WG does decide to use the term DID Locator, I expect that it will quickly be abbreviated to DID-Loc. Thus we might look forward to an ad-hoc DID Doc found at a specified DID Loc.

iherman commented 4 years ago

Oh goodness, the weight on the shoulders...

😁

selfissued commented 4 years ago

I would be opposed to changing terminology from DID URL to DID Locator. People know what a URL is - a dereferenceable URI. They won't know what a Locator is.

Let's not invent new terminology when there's already standard terminology that applies.

iherman commented 4 years ago

People know what a URL is - a dereferenceable URI.

I am sorry, but I do not agree with that statement. For many people, URL is an HTTP(S) URL and only that, and they do not even have any idea what a URI is.

selfissued commented 4 years ago

For many people, URL is an HTTP(S) URL and only that, and they do not even have any idea what a URI is.

Even fewer people will know what a Locator is.

OR13 commented 4 years ago

I'd prefer the term DID URI, for most of the reasons @philarcher mentioned... I'm not sufficiently convinced that did:example:123... will be dereferenceable any more than https://www.nytimes.com/ is currently in China....

Consider especially the case where the did method is built on IPFS / Bitcoin / Not Allowed on My Network Software....

I'm not trying to pick on China... My point is that "dereferenceable" is not really universal... and since most of the cases where it works are network related... we should expect that to worsen if we start building on untrusted network software....

We are mostly talking about identifiers that a very limited set of people know what to do with... URI is close enough to URL to encourage people to learn the difference, we won't fix a bad name choice, by encouraging further abuse :)

Of course the concept of support for the semantic web might no longer exist here, and certainly we're sure to see DID Methods that are not represented in Linked Data format now that we have opened that door...

iherman commented 4 years ago

@OR13, the DID, i.e., did:example:123456 is indeed an identifier, so it is a URI, just like a URN or other, less known (but official) schemes such as info:... or data:... are. I.e., saying DID URI does not add any info to all this. (And using the term DID URI is actually correct as of today.)

The problem is the "thing" of the form did:example:123456/some/path#and_fragment. That is considered to be dereferenceable indeed, and in this respect it is a distinct scheme (it has a very different behavior!) from a pure DID. The naming of this "thing" is the question.

We could go as far as asking ourselves whether all those structures are indeed necessary or not in practice, but I leave this discussion to those who know more about DIDs than I do.

iherman commented 4 years ago

I have a very early draft for a presentation on DIDs, where I have two slides on URIs and DIDs in that respect:

The second neatly shows how a (pure) DID fits the overall picture of URIs. At the moment I have difficulty fitting the DID Locators/URLs into the very same framework, though.

OR13 commented 4 years ago

I'm not sure I could write a PR based on the current thread... and I would expect it to not reach consensus based on what I am reading.

Which is preferred?

DID URL
DID URI
DID Locator

I prefer them in the order I wrote them.

iherman commented 4 years ago

@OR13

Among the three options I believe "DID URI" is wrong, ie, not a valid option. The current DID (did:ex:123456) is a URI, and that is great. What we are talking about is not that URI but the "thing" of the form did:ex:123456/some/path?with=query#fragment. That is a different animal than the DID URI.

I believe that, looking back at the thread, there is no real consensus on what the decision should be, i.e., no PR is possible...

talltree commented 4 years ago

@iherman is right that the term we are talking about in this issue is NOT the term for a "naked DID". A naked DID is by definition a URI since our charter requires ANY identifier using the DID scheme to be compatible with the URI spec, RFC 3986. So we MUST use the term "DID URI" as our umbrella term, i.e., the term that refers to BOTH a naked DID and a DID that has additional components allowed by RFC 3986.

To illustrate the difference between these two types of DID URIs, first, here's an example of a naked DID:

did:example:12345678abcd

Here are examples of a naked DID + some additional component(s) allowed under RFC 3986 (all "rooted" in the naked DID above):

did:example:12345678abcd/path
did:example:12345678abcd#fragment
did:example:12345678abcd?query
did:example:12345678abcd/some/path#fragment
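
The distinction between a naked DID and a DID with additional components can be sketched in code. The helpers below (`split_did_url`, `is_naked_did`) and the regular expression are hypothetical and deliberately simplified; they are not taken from the DID specification, whose ABNF is stricter.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern, assumed for illustration only: a DID, then optional
# path, query and fragment in the RFC 3986 order.
_DID_URL = re.compile(
    r"^(?P<did>did:[a-z0-9]+:[^/?#]+)"
    r"(?P<path>/[^?#]*)?"
    r"(?:\?(?P<query>[^#]*))?"
    r"(?:#(?P<fragment>.*))?$"
)


class DidUrlParts(NamedTuple):
    did: str
    path: Optional[str]
    query: Optional[str]
    fragment: Optional[str]


def split_did_url(value: str) -> DidUrlParts:
    """Split a string into the naked DID and its optional components."""
    match = _DID_URL.match(value)
    if match is None:
        raise ValueError(f"not a DID or DID URL: {value!r}")
    return DidUrlParts(**match.groupdict())


def is_naked_did(value: str) -> bool:
    """True when no path, query or fragment follows the DID."""
    parts = split_did_url(value)
    return parts.path is None and parts.query is None and parts.fragment is None


print(split_did_url("did:example:12345678abcd/some/path#fragment"))
print(is_naked_did("did:example:12345678abcd"))
```

Every example above shares the same `did` part; only the optional components differ.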

The reason that the WhatWG URL parser does not recognize and parse these the same way as ordinary HTTP/S URLs is that they don't have // after the scheme name.
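
The role of the // can be seen by comparing the two forms side by side. This sketch uses Python's `urllib.parse.urlsplit` as a stand-in, which is an assumption made for convenience: it is a generic URI splitter, not the WhatWG parser, and `example.com` is a placeholder host.

```python
from urllib.parse import urlsplit

http_parts = urlsplit("https://example.com/some/path#fragment")
did_parts = urlsplit("did:example:12345678abcd/some/path#fragment")

# With "//" after the scheme, the host is split off as the authority.
print(repr(http_parts.netloc), http_parts.path)

# Without "//", no authority is found and everything before "#" is the path.
print(repr(did_parts.netloc), did_parts.path)
```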

So it seems we sit on the horns of a dilemma:

We continue to use the term "DID URL" and run the risk that developers will try to treat it (parse it) like a conventional HTTP/S URL.
We introduce a new term like "DID Locator" and run the risk that developers and others will just not understand what we are talking about.

I must admit that, of these two, I actually think the second is the bigger risk. Here's why:

Getting the market to adopt a new term (due to what many will think is a technical detail) is hard. The term "URL" is already well established.
Developers will learn quickly (and we can teach them in the spec) that a DID URL does not parse exactly like an HTTP/S URL.
A DID is already significantly different than an HTTP/S URL, so it is understandable that it might parse differently.
Many people will call it a "URL" or a "DID URL" no matter what we call it.

So my gut is that we should:

Stick with the term "DID URL".
Add a full callout note to the spec that explains what @iherman explains in his opening post, i.e., that a DID URL does not parse as a developer might expect.

philarcher commented 4 years ago

Almost as an aside, I've heard TimBL field the question "if you could go back in time, what would you do differently". The answer I heard was that he regrets the inclusion of both : and // after the scheme. Either one would have been sufficient, both is overkill.

iherman commented 4 years ago

At this point, I have the impression that we're repeating the same arguments all along. And I also believe that this is neither a technical nor an overly important issue; the real issue was a clearer separation of the DID ("naked DID", as @talltree called it) and the DID URL/Locator. This separation is now done in the spec.

I would propose that the chairs (cc: @burnburn @brentzundel) call a vote, e.g., on one of the upcoming calls or even through other means (e.g., mail, GitHub, whatever), and we go with the results. I do not believe that this issue is an essential technical one, meaning that I do not think unanimity is needed (put another way, somebody voting -1 should not be seen as a formal objection), and a simple majority of the votes would be enough to carry it.

talltree commented 4 years ago

Since a thumbs-up on his post doesn't feel like enough, let me second @iherman's suggestion that we just ask the chairs to hold a vote (or conduct a one-person-one-vote poll, as that would make it easier to reach the full WG) to close on this issue. That's often the only way to close on terminology issues like this.

jonnycrunch commented 4 years ago

@iherman

I have a very early draft for a presentation on DIDs, where I have two slides on URIs and DIDs in that respect:

I would just add that I believe a DID is a subset of a URN and not a URI in your venn diagram.

kdenhartog commented 4 years ago

I have a very early draft for a presentation on DIDs, where I have two slides on URIs and DIDs in that respect:
I would just add that I believe a DID is a subset of a URN and not a URI in your venn diagram.

URI is correct. A URN must begin with the scheme "urn" according to the ABNF in RFC 8141.
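
As a rough illustration of that point, the hypothetical helper below checks only the leading scheme, which RFC 8141 treats as case-insensitive; it does not validate the rest of the URN syntax, and the ISBN URN is just a sample value.

```python
def has_urn_scheme(identifier: str) -> bool:
    """Return True if the text before the first ':' is "urn" (any letter case)."""
    scheme, sep, _rest = identifier.partition(":")
    return sep == ":" and scheme.lower() == "urn"


print(has_urn_scheme("urn:isbn:0451450523"))  # starts with the "urn" scheme
print(has_urn_scheme("did:example:123456"))   # starts with the "did" scheme
```

By this test a DID cannot be a URN in the RFC 8141 sense, even though a naked DID may serve a similar, persistent-identifier purpose.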

iherman commented 4 years ago

This issue was discussed in a meeting.

RESOLVED: close issue 218
straw poll - issue 218
Daniel Burnett: https://github.com/w3c/did-core/issues/218
Daniel Burnett: manu could you describe the options of this issue to make sure no recent consensus is missed
Manu Sporny: there’s 3 options please jump in and correct if I’m wrong
… we’re trying to decide what we call terminology (e.g. did-url did-locator did-uri etc)
… we need to call the thing where we tack on parameters something other than the uri
Ivan Herman: I raised the issue because while what we have in the document is correct the harsh reality is the term url has adopted more colloquial usage
… the term did-url may become a source of confusion because of this
… if we want to take seriously that dids could be used in the web I could see confusion coming up
… we may want to think of renaming this to avoid this issue
… on the issue I proposed a strawpoll it’s not a matter of formal objection
Manu Sporny: +1 to strawpoll to make a decision
… I looked at the WHAT-WG definition of a URL
… the one main advantage we may have is that we clarify URL by prefixing with “did”
Brent Zundel: +1 to manu
Manu Sporny: this assists with creating the distinction and is the reason that I don’t think we’re necessarily in trouble with the WHAT-WG definition of the URL
… I looked into if we could fit into the WHAT-WG definition as well
Tobias Looker: my point was similar to manu… I think we need to decide if the prefix is going to assist with association then it’s worth sticking with otherwise we should consider other names
Daniel Burnett: The idea that WHAT-WG gets to define what a URL is a bit crazy. If you were in IETF they would reject that and I think we should not go there.
Tobias Looker: +1 to burn’s point
Markus Sabadello: Yeah I agree with what others have said. mentions authority components and how they affect uris and urls
… our DID URLs don’t have a // double-slash, therefore the method-name and method-specific-id are parsed as the first segment of a path, rather than parsed as an “authority” component. (i’m not proposing to change that!)
Phil Archer: I agree with what a lot of people are saying. I’ve come to the view that did-url is the best we’ll get. I wish it were otherwise but I accept the world as the way it is.
… if we are careful to always use did-url I think it’s ok
Daniel Burnett: Straw poll: DID URL or DID Locator
Manu Sporny: +1 to DID URL
Amy Guy: +1 DID URL
Brent Zundel: DID URL
Kyle Den Hartog: +1 DID URL
Daniel Burnett: +1 to DID URL
Phil Archer: +1 to DID URL
Ivan Herman: +1 to Locator
Markus Sabadello: +1 to DID URL
Tobias Looker: +1 to DID URL
Daniel Burnett: strawpoll is fairly clear, but I will ask again on larger group call
Ivan Herman: combining the minutes and the comments I think we can decide to go with did-url
… since I raised it I’m ok with closing the issue
Daniel Burnett: since you raised the issue, I think it’s fine to close this
Resolution #1: close issue 218
Phil Archer: Would like to record that I stressed the importance of always saying ‘DID URL’ and never just URL
Ivan Herman: and if someone objects then we can reopen the issue
