w3c / did-core

W3C Decentralized Identifier Specification v1.0

https://www.w3.org/TR/did-core/


Can we nuance our mental model on DID control slightly? #233

Closed dhh1128 closed 4 years ago

dhh1128 commented 4 years ago

PR #213 has generated an interesting comment stream, and I think some useful clarity. I am happy to have multiple smart people agree in writing to the concept that a DID can identify anything, because this flexibility seemed to have been excluded by some verbiage I was hearing.

Now I'd like to explore a subtlety around the concept of control. I will frame this in terms of a use case that I'm familiar with in cybersecurity and malware research, but I think you'll quickly see how it might apply to use cases brought up by others.

Malware researchers typically identify malware (viruses, worms, infected or malicious files) by a sha256 hash. The first time a particular sample is seen in the wild, a researcher hashes the sample and goes to virustotal.com or some similar site to see if anybody else has seen it before. If no, the sample is uploaded to the site's DB for all the world to look at. If it is already known, then the researcher has just made a second (or a third, or a tenth) independent discovery.

Now, suppose I wrote a DID method that was all about identifying malware with DIDs. The logical identifier format would be did:mymethod:hash-of-sample. With me so far?
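To make the derivation concrete, here is a minimal sketch. It assumes a hex-encoded SHA-256 digest and reuses the placeholder method name `mymethod`; the function name `malware_did` and the sample bytes are purely illustrative and not part of any registered method.

```python
import hashlib


def malware_did(sample: bytes, method: str = "mymethod") -> str:
    """Derive an identifier from the SHA-256 hash of a sample's bytes.

    Assumption: the method-specific id is simply the hex digest.
    """
    digest = hashlib.sha256(sample).hexdigest()
    return f"did:{method}:{digest}"


# Placeholder bytes standing in for a sample (not real malware).
sample = b"example malicious payload"

# Two researchers who independently obtain the same bytes
# compute the same identifier without coordinating.
alice_did = malware_did(sample)
bob_did = malware_did(bytes(sample))
print(alice_did == bob_did)
```

Nothing in this derivation involves a key or a secret: the identifier is a pure function of the content, which is what makes the control question below interesting.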

Okay, now what are the control semantics?

What I have heard so far is that DIDs are always created by a controller, who can then (even in the genesis DID doc) choose to retain control or give it away (e.g., by specifying no control after the creation transaction). This makes sense for many situations.

However, that doesn't quite fit this scenario, because A) the researcher who reports the malware is never, at any time, in a "control" relationship with the sample's identifier, and would not want to be considered so; B) the identifier cannot have control semantics, even at its genesis transaction, because its derivation mechanism disallows it; C) the identifier doesn't have a DID doc. What's being identified here is content that exists, that is explicitly uncontrollable to begin with. Anybody who discovers the content will discover the same identifier. Two researchers could register the same content on two different systems of record and both would be equally valid and not in conflict.

So my question is this:

Would we be comfortable saying that DIDs can be used to identify such things, too? And if yes (which I hope is an easy answer), are we willing to not describe such a scenario as "the controller creates the DID" but rather "the DID identifies something inherently uncontrollable, so it never has a controller, even during creation; rather, it has a discoverer" (or something to that effect)?

peacekeeper commented 4 years ago

We already have:

Magnet Links (Example: urn:sha1:xxxx)
IPFS addresses (Example: ipfs://xxxx)
Hash URIs (Example: hash://sha256/xxxx)
RFC6920 - Naming Things with Hashes (Example: ni:///sha-256;xxxx)

The last item in that list says in the Abstract: This document defines a set of ways to identify a thing (a digital object in this case) using the output from a hash function.
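As a rough illustration of how little machinery these existing schemes need, the sketch below builds two of the identifier forms from the list. It assumes SHA-256, an empty authority in the `ni` URI, and unpadded base64url for the digest value as RFC 6920 describes; the function names are hypothetical.

```python
import base64
import hashlib


def ni_uri(data: bytes) -> str:
    """Build an RFC 6920 style 'ni' URI (sha-256, empty authority)."""
    digest = hashlib.sha256(data).digest()
    # RFC 6920 encodes the digest as base64url with the padding removed.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni:///sha-256;{value}"


def hash_uri(data: bytes) -> str:
    """Build a hash:// style URI with a hex-encoded SHA-256 digest."""
    return f"hash://sha256/{hashlib.sha256(data).hexdigest()}"


content = b"Hello World!"
print(ni_uri(content))
print(hash_uri(content))
```

Both identifiers are derived from the content alone, with no controller and no associated document.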

I agree that it would be possible to come up with DID methods that define the CREATE and READ operations in very creative ways, and by all means, go ahead. I'm just not sure what the use of DIDs really adds, especially if you don't need controllers and DID documents. You can achieve everything with plain URNs or other identifiers that are much simpler than DIDs, while still being interoperable (since everything is a URI).

BTW as a side note, if I understood @dhh1128 's original argument correctly, I think this thread is not only about DIDs that are cryptographically derived from the subject, but also - more broadly - about DIDs that are objectively observable from the subject in other ways, e.g. atomic numbers of elements in the periodic table. I look forward to reading the DID method specifications and how they define the CREATE operations. How are the DIDs for elements in the periodic table created? Should we make CREATE optional and argue that some DIDs may just "have always existed and are discovered rather than created"? Or did God invoke the CREATE operation for those DIDs? How "decentralized" is that? :)

The more I think about this, the less concerned I am, so please feel free to go ahead!

Just one request regarding did:immutable: I think the method name is not ideal, since it may be misunderstood to mean that these DIDs always refer to the same subject and cannot be reassigned to other subjects. But this is actually a property of all DIDs.

ewelton commented 4 years ago

@peacekeeper I do not think that "elements" work, as they are not immutable digital assets, but rather are "objects in the physical world". I believe that mapping those objects makes sense to always have a point of reference - e.g. a custodial owner.

In other words NIST and CERN might define periodic tables of elements - but there is no way for them to agree upon using an inviolate identifier, although they can provide some "claim" that says "NIST claims that what NIST calls Vibranium is the same thing as what CERN calls Vibranium" - as in

did:<nist> asserts that did:<vibranium-1>, controlled by did:<nist>, is 'the same thing as' did:<vibranium-2>, controlled by did:<cern>

this is a claim that can be refuted when it turns out that their values for the Atomic Weight differ by 0.001 units, and therefore may actually be two independent things, or it can be held "as equivalent" in so far as it is useful.

The downside is that there is no way to talk about "Vibranium" without positing a semantic frame of reference. It may seem that the word works, but "Vibranium" is not deterministically resolvable into anything. I think this is a fundamental property of linking the digital and physical world. Truly immutable objects only exist in the digital world - and when they cease to exist they do so wholesale, not through decay and mutation, which changes the hash.

So I believe that there is a deep requirement that the numeric component is derivable from the content, and as you indicate, there are several options available when dealing with these creatures. There is no compelling reason to push them into DIDs-of-today. However, a while back there was a conflict-free pathway to including this sort of uniquely digital object under the umbrella of did resolution. This was something that is not possible with URNs, but is possible with some other frameworks on the list. Thus, if this were a year earlier, we could reasonably have talked about including a common infrastructure for resolution which would unify the multiple strategies you list, such that resolve(did) was all you ever did - hiding the complexity of 'method of access' along the same lines as we do with did methods today.

However - I absolutely agree that the window for capturing these items in this version of the W3C DID spec has passed - except through some strategy like @kdenhartog suggested, which is interesting in so far as it plays within the rules defined by the central DID authority. ;)

When it comes to a name - I definitely have no strong love of the term immutable - but I am not sure what the correct name is because there is a risk of conveying the idea that they could refer to physical objects - like elements in the periodic table, or abstractions like center of mass, or even numbers - and that would be incorrect. Only objects for which a hash (e.g. urn:multihash:) could be calculated would qualify - because there is a requirement that the DID and the target are deterministically and verifiably linked.

It is this property which can not hold when talking about the physical world, and requires the cryptographic binding and proof of control that DIDs establish.

pknowl commented 4 years ago

@peacekeeper @ewelton This method is specific to a non-governed immutable object. Does that trigger a method name that you would be happy with? Open to suggestions but we're confined to those three words.

kdenhartog commented 4 years ago

Let me start off by saying that I don’t think the purpose of DIDs is to encroach on every other identifier. I think there are clearly understood concepts that we have no normative statements to enforce, and that’s what my proposal intended to show, though I didn’t do a great job of conveying it.

For example, I believe that did:atom:carbon is valid today as I read the spec even though I don’t think it should be. The rules around method-specific identifier generation seem to set some mental model assumptions, but they don’t go so far as to explicitly say this is not possible. I believe this should be changed. I think if we’re changing normative statements for this method, it should be to further constrain what method-specific identifiers can be right now so that human-friendly identifiers are less likely to occur. I'm not trying to create an all-encompassing identifier, because as @burnburn pointed out this is going to make it much harder to get the spec across the line. How we make that testable is going to be the hard part, and that’s what I think will turn philosophical questions of the abstract into concrete text that strengthens the purpose of DIDs.

The reason I believe we should be doing this is because I think this will be what brings teeth to the claim “decentralized”.

Additionally I think the purpose of this did method should be used to further explore and show what’s the difference between DIDs and URNs. One of the main aspects in my mind that differentiates the two would be resolvability of metadata.

Where the metadata discussion goes and the normative statements that come out of it could be another place we limit this method. If our metadata approach ends up very extensible, then that will enable more functionality in this did method. On the other hand, if we decide to heavily restrict metadata in the did document as @jandrieu has suggested, I think what I originally described will be the extent of what this did method can cover.

So to make my position clear on this, I find the idea of cryptographically bound content identifiers fitting within this spec either way. I think it fits clearly in our mental model as it stands today. What I disagree with is that we should be creating an all encompassing identifier that can do anything and everything, but as it stands today the spec allows me to do that. If others disagree with that as well we need to start adding normative statements to enforce this now. The best way to do that will be to push the boundaries with conversations like this AND with concrete proposals for changes to the spec. Otherwise it’s inevitable for someone to come along in five years and define did:atom because of some extraneous reason they decided dids were the best way to do this.

pknowl commented 4 years ago

The maximum number of letters in any other DID method on the registry is 7. On that basis, I propose that we shave off a couple of letters. I think did:immutab: works perfectly well both in meaning and visually.

kdenhartog commented 4 years ago

The maximum number of letters in any other DID method on the registry is 7. On that basis, I propose that we shave off a couple of letters. I think did:immutab works perfectly well both in meaning and visually.

I'd prefer something more along the lines of did:cid or did:hash personally. I think it doesn't cause the confusion that Markus highlighted above around the subject being mutable. However, I don't think this would support update functionality because the controller abandons control at the time of publishing if we went in the direction of my proposal.

pknowl commented 4 years ago

I like did:hash:. I think we should avoid any mention of id in the method space as that is already explicit in did:.

msporny commented 4 years ago

I really like did:hash:

Please, no... imprecise and do you see what Github turns it into, above. :)

Name it something more precise... like did:cid (content identifier) as @kdenhartog mentioned above, or did:chash (content hash)

pknowl commented 4 years ago

did:chash: (content) or did:ohash: (object) ?

Going back to the original definition ... An "object identifier" is an identifier that contains a hash of the content of an object.

In that definition, "hash of the content" is a function and the "object" is a target.

Does the DID method type usually depict a function or a target? That could help steer the final naming decision.

@ewelton may have an opinion here also. He is an expert in physical / digital convergence.

ewelton commented 4 years ago

@pknowl Actually - I'm not so sure we need to head out and name this method quite yet. This has been an interesting exercise, but I am not quite sure that a method is ready for prime time.

I still am unclear about two things:

how is control surrender enforced
how does resolution work (e.g. what is the relationship w/ an underlying registry)

I would like to see some clear use cases of these as dids, instead of using the alternatives such as those mentioned by @peacekeeper, which are adequate to the task at hand.

I also want to be clear that the intention was never to simply replace existing identifier schemes for the sake of replacement. The appeal of content-bound identifiers made sense in pre-WG DID days as part of a larger effort to support verifiable identifiers and

avoid tech-stack siloing at the application programming layer (such as "this is an IPFS project")
remove the hidden dependence on DNS for resource resolution

If there is some clarity on how this method would work in terms of resolution, and how we could guarantee that the registered object fingerprint is no longer under the control of anyone (which may not be possible using DIDs) then a method is warranted and could be valuable. Once those are answered, a name might suggest itself, and without answers to those questions I would suggest not making a DID method.

The outcome we want to avoid is adding a method to the pile that looks like it does one thing, but in fact does another. If all we do is create a "regular DID" with an additional field, that is insufficient. The method space is too large relative to what DIDs are today, and we should resist adding to that pile without a strong value proposition.

pknowl commented 4 years ago

Thanks, @ewelton . That is a sound argument but, going back to my original argument of DIDs for everything in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating.

@kdenhartog - Are you able to answer Eric's first question ...

1.) How is control surrender enforced?

@mitfik - Are you able to answer Eric's second question ...

2.) How does resolution work (e.g. what is the relationship w/ an underlying registry)?

Let's hammer that out before coming back to method naming.

Just so I don't have to scroll back later on, can someone also give me a definitive answer on whether a DID method type should depict a function or a target? Thanks.

jandrieu commented 4 years ago

We've said a lot of words here. I have tried to keep this brief (and failed). HOWEVER, I am responding with a different illustration of what I see as the defining mismatch between content-based identifiers and DIDs.

This thread has shifted my sense of how we communicate what a DID is. Regardless of whether we adopt this new kind of DID as something we, as a standards effort, want to incorporate, we should definitely update the language in the spec so the mismatch can be minimized for future readers. People have a hard time understanding how DIDs do what they do, which is vital to understand if they are appropriate for a given reader's needs. However the technical questions resolve, we definitely have a documentation problem.

Here's what clicked for me as I was trying to understand how we are talking past each other.

DIDs are a framework for cryptographically proving control over an identifier without relying on a trusted third party.

This is what's new. This is what's different.

This proposal to "nuance" our mental model abandons that and would create a new class of DID which is essentially uninteroperable with other DIDs. I'll call these CIDs for content identifiers, which have all the characteristics described by others. As I've stated many times, they sound awesome. They will be useful. It makes sense to standardize a way to use them.

Consider the use cases document: https://w3c.github.io/did-use-cases/

First, two of the first four essential characteristics of DIDs are not met by CIDs:

3. cryptographically verifiable: it should be possible to prove control of the identifier cryptographically;

4. resolvable: it should be possible to discover metadata about the identifier.

3 is not met because the hash provides NO way to demonstrate control. It only demonstrates knowledge of the associated content.

4 is not met because there is no derivable meta-data about the identifier. A CID has no mechanism to lead you to additional details that would allow the core functionality that defines DIDs. In particular, there is no way to bootstrap a control framework just from a hash.

Maybe I'm missing something on #4, but to my understanding, revealed knowledge cannot establish control in the way that secret knowledge can. If you must reveal the knowledge to satisfy the cryptography, as you do with hashes, you cannot prove anything cryptographically without ceding equivalent control to the recipient of the proof. It leaks control and therefore isn't suitable as a control framework.
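A small sketch of that leak, assuming a plain SHA-256 content identifier. The names `content_id` and `check_knowledge` and the three parties are illustrative only.

```python
import hashlib


def content_id(content: bytes) -> str:
    """Hex SHA-256 digest used as the content identifier."""
    return hashlib.sha256(content).hexdigest()


def check_knowledge(identifier: str, revealed: bytes) -> bool:
    """Verifier-side check: does the revealed content hash to the identifier?"""
    return content_id(revealed) == identifier


content = b"some published document"
cid = content_id(content)

# Alice demonstrates her link to the identifier the only way a hash
# allows: by revealing the content itself to Bob.
bob_received = content
alice_accepted_by_bob = check_knowledge(cid, bob_received)

# Bob now holds everything Alice presented, so he can replay it to
# Carol and pass exactly the same check.
bob_accepted_by_carol = check_knowledge(cid, bob_received)

print(alice_accepted_by_bob, bob_accepted_by_carol)
```

The check cannot tell Alice from Bob, because the only input it consumes is the revealed content. A signature over a fresh challenge does not have this problem, since the signing secret never leaves the prover.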

Second, of the 13 actions enabled by DIDs, only the first two are supported by CIDs:

3.1 Create
3.2 Present
3.3 Authenticate
3.4 Sign
3.5 Resolve
3.6 Dereference
3.7 Verify Signature
3.8 Rotate
3.9 Modify Service Endpoint
3.10 Forward / Migrate
3.11 Recover
3.12 Audit
3.13 Deactivate

CIDs can't be used to Authenticate, Sign, Resolve, Dereference, Verify Signature, Rotate, Modify Service Endpoint, Forward / Migrate, Recover, Audit, or Deactivate.

Third, the reason DIDs are useful in decentralized identity is precisely because of the ability to demonstrate control. Not because they identify only a particular class of thing or because they can disambiguate anything.

(FWIW, even @dhh's second definition of disambiguate wrt Alice's definition is unknowable and unprovable. Because people other than Alice can use the DID as a subject without getting confirmation from Alice that they are using it in the way that she means it. And even if they did, there is still the risk of semantic drift as Alice's sense of what she means evolves over time.)

The way DIDs bootstrap digital identity, in the most typical use case where Subject==Holder==Controller (whether or not the issuer is identified by DID) is as follows:

Two stages.

First, you get the credential. Stage I

You onboard at an issuer--they first prove who you are to their satisfaction.
You prove control over a given DID (often called DID-AUTH) using the secret associated with the cryptographic material specified in the DID Document
The issuer generates a VC with that DID as the subject and gives it to you, signing it in a provable manner.

Second, you use the credential. Stage II

A Verifier presents a challenge in a request for a credential
You construct a Verifiable Presentation which includes both the challenge and the VC, signed by the same secret material used to prove control in Stage I.2
The Verifier checks that the holder and subject are identified by the same DID.
The Verifier checks that the presentation (with the challenge) is signed with secret material indicated by implication in the DID document. Most commonly, the VP is proven to be signed by a private key that matches a public key in the DID Document.
The Verifier checks that the signature of the credential matches the known cryptographic material from the issuer (this can be from a DID Document or from any other pre-arranged mechanism to exchange keys or the like).

At this point, the Verifier knows that the current presenter of the VC has proven control over the same secret information as the subject, and therefore, with a specific level of assurance they can accept that the current presenter is one of the following:

the Subject
a delegate of the subject with cryptographic authorization (someone who has control over a proof mechanism listed in the authentication section of the DID Document or who simply has been given the private keys of the Subject for this purpose)
a bad actor who has compromised the keys (or proof mechanism) of the Subject

We always have to allow for #3. That's the weakness in the system. However, the entirety of modern cryptography has this weakness, which is why keys MUST be kept secret if they are to have any use whatsoever.

It is the ability to perform this proof of control that ties the issuance of a VC to its presentation so that a Verifier can have some proof that the party presenting the credential is, in fact, the entity given that credential, which to the best knowledge of the issuer was believed to be the subject of that credential.
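The Stage II checks can be sketched roughly as follows. This is an illustration under loud assumptions, not how any DID method or VC library does it: the `Credential` and `Presentation` shapes are invented for the example, and signature checking is injected as a function. To keep the sketch runnable on the standard library, the usage part fakes signatures with HMAC and a lookup table of per-identifier keys. A real deployment would use public-key signatures with keys taken from the DID Document, so that the Verifier never holds the signing secret.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Credential:
    issuer: str
    subject: str          # the DID the credential is about
    claim: str
    proof: bytes = b""    # issuer's signature over body()

    def body(self) -> bytes:
        return "|".join((self.issuer, self.subject, self.claim)).encode()


@dataclass(frozen=True)
class Presentation:
    holder: str
    challenge: str
    credential: Credential
    proof: bytes = b""    # holder's signature over body()

    def body(self) -> bytes:
        return self.challenge.encode() + b"|" + self.credential.body()


# (signer identifier, message, signature) -> is the signature valid?
SignatureCheck = Callable[[str, bytes, bytes], bool]


def verify_presentation(vp: Presentation, expected_challenge: str,
                        check: SignatureCheck) -> bool:
    vc = vp.credential
    return (
        vp.challenge == expected_challenge             # answers our challenge
        and vp.holder == vc.subject                    # holder and subject match
        and check(vp.holder, vp.body(), vp.proof)      # presentation proof
        and check(vc.issuer, vc.body(), vc.proof)      # issuer's credential proof
    )


# --- Toy stand-in for real signatures (ASSUMPTION: HMAC with shared keys) ---
TOY_KEYS = {
    "did:example:issuer": b"issuer-secret",
    "did:example:alice": b"alice-secret",
    "did:example:mallory": b"mallory-secret",
}


def toy_sign(signer: str, message: bytes) -> bytes:
    return hmac.new(TOY_KEYS[signer], message, hashlib.sha256).digest()


def toy_check(signer: str, message: bytes, signature: bytes) -> bool:
    if signer not in TOY_KEYS:
        return False
    return hmac.compare_digest(toy_sign(signer, message), signature)


def issue(issuer: str, subject: str, claim: str) -> Credential:
    unsigned = Credential(issuer, subject, claim)
    return Credential(issuer, subject, claim, toy_sign(issuer, unsigned.body()))


def present(holder: str, vc: Credential, challenge: str) -> Presentation:
    unsigned = Presentation(holder, challenge, vc)
    return Presentation(holder, challenge, vc, toy_sign(holder, unsigned.body()))


vc = issue("did:example:issuer", "did:example:alice", "over-18")
good = present("did:example:alice", vc, "nonce-123")
stolen = present("did:example:mallory", vc, "nonce-123")
print(verify_presentation(good, "nonce-123", toy_check))
print(verify_presentation(stolen, "nonce-123", toy_check))
```

Alice's presentation should be accepted, while Mallory presenting Alice's credential should be rejected because the holder does not match the subject. A bare content hash offers nothing to plug in for the holder's proof, which is the gap described next.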

You could, of course, use a third party to demonstrate proof of control. You just ask Facebook who they believe is the current presenter. They'll use their own authentication approach then present their result. The whole point of DIDs is to enable this sort of bootstrapping of verifiability WITHOUT relying on the likes of Facebook. That's what makes DIDs unique and valuable.

CIDs can't be used in this fashion. As such, they just don't do--CAN'T DO--the fundamental thing that DIDs were created to do.

Yes, we can attempt to interpret the "decentralized" part of the DID name in the hope of supporting all the kinds of identifiers that can be rigorously created without a trusted third party, but, when we can't even agree on the meaning of the word "decentralized", that seems like a particular kind of madness. No offense to @dhh1128 @pknowl @ewelton or any other proponents of this idea. It's just that shoehorning an incompatible, non-interoperable notion of DIDs because of lexical similarity with an ill-defined term just doesn't stack up for me.

That said, I do like CIDs. They have been implemented as URNs in several forms from urn:hash to urn:sha. The particular variation proposed here might deserve its own namespace, such as urn:cid or perhaps if it builds on multihash, urn:multihash.

However, since

you can't use CIDs to perform proof of control to bootstrap decentralized identity in the way described above
CIDs lack 2 out of 4 "essential" characteristics of a DID
you can't use CIDs to perform 11 of 13 actions of DIDs as captured in the Use Cases and Requirements document

I can't help but come to the conclusion that CIDs are not DIDs.

If it doesn't look like a duck and doesn't quack like a duck, it's probably not a duck.

It might be a bird. It might taste delightful when prepared in the Peking style, but it still probably isn't a duck.

ChristopherA commented 4 years ago

The did:o: identifiers would not sit in any identity registries.

There is some precedent for this. The DNS RFCs specifically exclude the .onion root domain (and a few others) from fully complying with the DNS standard. See specifically https://tools.ietf.org/html/rfc7686

-- Christopher Allen

jandrieu commented 4 years ago

I'm sorry, is the proposal here to have a did:o namespace that then has multiple methods underneath it?

For example

did:o:sha:123...
did:o:multihash:abc...
did:o:myHash:xyz

Is that what you're suggesting @pknowl?
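For what it's worth, here is how I understand such identifiers would split under the generic DID syntax, `did:<method-name>:<method-specific-id>`, where the method-specific id may itself contain colons. This is a minimal sketch: `split_did` is a hypothetical helper, it checks only the method name (lowercase letters and digits), and it does not validate the characters of the method-specific id.

```python
import re
from typing import Tuple

# Method names are restricted to lowercase letters and digits.
_METHOD_NAME = re.compile(r"[a-z0-9]+")


def split_did(did: str) -> Tuple[str, str]:
    """Split a DID into (method name, method-specific id)."""
    scheme, sep, rest = did.partition(":")
    if scheme != "did" or not sep:
        raise ValueError(f"not a DID: {did!r}")
    method, sep, specific_id = rest.partition(":")
    if not sep or not specific_id or not _METHOD_NAME.fullmatch(method):
        raise ValueError(f"malformed DID: {did!r}")
    return method, specific_id


for example in ("did:o:sha:123", "did:o:multihash:abc", "did:o:myHash:xyz"):
    print(split_did(example))
```

Read this way, all three would belong to a single method `o`, and the `sha` / `multihash` / `myHash` prefix would be structure that the method itself has to define inside its method-specific id, rather than separate methods.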

ewelton commented 4 years ago

@jandrieu I want to clarify - I am not a proponent of adding content based identifiers into the current model of DIDs. This is because of the two reasons I enumerated - lack of solution to resolution, and no way to fully "surrender control" - and "reproducing" simple urns but calling them DIDs is silly - and besides did:o:sha:123 doesn't assist resolution at all, because it is missing location information.

One of the mistakes made in the DID model is the strange handling of resolution - DIDs contain some location information but rely on a bunch of secret hidden magic to make them resolvable. Resolution is critical, and leaving it out of scope is just part of what I consider "a long series of mistakes" beginning around mid 2019.

Current DIDs have become defined the way you define them as the result of evolution of the community. DIDs were more open to flexibility and interpretation in the past. Alternative approaches to DIDs lost out in the sea of privacy, control, and decentralization voices - and that is fine. The rubric idea became myopically focused on decentralization, so we lost most of the structure for navigating the alternatives. The use cases became focused on what I consider a niche world. The collapse of semantic flexibility meant we got onto the road of "the one true DID"

So, to be clear - I believe that there are legitimate use cases for these sorts of "non-controlled" and "verifiable" content-based identifiers. And I believe that 1 year ago would have been a great time to sweep them into DIDland so that we could build them into the resolution infrastructure. And I believe that the flexible semantics we had 1 year ago gave a very clean path to model this larger landscape inclusively and to the benefit of the global community.

However, as of today, DIDs are more focused - they are a much more specific thing, and that means that a spec will be produced and we'll get some nifty tools out. It also means that I think that getting these sorts of capabilities into the DID landscape, for the goals @pknowl identifies, might not be viable today - the window has closed and it is time to work with the DIDs we have, not the DIDs we want. Maybe there is a way to shoehorn them into the authoritative model of DIDness, but it will take a cleverer person than me to do it.

Don't get me wrong - there has been a lot of great work and thought behind DIDs-of-today - but DIDs are neither revealed truth nor natural law, they are the result of a negotiated specification that reflects the loudest and most energetic voices. Since those have focused on privacy paternalism, control, anti-correlation, and a particular interpretation of decentralization - that is what we have. I am excited to see a lot of the work that is going on, but these DIDs are just not that relevant to my use cases - there are alternatives which I can use today to deliver "improved-sovereignty" and "improved government and business processes" through the use of non-DID grounded credentials and capabilities. When DIDs are mature and in broad adoption, it will be easy to incorporate them into my world and further improve sovereignty - and I am looking forward to that.

What makes DIDs strong for some people, make them weak for others - and that is normal. What is most important is that the spec stabilizes and is released. There is always room for adaptation in the next round of specs, and via alternative specs - so I support this effort to the extent that it does not derail or retard the delivery of a clear specification - whatever it winds up saying.

pknowl commented 4 years ago

Many thanks for pointing me to that link, @ChristopherA . Very much appreciated.

@jandrieu - For our purposes, we're not interested in location, we just need to know that the content is immutable. Perhaps resolution characteristics and MIME-type would be held in the associated DID document. I would expect the did:o: namespace to be very simple ...

did:o:<hashofcontent>

For example, if a non-governed object were moved from Drive A to Drive B, the identifier should remain the same even though the location has changed.

@mitfik will certainly have some deeper insight into requirements and resolution.

pknowl commented 4 years ago

@ewelton - I'm also acutely aware that if we get the naming convention right at this stage for non-governed objects, the Semantics side of the model would remain stable despite the release of future versions of the DID specification. This is just as much about sustainability to the network going forward as it is to non-governed objects requiring a stable identifier under the DID umbrella.

ChristopherA commented 4 years ago

Actually, the precedent of allowing for some “special purpose domains” that do not need to fully adhere to the DNS RFCs is described more fully in Section 3 of RFC 6761.

https://tools.ietf.org/html/rfc6761#section-3

The .onion domain RFC https://tools.ietf.org/html/rfc7686 describes more why this top level domain meets the criteria.

I’d like to suggest that we support a similar carve out (like in RFC 6761) for how to register a “special purpose method”, but specifically do not add to our agenda to tackle specifying the nature of any such method.

This allows the did:o, etc. people to proceed with their ideas, and allows others who do not meet the full criteria of the 1.0 standard to still be able to experiment.

We could begin by registering those methods that don’t support full CRUD, marking them as “special purpose method” in the registry; the method only has to show why it qualifies as such a method.

— Christopher Allen

ewelton commented 4 years ago

@ChristopherA That does seem like a particularly useful way of sorting out some of the "stranger" methods, and perhaps keeping the door open a crack for at least playing around with novel ideas. If some of those ideas catch hold, they could make it into a future version of the spec itself - but they do not have to challenge the progress achieved by focusing DIDs, and they do not need to distract by requiring additions to the use cases.

+1 !

peacekeeper commented 4 years ago

@kdenhartog

For example, I believe that did:atom:carbon is valid today [..]

I agree with your comment.

Just wanted to point out that there's an interesting difference between did:atom:carbon and did:atom:6. In the second example ("6"), the identifier is an "intrinsic property that is objectively observable" (quoting @dhh1128 here), whereas in the first example ("carbon"), that is not the case.

dhh1128 commented 4 years ago

I've gone quiet on this long thread that I started, but I wanted to say thank you to all the smart people who chimed in.

Re. the final pair of comments from @kdenhartog and @peacekeeper : yes to the distinction Markus was trying to highlight. When you have a property that is objectively observable as the basis of an identifier, and everybody knows what property to look for, then you have the interesting phenomenon that multiple observers will automatically be led to agree on the identifier for the object -- even for new objects not yet discovered. This has some very desirable benefits in a decentralized ecosystem. Perhaps Joe is right that this doesn't belong inside the DID umbrella; I'm content to let consensus rule, but I just wanted to make the strongest case I could for it.

As the original opener of the issue, I am happy enough with the ensuing discussion to let it be closed now. But we can also keep it open longer if procedure or the preferences of others pushes us that way.

peacekeeper commented 4 years ago

I think for those who would like to update the mental model in ways that have been discussed in this thread, a concrete next step would be to:

Propose that the "create" operation be made optional, just like a while ago we made "update" and "deactivate" optional, OR:
Demonstrate in some draft version of a DID method spec how the "create" operation would be defined.

talltree commented 4 years ago

@peacekeeper A DID using this method-to-be-named would still have a definition of the Create operation, no? It's just that the Create operation in the DID method spec would describe the special way in which DIDs using this method are created.

RE naming, I thought the original proposal was for DIDs using this method to use the multihash format. If so, why not just call it did:multihash:?
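
For readers unfamiliar with the multihash format, the sketch below shows roughly what such an identifier could look like. It assumes SHA-256 (multihash function code `0x12`) and, purely for brevity, hex encoding of the result; the `did:multihash:` prefix is the name floated in this thread, not a registered method, and a real method spec might well choose a different text encoding.

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash function code for sha2-256


def multihash_sha256(content: bytes) -> bytes:
    """Return <function code><digest length><digest> for SHA-256.

    Both prefix values are below 128, so each fits in a single
    varint byte and can be written directly.
    """
    digest = hashlib.sha256(content).digest()
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def did_multihash(content: bytes) -> str:
    """Build a hypothetical did:multihash identifier (hex-encoded here)."""
    return "did:multihash:" + multihash_sha256(content).hex()


print(did_multihash(b"hello"))
```

Because the hash function is named inside the identifier, a method like this could move to a stronger hash later without changing the method name.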

pknowl commented 4 years ago

@talltree I'm keen to name this method type did:o:, a name that can be cast in stone unhindered by future revisions to DID specifications and methodology. An "object is an object" so why not be bold from the outset.

The other argument for sticking with the "O" method type is that there will be a huge number of these identifiers woven into the fabric of the decentralized network. 50% of all identifiers (i.e. anything non-governed within the data capture side of the model) will contain this method type. To help people digest, adopt and ultimately scale this new identifier type, users could simply refer to them as "DID-Os".

peacekeeper commented 4 years ago

+1 to did:multihash over both did:immutable and did:o. The method name should be a hint to how the DIDs are created and resolved, rather than indicating what is being identified.

I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed.

pknowl commented 4 years ago

@peacekeeper I suppose the method name should reflect how the community sees the DID space evolving. I, for one, hope that the argument for the development of did:e: (entity identifiers) and did:o: (object identifiers) will be supported by the DIDWG in the future. I'm not saying we need to get there tomorrow but, now that a light has been shone, it will be difficult to ignore.

We have a rare opportunity to name the object identifier correctly right off the bat whilst hinting at an elegant DID syntax evolution for the future. Why wait for governed identifiers to align to the methodology? If the identifier name is set to did:multihash:, it will inevitably have to be renamed to did:o: in the future.

If I'm missing something and did:multihash: will simply be easier to get over the line for DID v1.0 then I'll concede for the greater good but that shouldn't stop the DIDWG from investigating did:e:/did:o: further upstream in a bid to resolve the potential method-type scaling issue highlighted in this thread.

pknowl commented 4 years ago

@mitfik has just messaged me saying that he has a feeling that a non-governed object identifier may need to contain more than just a simple 'multihash'. On that note, I propose that the community hold off on a casting vote until the tech guys have had a chance to further investigate what identifier characteristics should be included.

ewelton commented 4 years ago

@peacekeeper

I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed.

This is critical as I see it, because it is the presence of a controller that defines the semantic space within which the identified exists. I see that as a key strength of controlled DIDs. When you and I talk about the same thing using different DIDs, the only way we can coordinate is by presenting evidence from attached and found information - external claims, credentials, and the like which are linked to the controlled document. That is very valuable, however....

The reason these were of interest was that, like urn:multihash:1234, there is a restriction on what is identified - namely that which can be hashed. It is this property that allows them nearly zero semantic ambiguity - down around 1 in 2^80 or above range - tweakable by the hash, of course. This means that we can talk about the same thing, using an identifier, without pinning it on a negotiation.
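
As a rough illustration of how the ambiguity is "tweakable by the hash", the sketch below uses the standard birthday approximation to estimate the chance that any two distinct objects end up with the same identifier. It assumes an ideal hash with uniformly distributed output; the function name and the example sizes are chosen for illustration and are not figures from this thread.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes an ideal hash whose outputs are uniformly distributed.
    """
    exponent = -(num_items * (num_items - 1)) / (2 ** (hash_bits + 1))
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# A trillion hashed objects under a 256-bit digest:
print(collision_probability(10**12, 256))

# The same estimate for a much smaller 80-bit identifier space:
print(collision_probability(10**12, 80))
```

The estimate only covers accidental collisions; deliberate collision attacks depend on the specific hash function and are a separate question.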

This is useful, for example, when pointing to a credential schema or context or other primitive from which one scaffolds deterministic processing in a decentralized data economy - it provides an "open authority" without simply using DIDs to create "a new root of central authority." I find the concept of a Bitcoin Anchored Semantic every bit as Centrally Controlled as schema.org.

Hashlinks give us a lot of the power needed - and in particular they give us the thing that is missing from simply using did:whatnot:<hash> - namely, hints about location and thus a pathway to resolution. What nothing gives us yet is a specification about what sort of descriptor could come back, and that definitely has value - giving programmers a coordination point that was not bound to specific implementations, but bound to the concept of uncontrolled, self-certifying identifiers.

I also remain concerned about the maintenance of hidden control - the 'create' method would effectively be a 'register' method - but register it in what infrastructure? - which gets, again, to resolution. And it is the infrastructure of the registry which defines the possibility of true "surrender of control" vs. "Good Samaritan waiving" - I think it makes sense to wait to name this concept until those elements are clear:

how does create/register work
how does read work
how is control surrender enforced

if we can not do these, then we have defined something equivalent to regular DIDs with a claim "this DID that I control is about urn:multihash:1234" - and those DIDs are fine, but they can not be the foundation for scaffolding semantic processing on a decentralized data economy - for that we need a decentralized identifier with broader capabilities than DIDs.

kdenhartog commented 4 years ago

I think for those who would like to update the mental model in ways that have been discussed in this thread, a concrete next step would be to:
* Propose that the "create" operation be made optional, just like a while ago we made "update" and "deactivate" optional, OR:

* Demonstrate in some draft version of a DID method spec how the "create" operation would be defined.

I'd say there are probably a few things we could take from this thread as well to make as additions to the DID Core spec. Some of the arguments against this method have pointed to a few things that are left as tribal knowledge that I'm wondering if we could get normative, testable statements for.

For example, one of @jandrieu's points struck me as pretty strong: on creation of a DID, it SHOULD (could be upgraded to MUST) be possible to prove limited control of the identifier via a cryptographic mechanism.

Another one I've been toying around with is the idea of a minimum number of possible namespace entries. E.g. the method specific identifier must be able to identify at least 2^80 unique identifiers. I'm not sure this really adds much enforcement to the idea of the identifier not needing an authority to authorize access to the namespace.

I also like @ewelton's point about adding at least non-normative statements, and normative statements if possible, around surrendering control, because I feel that was part of the crux of what makes this possible.

@peacekeeper do you have any ideas around other things that might be worth adding for this?

kdenhartog commented 4 years ago

Thanks, @ewelton . That is a sound argument but, going back to my original argument of DIDs for everything in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating.

@kdenhartog - Are you able to answer Eric's first question ...

1.) How is control surrender enforced?

It's surrender at the point of creation by the intrinsic nature of the method. In other words, control of the knowledge is all that's necessary to create the method. Representation and proof of control is unnecessary after creation, just as it's unnecessary after all keys have been revoked in all other methods.
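
A minimal sketch of what "no proof of control after creation" could mean in practice: anyone holding the content can check it against the identifier, with no keys or controller involved. It assumes a hypothetical identifier of the form `did:<method>:<sha256-hex>`; the function name is invented for illustration.

```python
import hashlib


def matches_identifier(did: str, content: bytes) -> bool:
    """Check that content hashes to the method-specific part of the DID.

    Assumes a hypothetical did:<method>:<sha256-hex> layout. No key
    material is consulted; the content itself is the only evidence.
    """
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did":
        return False
    expected = hashlib.sha256(content).hexdigest()
    return parts[2] == expected


content = b"some immutable object"
did = "did:mymethod:" + hashlib.sha256(content).hexdigest()

print(matches_identifier(did, content))      # content matches
print(matches_identifier(did, b"tampered"))  # content does not match
```

This says nothing about where the content is registered or how it is resolved, which are the open questions raised earlier in the thread.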

kdenhartog commented 4 years ago

I'm sorry, is the proposal here to have a did:o namespace that then has multiple methods underneath it?

For example
did:o:sha:123...
did:o:multihash:abc...
did:o:myHash:xyz
Is that what you're suggesting, @pknowl?

I hope not, that makes the method name even more likely to centralize around a naming authority.

kdenhartog commented 4 years ago

I've gone quiet on this long thread that I started, but I wanted to say thank you to all the smart people who chimed in.

Re. the final pair of comments from @kdenhartog and @peacekeeper : yes to the distinction Markus was trying to highlight. When you have a property that is objectively observable as the basis of an identifier, and everybody knows what property to look for, then you have the interesting phenomenon that multiple observers will automatically be led to agree on the identifier for the object -- even for new objects not yet discovered. This has some very desirable benefits in a decentralized ecosystem. Perhaps Joe is right that this doesn't belong inside the DID umbrella; I'm content to let consensus rule, but just wanted to make the strongest case I could for it.

As the original opener of the issue, I am happy enough with the ensuing discussion to let it be closed now. But we can also keep it open longer if procedure or the preferences of others pushes us that way.

It looks like the author of this issue feels satisfied by the discussion that occurred. Next steps for this can go one of two ways (potentially both), I would guess. @mitfik, @pknowl, and I can draft a strawman DID method to explore what these immutable, surrender-control-on-creation DIDs would look like, or we can begin to propose language to constrain what DID methods are possible.

Any opinions on which way to go?

pknowl commented 4 years ago

Thanks, @kdenhartog . I believe this is now in the capable hands of @mitfik and a couple others in the HCF tech group to start working on a strawman/draft spec. The workload has suddenly gone through the roof at this end which is why this stream has slowed down. That said, I think we have everything we need for now.

kdenhartog commented 4 years ago

I propose we close this issue then, since the DID method can be shared via the DID method registry. Any objections?

brentzundel commented 4 years ago

No activity since marked pending close, closing.