w3c / did-core

W3C Decentralized Identifier Specification v1.0
https://www.w3.org/TR/did-core/

Clarification on what DIDs might identify #199

Closed brentzundel closed 3 years ago

brentzundel commented 4 years ago

I think there is agreement in the group that DIDs may be used to identify entities such as people, agents, IoT devices, etc. What about using DIDs to identify things that are incapable of interacting using cryptographic keys? What if there is no controller property or public keys at all? Could a DID document simply contain, or point to, some content?

Could the following have DIDs?:

Apologies if this is a re-hash of an already concluded conversation. I searched but did not find.

jandrieu commented 4 years ago

There are two ideas here.

The first, yes, one should be able to use DIDs to identify subjects like the moon, etc., which is especially useful for VCs.

However, using a DID document to publish information about the subject is an anti-pattern I actively discourage. IMO, a DID document should only contain information needed for secure interactions with the subject. We can't prevent people adding assertions that the moon is made of cheese, but it's really a violation of the moon's privacy, so let's not encourage that.

iherman commented 4 years ago

@jandrieu, just to clarify, though: while I agree that assertions about the details of the moon are an anti-pattern, it is acceptable to add a reference (I guess as a service) to a home page on the Web authoritatively describing the subject. (I use 'authoritatively' in a loose sense; e.g., a reference to my personal Web site.)

dlongley commented 4 years ago

I agree with @jandrieu.

@iherman,

just to clarify, though: while I agree that assertions about the details of the moon are an anti-pattern, it is acceptable to add a reference (I guess as a service) to a home page on the Web authoritatively describing the subject.

In principle, yes, but the same PII rules apply. In my view, which statements can/should be made about a DID subject in a DID Document largely depends on who can resolve the DID to get the DID Document -- and on the nature of the system that stores these statements, including how it "forgets" such statements when desired. These two qualifiers could perhaps be summed up by saying "it depends on the DID method".

For example, Veres One intends to disallow PII of this sort to be stored on its ledger, instead preferring to use a privacy-preserving "ProxyService" endpoint. The proxy service endpoint would be expressed in the DID Document returned when resolving the DID, but it would not reveal PII about the DID subject itself. The service could, however, be used to retrieve more statements about the DID subject in a way that supports deletion and, potentially, authorization. It is here that additional services that may contain PII such as what you mentioned (home page on the Web) would be expressed. Either way, all statements about the DID subject (whether retrieved from the DID Document or an extension of it via the ProxyService) can be collapsed into a single graph of statements about the DID subject -- and then passed to an application for consumption.
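
As a rough illustration of the last point, the sketch below merges the statements in a resolved DID Document with statements fetched through a proxy service into one set of statements about the subject. Only the "ProxyService" type name comes from the comment above; the DID, the endpoint URLs, the "HomePage" service type, and the fetch_proxy_statements helper are hypothetical, and this is not Veres One's actual API.

```python
from typing import Any, Dict

# Hypothetical on-ledger DID Document: no PII, only a proxy endpoint.
DID_DOCUMENT: Dict[str, Any] = {
    "id": "did:example:123",
    "service": [
        {
            "id": "did:example:123#proxy",
            "type": "ProxyService",
            "serviceEndpoint": "https://proxy.example/did:example:123",
        }
    ],
}


def fetch_proxy_statements(endpoint: str) -> Dict[str, Any]:
    """Stand-in for an off-ledger lookup. A real service could require
    authorization and could delete these statements on request."""
    off_ledger = {
        "https://proxy.example/did:example:123": {
            "id": "did:example:123",
            "service": [
                {
                    "id": "did:example:123#homepage",
                    "type": "HomePage",  # hypothetical type name
                    "serviceEndpoint": "https://alice.example/",
                }
            ],
        }
    }
    return off_ledger.get(endpoint, {})


def collapse(did_document: Dict[str, Any]) -> Dict[str, Any]:
    """Merge on-ledger and proxied statements into one view of the subject."""
    merged: Dict[str, Any] = {"id": did_document["id"], "service": []}
    for service in did_document.get("service", []):
        merged["service"].append(service)
        if service.get("type") == "ProxyService":
            extra = fetch_proxy_statements(service["serviceEndpoint"])
            # Only accept statements that are about the same subject.
            if extra.get("id") == did_document["id"]:
                merged["service"].extend(extra.get("service", []))
    return merged
```

The on-ledger document is left untouched; only the merged view handed to the application contains the potentially sensitive URL.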

jandrieu commented 4 years ago

Yes, you can have a service endpoint to get more information, but I believe that is a flawed approach.

We can't keep people from doing it without removing service endpoints, which I do not believe is a viable option at this stage.

However, turning DIDs into a lookup mechanism for details about subjects is a fundamental privacy problem. IMO, we don't need more directories that let anyone glean information, PII or otherwise. This information WILL be aggregated and it WILL be used improperly.

In my professional opinion, the DID subject should ALWAYS be in the loop for any disclosure of information. We have zero mechanisms on the table for purpose binding and consent in the DID Document and while you could place that functionality behind a service, it still lacks the human in the loop unless, again, you create an extraordinary service endpoint far beyond what most have been discussing. Far better for individuals to ALWAYS explicitly share whatever information they choose to (and not share what they don't), using the DID infrastructure to verify the veracity of that information and the proof of control for authenticating as the subject.

IMO, this is the proper best practice for protecting privacy, but I may well be a minority voice in this working group.

brentzundel commented 4 years ago

I agree that a DID document should not be used to look up personal details of DID subjects. This issue is more to explore 2 ideas:

  1. Having DIDs for things that can't interact in any meaningful way using cryptographic keys, whose DID documents may not even contain public keys or controller information.
  2. Having a DID document that contains the DID subject (or is the DID subject).

I'm wondering if it makes sense for the subject of a DID to be a JSON-LD context document; essentially using the did resolution process to retrieve some content that may or may not have any public keys or controller associated with it.

jandrieu commented 4 years ago

For #1, that seems like a weird edge case. I believe all of the methods use keys to control the DID Document, which means that there ARE keys associated with the DID, whether they are in the document or not. I can create a DID for the moon and start issuing VCs either about the moon or "from" the moon. That's an interesting bit of fiction... and I look forward to the artist who takes up that task. What can never be known definitively is whether or not the controller of those keys is, IN FACT, the moon. IMO, any attempt to restrict this will be untenable. The only real option is to understand that the Subject isn't necessarily the Controller, and as such, what the Controller does may or may not legitimately represent the actions of the subject. DIDs can only ever be a single factor in what will need to be a multi-factor tapestry of identity proofing and authentication.

For #2, we have two things. If the DID Document is its own subject, we have a recursive problem similar to httpRange-14. What I think we probably need is the ability to specify in a DID-URL that we are referring to the DID Document instead of the DID Subject. Then you can make VCs with the DID Document URL as a distinct thing from VCs about the DID Subject (which uses the naked DID). However, I don't believe it makes sense to be self-referential in regards to the subject. Is there a use case that you can describe for this?

The second half of #2 is if the DID Document contains the DID Subject. That's also a bit weird. I think the same work-around as above would be appropriate, but I would need to understand the use case. Do you mean that a specific property (or set of properties) in the DID Document is somehow the Subject referred to by the DID? I'm struggling without a concrete example, but I think the same DID-URL approach used for the entire document should work for part of the document, IF the scope of the Subject is describable within the DID-URL syntax. For example, I don't currently see how you could specify multiple distinct properties within a DID Document in a single DID-URL.

Use cases would help.

dlongley commented 4 years ago

@jandrieu,

Yes, you can have a service endpoint to get more information, but I believe that is a flawed approach.

I think we're miscommunicating.

A "home page", "personal website", or "personal VC service" may contain PII in the URLs themselves. They are also "mechanisms for interacting securely with the DID subject". With the approach I outlined, DID methods that use DLT/blockchain tech can have a mechanism for discovering those service URLs (which can then be used to get access to more information with the appropriate consent, etc.) without running afoul of "the right to be forgotten" -- because the URLs themselves are not on the blockchain. Instead, a non-PII ProxyService that is on the blockchain enables you to get them from a non-blockchain service. I think this is an important and useful feature. I am not suggesting that the feature be abused to express your SSN.

jandrieu commented 4 years ago

I understand the work-around. I still think it's the wrong approach. Using a DID registry as a white pages is going to cause harm. Full stop. I'm not talking about regulatory compliance, I'm talking about the structure of available information.

Yes, people will still do it.

Yes, people will get value out of giving out a DID URL that magically moves from service to service as their life evolves.

Just like we get value out of having a public posting board at the local coffeeshop where businesses can post flyers. And who would blame individuals for wanting the same convenience?

Sacrificing your privacy for convenience happens all the time. Heck, my address has been publicly linked to my name and companies my entire adult life. I get it.

Doesn't mean I'm going to recommend it to my loved ones or clients. Nor do I think we should stop seeking a system that enables secure interactions without inviting wholesale loss of privacy.

Creating a large scale identity architecture that invites all participants to publicly link service URLs and related information to a known identifier will create a medium for exploitation. And it will be exploited.

It behooves us to find better options.

talltree commented 4 years ago

This thread has moved quickly. I want to go back to @brentzundel 's original question:

What about using DIDs to identify things that are incapable of interacting using cryptographic keys?

My answer is absolutely yes. This has been assumed about DIDs since the very first version of this spec four years ago. In short, a DID can be used to identify "anything with identity". Period.

I started the following simple diagram during our F2F meeting in Amsterdam last month:

[image: diagram of the relationship between a DID Controller, a DID Subject, and a DID Document]

IMHO, this captures everything that needs to be said about the relationship of a DID Controller, a DID Subject, and a DID Document:

So the fact that a DID identifies a DID Subject that is not capable of interacting using cryptographic keys is immaterial for several reasons:

  1. The DID Subject may be an inanimate object (say, a teapot), but it can still be represented by a digital agent that is capable of interacting and using cryptographic keys on behalf of that inanimate object (for example, an agent for the teapot that is able to provide product information, verifiable credentials about proof-of-origin, warranty service, etc.).
  2. The DID Subject may need to be identified by the DID Controller only for reference purposes and not offer any means of interaction. In this case the only purpose of cryptographic keys in the DID Document would be for the DID Controller to control updating of the DID Document.
  3. The DID Subject may need to be identified by the DID Controller for reference purposes only and have a static description that never needs to change. In this case the DID Document never needs to have any cryptographic keys at all.

All of the above (and others I've probably missed) are possible, which is why everything is optional in a DID Document except the DID (because without a DID, you're not identifying anything at all).
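
The three cases can be sketched as a small classifier over a DID Document. This is only an illustration: the mapping from properties to cases is a simplification of the list above, the labels are invented here, and the "publicKey" and "service" property names are assumed to match the spec draft under discussion.

```python
from typing import Any, Dict


def classify(did_document: Dict[str, Any]) -> str:
    """Roughly map a DID Document onto the three cases listed above."""
    if "id" not in did_document:
        # Everything is optional except the DID itself.
        raise ValueError("a DID document must at least contain the DID ('id')")
    has_keys = bool(did_document.get("publicKey"))
    has_services = bool(did_document.get("service"))
    if has_keys and has_services:
        return "agent-represented"    # case 1: an agent interacts for the subject
    if has_keys:
        return "updatable-reference"  # case 2: keys only control updates
    return "static-reference"         # case 3: no keys, never changes


# Hypothetical documents for a teapot in each of the three situations.
teapot_static = {"id": "did:example:teapot"}
teapot_updatable = {
    "id": "did:example:teapot",
    "publicKey": [{"id": "did:example:teapot#key-1"}],
}
teapot_agent = {
    **teapot_updatable,
    "service": [{"id": "did:example:teapot#agent",
                 "serviceEndpoint": "https://agent.example/teapot"}],
}
```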

ADDENDUM No sooner did I finish this comment than what appeared in my email was a notification from W3C that Web of Things (WoT) Architecture is now W3C Proposed Recommendation. WoT is a perfect candidate for DIDs that identify things that may or may not have the ability to interact using cryptographic keys.

brentzundel commented 4 years ago

From Drummond's comment, number 3 is what I was looking for. I know that this is allowed by the spec (everything is optional except ID), but I wasn't sure how palatable such an idea would be.

For @jandrieu, an example of what a DID Doc might look like if it contained a JSON-LD @context as the DID Subject:

{
    "@context": [
        "https://www.w3.org/ns/did/v1",
        "did:example:yfXPxeoBtpQABpBoyMuYYGx"
    ],
    "id": "did:example:BmfFKwjEEA9W5xmSqwToBkrpYa3rGowtg5C54hepEVdA",
    "subject": {
        "rstype": "ctx",
        "name": "DriverLicense",
        "version": "1.0",
        "hash": {
            "type": "SHA2-256",
            "value": "a005abbfcfaf7b0d703a7fc9fb86c8b71a33a10ef24d292984fc863c225205b9"
        },
        "data": {
            "@context": [
                "did:example:UVj5w8DRzcmPVDpUMr4AZhJ",
                "did:example:JjmmTqGfgvCBnnPJRas6f8xT",
                "did:example:3FtTB4kzSyApkyJ6hEEtxNH4",
                {
                    "dct": "http://purl.org/dc/terms/",
                    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
                    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
                    "Driver": "did:example:2mCyzXhmGANoVg5TnsEyfV8",
                    "DriverLicense": "did:example:36PCT3Vj576gfSXmesDdAasc",
                    "CategoryOfVehicles": "DriverLicense:CategoryOfVehicles"
                }
            ]
        }
    }
}

talltree commented 4 years ago

@brentzundel I'm curious about your "subject": property above. Is that a proposed generic DID document property for describing (vs. identifying) a DID Subject, or a crossover from Verifiable Credentials, or something you developed for this use case, or...?

brentzundel commented 4 years ago

It is something that was developed for this use case. If a context document may be identified by a DID, then it might make sense to allow it to be retrieved via a DID document.

talltree commented 4 years ago

@brentzundel That makes sense to me. In fact it was @dlongley who originally convinced me that not only could a DID identify anything, but a DID document could describe anything. Originally I had thought that the DID document would just include a service endpoint from which you could get a description of a DID Subject. And that certainly can make sense if the description needs to be access-controlled in some way.

But if the description is itself a public information resource (like a schema or a JSON-LD context or a verifiable credential definition), then why not save a round-trip (and the extra protocol switching) and just have the DID Subject be contained within the DID document itself?

What I like about the "subject": property is that we could, in fact, make this a standard generic DID document property for any DID Subject that is an information resource contained in the DID document itself. A "type": subproperty could define the type of this information resource (in your example you use "rstype":, but we could generalize that). And "data": could be the body property that contains the actual information resource, precisely as you show in your example.

Lastly, it would be trivial for a DID resolver to be asked to return any DID Subject that was an information resource contained in a DID document using this approach.
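
A minimal sketch of what such a resolver option might look like, assuming the "subject" / "data" layout from the example earlier in the thread. The registry dict, the return_subject flag, and the function names are hypothetical; nothing here is defined by the spec.

```python
from typing import Any, Dict, Optional

# Hypothetical registry holding one DID Document with an embedded resource.
REGISTRY: Dict[str, Dict[str, Any]] = {
    "did:example:ctx1": {
        "id": "did:example:ctx1",
        "subject": {
            "rstype": "ctx",
            "name": "DriverLicense",
            "data": {"@context": [{"dct": "http://purl.org/dc/terms/"}]},
        },
    },
    "did:example:plain": {"id": "did:example:plain"},
}


def embedded_subject(did_document: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the embedded information resource, or None if there is none."""
    subject = did_document.get("subject")
    if not isinstance(subject, dict) or "data" not in subject:
        return None
    # Accept either the generalized "type" or the example's "rstype".
    return {"type": subject.get("type", subject.get("rstype")),
            "data": subject["data"]}


def resolve(did: str, return_subject: bool = False) -> Any:
    """Return the DID Document, or the embedded subject when asked for it."""
    did_document = REGISTRY[did]
    if not return_subject:
        return did_document
    resource = embedded_subject(did_document)
    if resource is None:
        raise LookupError(f"{did} has no embedded information resource")
    return resource["data"]
```

Keeping the subject behind an explicit flag means a plain resolution still returns the whole DID Document.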

ken-ebert commented 4 years ago

This solves the problem that I have been working on at Sovrin for over a year: how to identify immutable resources such as contexts, schemas, credential definitions, etc. in a way that is compatible with the DID resolution process. Matrix parameters were first suggested as a way to solve this problem; matrix parameters come with their own baggage. Using a DID to identify the digital resources and returning the resource within the DID document elegantly solves my use case.

talltree commented 4 years ago

@ken-ebert That's exactly what I was thinking when I first saw Brent's example. As you recall, Ken, you and I spent an entire Rebooting the Web of Trust conference in Barcelona a year ago working on how to best solve the problem of using DIDs to address immutable content. In retrospect I'm not sure why we didn't see this solution—it strikes me now as one of those "blinding flashes of the obvious".

jandrieu commented 4 years ago

@ken-ebert I don't think it solves that problem, because the DID Document is not itself immutable. You could refer to a context with a DID and then update the DID Document to contain different data.
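
To make the concern concrete, here is a small sketch. It assumes the "hash" value in the example above is a SHA-256 digest over some canonical serialization of "data"; the thread does not say how that hash is computed, so the sorted-key JSON canonicalization used here is purely an assumption, as are the sample values.

```python
import hashlib
import json
from typing import Any


def digest(data: Any) -> str:
    """SHA-256 over an assumed canonical form (sorted keys, no whitespace)."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_pin(data: Any, pinned_digest: str) -> bool:
    """True if the content now returned for the DID is what was pinned."""
    return digest(data) == pinned_digest


# Content at the time someone first referred to the context by its DID.
original = {"@context": [{"Driver": "did:example:2mCyzXhmGANoVg5TnsEyfV8"}]}
pinned = digest(original)

# The controller later updates the DID Document to carry different data.
updated = {"@context": [{"Driver": "did:example:somethingElse"}]}
```

A party that recorded the digest alongside the DID can detect the change; a party that recorded only the DID cannot.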

jandrieu commented 4 years ago

@ken-ebert I think the better way to handle this is just a service endpoint with a data: URL, rather than data at the top level. So users can say did:example:123;avatar and get back the png given as an avatar because the service endpoint is defined as in this gist: https://gist.github.com/jandrieu/34c5b05cef5b01f80b9088b339de9798

Of course, you would use the actual context instead of a picture of me.

This is a more generalizable approach to get what you want: storing context in the did document, without inventing anything new. data: URLs have been around a long time.

A convention about what to name that service endpoint would be useful, but sticking a data property at the top level seems like putting an http resource in your DNS record. It's conflating two different parts of the layered architecture and may not work well with the other layers.
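
A sketch of the idea, not taken from the linked gist: a context document is packed into a base64 data: URL and carried as a service endpoint. The service id, the "ContextResource" type name, and the media type are assumptions made for illustration.

```python
import base64
from typing import Tuple


def to_data_url(payload: bytes, media_type: str) -> str:
    """Encode bytes as a base64 data: URL."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def from_data_url(url: str) -> Tuple[str, bytes]:
    """Decode a base64 data: URL into (media type, bytes)."""
    header, _, encoded = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("only base64 data: URLs are handled in this sketch")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(encoded)


context_bytes = b'{"@context": {"dct": "http://purl.org/dc/terms/"}}'

did_document = {
    "id": "did:example:123",
    "service": [
        {
            "id": "did:example:123#context",  # hypothetical naming convention
            "type": "ContextResource",        # hypothetical service type
            "serviceEndpoint": to_data_url(context_bytes, "application/ld+json"),
        }
    ],
}
```

Decoding the endpoint with from_data_url returns the original bytes, so no network round-trip is needed once the DID Document has been resolved.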

talltree commented 4 years ago

@jandrieu Let me respectfully disagree. It's not that there's anything wrong with a data: URL. I'm a fan of them when appropriate. But in that case the bare DID would not actually identify the target information resource. Only the full DID URL with the addition of the service endpoint parameter would actually identify the target information resource. And there's still the open question of what you would call that particular service endpoint parameter.

What I understand Brent and Ken are after is that the DID identifies the target information resource itself and nothing else. So the target information resource IS the DID subject.

Putting the target information resource in the DID document as the body of the subject property seems like a very simple and elegant solution for this particular use case. No parameter and no other indirection is needed: it states clearly and unambiguously in the DID document that the DID subject is an information resource that is the body of the subject property.

This approach also takes advantage of the fact that no other separate data repository is needed for the information resource. In other words, since a DID method must already be responsible for long-term persistence of DID documents, then why not also use that same persistent storage mechanism for an information resource identified by a DID?

IMHO this gets to one of the key ways that DID architecture is different than DNS architecture. DID architecture is not only decentralized instead of federated, but DID documents can be self-describing. What Brent and Ken are suggesting is one way to take advantage of that self-description: take an immutable information resource and wrap it in a DID document so it becomes permanently addressable via that DID.

kdenhartog commented 4 years ago

then why not also use that same persistent storage mechanism for an information resource identified by a DID

  1. Managing scale: Handling data directly on ledger increases the storage requirements for a node to join the network. This was one of the main reasons bitcoin core opted to not increase blocksizes. Additionally, general-purpose storage changes the usage patterns of the ledger thereby increasing the storage requirements. This is one of the reasons the Ethereum ledger requires more data to run a full (non-pruned) node than a bitcoin core node.

  2. Managing data types is made harder: Some data may be acceptable to store on-ledger (e.g. the DID of the driver in @brentzundel's example), whereas other data (e.g. putting my name in the driver) could introduce privacy and legality concerns. For the specific example I mentioned, this would require the ledger nodes to parse data being written to the ledger to make sure privacy concerns aren't an issue. Alternatively, the TAA approach can be taken where the author asserts a legally binding contract when they author a transaction, but this solution would end up relying on some form of moderation (e.g. is this content actually a problem or not) which is inherently hard to scale globally.

jandrieu commented 4 years ago

@talltree I think the bigger issue is that the proposed addition changes the semantics of DIDs as URIs and URLs. You are essentially adding a default resource to return when a DID URL is dereferenced, which is something OTHER than the DID Document, depending on the presence of a special, optional property in the DID Document. Now you have two very different responses to dereferencing a DID, depending on the presence of a "data" element, but only if a service endpoint is also not defined, in which case, what happens remains TBD.

IMO,

  1. resolving a DID should return the full DID Document
  2. dereferencing a DID without a service part should return the full DID Document
  3. dereferencing a DID with a service part should return the resource specified by the service endpoint in the DID Document

This proposal adds another level of complexity, despite the opportunity to achieve the same phenomenon with a service endpoint. In fact, if you don't like data: URLs, add a new type of service endpoint, perhaps "resource" which will return the specified property directly.

Messing with what it means to dereference a DID URL seems like the wrong way to get what you want. It adds unnecessary complexity without adding features that can't already be achieved.

My strong advocacy here is that we should stop trying to do more with DID Documents by jamming new content into them. Rather, we should be minimizing what is in the DID Document to make interoperability more achievable. More is less and less is more.

iherman commented 4 years ago

You are essentially adding a default resource to return when a DID URL is dereferenced, which is something OTHER than the DID Document, depending on the presence of a special, optional property in the DID Document. Now you have two very different responses to dereferencing a DID, depending on the presence of a "data" element, but only if a service endpoint is also not defined, in which case, what happens remains TBD.

I guess this is related to the separate issue which has been bugging me since the face-to-face, see https://github.com/w3c/did-core/issues/183, in particular https://github.com/w3c/did-core/issues/183#issuecomment-587018974 ...

peacekeeper commented 4 years ago

@jandrieu

  3. dereferencing a DID with a service part should return the resource specified by the service endpoint in the DID Document

This is almost correct. In fact, dereferencing a DID URL with a service part (service matrix parameter) does not return the resource at the service endpoint. Instead it returns the service endpoint URL, and then a second dereferencing process is executed on that URL.

This is comparable to how HTTP redirects work. The browser tries to dereference the first URL, but it gets another URL instead of a representation of a resource. It then dereferences that second URL.

See https://w3c-ccg.github.io/did-resolution/#dereferencing-algorithm for more details.
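
The two-step behaviour can be sketched as follows. This is not the algorithm from the linked document, only an illustration of the redirect analogy: the ";service=<name>" matrix-parameter form, the in-memory registry, and the fake web lookup are all assumptions.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical registry and web content used in place of real lookups.
REGISTRY: Dict[str, Dict[str, Any]] = {
    "did:example:123": {
        "id": "did:example:123",
        "service": [
            {
                "id": "did:example:123#avatar",
                "type": "Avatar",
                "serviceEndpoint": "https://example.com/avatar.png",
            }
        ],
    }
}
FAKE_WEB: Dict[str, bytes] = {"https://example.com/avatar.png": b"<png bytes>"}


def split_did_url(did_url: str) -> Tuple[str, Optional[str]]:
    """Split 'did:example:123;service=avatar' into the DID and service name."""
    did, _, params = did_url.partition(";")
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name == "service" and value:
            return did, value
    return did, None


def dereference_did_url(did_url: str) -> Tuple[str, Any]:
    """Step 1: yields the DID Document, or a service endpoint URL (not its content)."""
    did, service_name = split_did_url(did_url)
    did_document = REGISTRY[did]
    if service_name is None:
        return "did-document", did_document
    for service in did_document.get("service", []):
        if service["id"] == f"{did}#{service_name}":
            return "url", service["serviceEndpoint"]
    raise LookupError(f"no service named {service_name!r} in {did}")


def dereference(did_url: str) -> Any:
    """Step 2: if step 1 produced a URL, dereference that URL separately."""
    kind, value = dereference_did_url(did_url)
    if kind == "url":
        return FAKE_WEB[value]
    return value
```

As with an HTTP redirect, the caller of dereference_did_url sees a second URL rather than a representation; fetching it is a separate process.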

Oskar-van-Deventer commented 4 years ago

What is the relationship between the discussion in the present issue #199 and #190?

Oskar-van-Deventer commented 4 years ago

I took the liberty of making a pull request to clarify what a DID identifies, see #213.

"A DID identifies an entity that is capable of creating digital signatures, e.g. a natural person, a legal entity or an autonomous machine."

Would this help this discussion about "Clarification on what DIDs might identify"?

brentzundel commented 4 years ago

I've made comments in the PR and do not feel that it addresses this issue. There's nothing in the spec that prevents someone from having a DID Doc like the one here, so I'm not looking for any spec changes, but I would like more feedback on the idea. Perhaps @dlongley, @msporny, @dmitrizagidulin, @SmithSamuelM or @selfissued could chime in?

pknowl commented 4 years ago

@talltree Thanks for bringing this discussion to my attention. I'm sure @ken-ebert and @brentzundel would have brought it up in the next HL Semantics WG call for in-depth discussion. I'll put it down as an agenda item for the next Semantics call.

talltree commented 4 years ago

Thanks @pknowl. I agree that persistent and verifiable identification of semantic resources is hugely important for the entire world of trusted interactions—which is why I thought this thread had great potential. I don't know if I can attend the next HL Semantics WG call, but please send me a reminder if you can and I will do my best. I strongly urge @brentzundel, @ken-ebert, and @dhh1128 to come too.

kdenhartog commented 4 years ago

Was there any consensus on this during that call? I figured out when I mulled over this a bit more that my biggest issue is with the fact that in @brentzundel's example there's no controller representing the did subject and therefore the did document can't be updated once published. While it's currently defined that "A DID can have more than one controller", we don't make any mention about what happens if the DID has no controller (e.g. it's just a resource).

This is where the language around controller becomes more important to me. Saying that a DID can be controlled by no one makes little sense to me. Isn't it just a static resource then at which point I'd ask why I need a publicKey (not required) or service endpoint (required but can be empty I believe)?

Oskar-van-Deventer commented 4 years ago

Please see PR #213. It uses the term "allows a controller ...". As @jandrieu already highlighted, a DID can be abandoned by its controller immediately after the creation of the DID.

brentzundel commented 4 years ago

@kdenhartog Also see @talltree 's comment here. Specifically:

The DID Subject may need to be identified by the DID Controller for reference purposes only and have a static description that never needs to change. In this case the DID Document never needs to have any cryptographic keys at all.

kdenhartog commented 4 years ago

After reading through the comments in here and in #233 I can now understand where this is coming from and can see use cases that would be suited for this. I think you'll still run into implementation concerns with most ledgers by doing this, but that's up to the implementers to figure out, not the spec.

jandrieu commented 4 years ago

Apropos of other conversations about did control, what DIDs do that is unique in the world is provide identifiers for which you can prove control without reliance on a trusted third party.

Content-based DIDs don't prove control. They only prove knowledge.

The defining characteristic of a DID is that someone can prove control, typically by demonstrating knowledge of a true secret. To the extent that that secret is no longer secret, proving control no longer differentiates the controller from the general public. The magic of public/private key cryptography is that you can publicly reveal information (the public key) which enables verification of certain mathematical operations performed by the private key. As long as that private key is legitimately secret, then proof of control has meaning.

This is the hook.

If all you need is to verify the content of the Subject, then you don't have a DID; you just have a content-specific identifier. Useful and decentralized, but it doesn't do any of the things DIDs do wrt demonstrating proof of control.

kdenhartog commented 4 years ago

@jandrieu

I disagree with your assertion that proof of knowledge is not proof of control. Based on my understanding of @ChristopherA #122 (comment) the "ownership" key in BTCR is doing just that. It's asserting proof of knowledge of the public key at a later date by committing to that value ahead of time. Or as he states, is able to reveal the public key to assert proof of ownership. In other words, the process by which someone reveals a private key to assert the ability to derive a public key that proves knowledge of the public key which is verifiable by the hash in the DID Document (as it is today) is the same as having the secret knowledge in the first place in my opinion. It's just one less level of indirection. Mind you, I'm likely butchering my understanding of BTCR as I haven't worked with it enough, so please correct my misunderstanding of how it works if so. Preferably with a link to where this is documented so I can read more if possible.

A second point, that I'd like to make is that proof of knowledge is not excluded by the definition of "verification method".

A set of parameters required to independently verify a proof, such as an identifier for a public/private key pair that would be used in the proof.

Sharing the committed knowledge of the hash is "a set of parameters" and they are "required to independently verify a proof" if the hash algorithm is cryptographically secure.

I'd also like to point out that the crux of your argument that differentiates proof of knowledge commitment verification to PKI verification is not testable.

To the extent that that secret is no longer secret, proving control no longer differentiates the controller from the general public.

There's no way to know in a PKI based DID if a private key hasn't been broadcast to the world outside the DID Document, so I don't think this is a valid nuance to the mental model we should be building off of.

jandrieu commented 4 years ago

My point about secret v non-secret is that in order for the hash to be verified, the content MUST be exposed. Proof of control of private keys is precisely useful because you can prove you know the secret without revealing it.

This is the fulcrum on which all public/private key cryptography depends. It's a profoundly different architecture than the content-revealing verification that a hash matches a given piece of content.
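
The hash side of that asymmetry can be sketched with a plain commit-and-reveal, using only SHA-256. The function names and sample content are made up for illustration, and the signature side is deliberately not shown; the point is only that verifying a hash commitment requires handing over the committed content.

```python
import hashlib


def commit(content: bytes) -> str:
    """Publish this digest as the commitment (e.g. as part of an identifier)."""
    return hashlib.sha256(content).hexdigest()


def verify_reveal(commitment: str, revealed: bytes) -> bool:
    """Verification needs the full content, not just a statement about it."""
    return commit(revealed) == commitment


content = b"the committed content"
commitment = commit(content)

# The original holder proves knowledge by revealing the content itself.
holder_ok = verify_reveal(commitment, content)

# Anyone who observed that reveal can now present the same "proof".
observed_copy = bytes(content)
observer_ok = verify_reveal(commitment, observed_copy)
```

After the first reveal, holder_ok and observer_ok are indistinguishable to a verifier, which is the sense in which the proof stops differentiating the original holder from the general public.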

You may be right about BTCR's use of hashes. Rotation requires proving control of the private key and I would have expected demonstrating proof of control of a BTCR would reveal the public key, but not the private key. I'll have to defer to people who understand it better.

My point is that we can prove essentially nothing about who the subject is with DIDs. All we can prove is that someone has control over a private key, which we take as either proof of control over the DID or authorization to authenticate on behalf of a DID.

From that control, we can bootstrap a system of credentials issued "to" that DID because either the controller or a delegate of the DID proved their control or delegate status, and claimed that DID is an appropriate subject for a requested credential. Then upon presentation of that credential, they can again prove control (or authenticate) and establish a link between themselves as holder and their authority to act on behalf of the DID. Thus at both creation and presentation, proof of control over the DID is what establishes appropriate use.

If no one can prove control, nor authenticate on behalf of a DID, then treating the DID as any particular subject is a fool's game. Anyone can say anything about any DID. Only a controller or delegate can demonstrate they are suitable parties for interactions on that subject's behalf.

For content-based DIDs, you can't demonstrate control. The very hook that allowed a provable link between a given identifier and someone using that identifier is gone.

Yes, you could put a hash in a registry and let a trusted third party decide who gets to say things about that hash-as-identifier, but you can only ever prove knowledge of a revealed piece of content. The rest is up to that third party.

Approaches that simply link an identifier to content have no control framework, which to my mind suggests they cannot be used for the proof-of-control that is core to the DID architecture. This is the essence of the conversation about adjusting our notion of control. But it isn't just about having a deterministic link between an identifier and a piece of content, it is about using secrets to demonstrate control without relying on a third party. Content identifiers don't demonstrate control and therefore can't be used in the way DIDs can be: as a means test for proving the propriety of certain actions by an otherwise unknown actor.

IMO, proving control is fundamental to how DIDs do what DIDs do.

Sure, let's have content-specific identifiers. That's useful. I just don't see how they enable secure interaction with anything, how they prove control. I don't see how they are DIDs in the sense that this group has been developing.

pknowl commented 4 years ago

@jandrieu - I think everyone understands what you are saying, what you've said and what your opinion at this juncture is. I think a definite line can be drawn there. At this stage, the argument is more philosophical than technological.

I strongly believe that a decentralized data network will be much more stable if we only have one identifier throughout the space - DIDs for everything. That would allow us to put all other identifier standards out to pasture with a graceful "thank you for getting us to this point."

I'm attaching a mini-deck so that everyone can visualise exactly what that means. DIDs for everything in the big blue circle. See first slide. Identifiers.pdf

Can we close this particular thread and all hop on #233 instead? The two threads are now talking about exactly the same philosophical issue.

SmithSamuelM commented 4 years ago

A digital signature is also a collision-resistant hash. Using a signature as the basis for a self-certifying identifier simultaneously provides a content-addressable identifier and an identifier under the provable control of an asymmetric key pair. The public key may be disclosed at the time of proof of control. Likewise, a content-addressable self-certifying identifier can be derived by chaining successive one-way functions. The first generates the public key; the second creates a content hash over content that includes both the document and the public key. This content-addressable identifier has provable control by the associated key pair. Given that we can have secure (provable control) content-addressable identifiers, why would we foster insecure content-addressable identifiers? Haven’t we learned our lessons over the last three decades from the weaknesses of insecure identifiers, e.g. IP addresses?
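A rough Python sketch of the chaining described above, under stated assumptions: the first one-way function is a toy modular exponentiation standing in for real key generation (not secure), the `did:example:` prefix and the `document`/`publicKey` field names are made up for illustration, and no existing DID method is implied to work this way.

```python
import hashlib
import json

# Toy parameters (assumption, NOT secure) standing in for real key generation.
P = 2**255 - 19
G = 2


def public_key_from(secret: int) -> int:
    """First one-way function: derive the public key from the secret."""
    return pow(G, secret, P)


def derive_identifier(document: dict, public_key: int) -> str:
    """Second one-way function: hash content that includes document and key."""
    payload = json.dumps(
        {"document": document, "publicKey": hex(public_key)},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return "did:example:" + hashlib.sha256(payload).hexdigest()


def identifier_matches(identifier: str, document: dict, public_key: int) -> bool:
    """Anyone holding the document and public key can recompute the binding."""
    return derive_identifier(document, public_key) == identifier
```

Because the public key is part of the hashed content, the identifier is bound both to the document and to a key pair whose holder can later prove control; changing either input yields a different identifier.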

If we are not securely identifying we are nothing.

brentzundel commented 4 years ago

We've gotten some clarity in the spec around what DIDs may identify. Since the introduction now states that

A DID identifies any subject (e.g., a person, organization, thing, data model, abstract entity, etc.) that the controller of the DID decides that it identifies.

and

A DID document might contain the DID subject itself (e.g. a data model).

I think the only thing remaining before this can be closed is a PR that introduces a content or a subject property.

dlongley commented 4 years ago

I would think content or subject would be more like "additional/alternative representation of the subject" (some kind of name that expresses this vs. saying "the actual subject is over here, this doesn't count") since the DID Document is already expressing information about the DID subject. I'd be -1 to subject for that reason ... not sure how I feel about content (if it adequately captures the concept).

talltree commented 4 years ago

@dlongley What you say would be true except for the case where the subject of the DID is in fact a digital object (e.g., a schema, or a schema overlay object) that is contained entirely within the DID document. What Brent and I are going to propose in the PR is a standard DID doc property for this purpose. I prefer the name of this property to be subject because then it is self-evident that the digital object contained in the subject property is in fact the subject of the DID.

dlongley commented 4 years ago

@talltree,

I think it's a bit odd to say that the subject of the DID subject is this other thing. Wouldn't it be better to say that there is a digital representation X of the DID subject? Thinking in triple/sentence form:

<did:example:1234> <subject> <some_digital_value>

Or:

<did:example:1234> <representation> <some_digital_value>

The former here means that the subject, identified by did:example:1234, has a subject of some_digital_value. I think that's what is odd.

The latter here means that the subject, identified by did:example:1234 has a representation of some_digital_value. That sounds right to me -- doesn't it match what you're going for more cleanly? I'm happy for representation to be bikeshedded further in some way, but subject doesn't seem to fit the bill. Thoughts?
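The same reading can be sketched in Python. This is purely illustrative: `representation` is the property name floated in this thread, not a registered DID Core property, and the document contents and helper names are hypothetical.

```python
# Hypothetical DID document fragment; "representation" is only a proposal
# discussed in this thread, and the embedded value is made up.
did_document = {
    "id": "did:example:1234",
    "representation": {"schemaName": "ExampleSchema", "version": "1.0"},
}


def as_triple(doc: dict, prop: str) -> tuple:
    """Read one top-level property as a (subject, predicate, object) statement."""
    return (doc["id"], prop, doc[prop])


def read_aloud(triple: tuple) -> str:
    """Spell the triple out as a sentence to check that the predicate reads well."""
    subject, predicate, _ = triple
    return f"the subject identified by {subject} has a {predicate} of some digital value"
```

Reading the result of `read_aloud(as_triple(did_document, "representation"))` is a quick way to test whether a candidate property name produces a sensible statement about the DID subject.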

talltree commented 4 years ago

@dlongley I had to read your comment twice, like one of those optical illusions you have to stare at to see it another way.

But then I "got" it—that while the subject of the DID is a digital object, the actual markup inside the property in the DID document is a specific representation of that digital object. (Your example of how it would look as an actual RDF triple really helped.)

So I want to cook on it a little bit, but my initial thought is that representation is in fact a good name for the property.

@brentzundel and @ken-ebert (and anyone else interested in this use case), please do weigh in with your thoughts.

peacekeeper commented 4 years ago

One downside I see with representation is the following: If the resource identified by the DID is a digital object (schema, etc.), then shouldn't dereferencing that DID return only a representation of the resource, as opposed to returning a DID document that contains a representation of the resource?

Or put differently, isn't the DID document itself already a representation of the resource identified by the DID?

I'm not sure if the proposed property names capture this well enough.

I would like to offer the property name value for consideration (with the meaning of rdf:value). This way we could express cleanly that all properties in the DID document describe the subject, but here is one property that contains the "main value" of the subject.

peacekeeper commented 4 years ago

This thread also reminds me of this older comment which has left an impression on me: https://github.com/w3c/did-core/issues/65#issuecomment-558838589. Quoting from it:

We made a mistake by calling something a "DID Document". There is no such thing. There is a DID, that identifies a resource, and when you dereference it, you get a representation of that resource. It's information at that point in time... and that's all it is... and calling it a DID Document is confusing people.

brentzundel commented 4 years ago

Still waiting on a PR. @talltree is working on it.

brentzundel commented 4 years ago

This issue is related to the representation property (#348), the proposed appendices (#373), and the alsoKnownAs and sameAs properties (#355). We plan to discuss this today on a special topic call (relational link types).

brentzundel commented 4 years ago

We had a special topic call to discuss this. The steps to move forward are to introduce a type property that will allow a user to specify the type of the document e.g., so that a DID document can be both a DID Document and a schema. We are also waiting on alsoKnownAs.

OR13 commented 3 years ago

related https://github.com/w3c/did-core/issues/421

msporny commented 3 years ago

The steps to move forward are to introduce a type property that will allow a user to specify the type of the document e.g., so that a DID document can be both a DID Document and a schema.

Looks like this isn't going to make it into the specification because of privacy concerns. Language will be added to the specification noting that clarifying what a DID might identify can lead to privacy dangers. It will still be possible to do so, but the specification will take the stance that, for individuals, it's dangerous. We have time scheduled during the upcoming F2F to discuss this topic.

also waiting on alsoKnownAs

This looks like it's going to go through as soon as the W3C Social Web CG approves the term.

brentzundel commented 3 years ago

Since the type property has not gained consensus, I think the best option now would be to continue discussion on a representation property as a possible resolution to this issue.

rxgrant commented 3 years ago

The type idea was very dangerous to section 10.4, Herd Privacy.

https://www.w3.org/TR/did-core/#herd-privacy

I think these representational issues don't belong in DID-core at all. This spec is not trying to build a giant database of people and things. It's not meant to help a computer sort you out for easier targeting. There is no value (to users of DIDs themselves) in trying to make a database aggregation feature part of the core spec.

Use of DIDs will sometimes never reveal a DID Document publicly, but that is harder to achieve than public listings. In anticipation of the very common failure modes of data leaks, DID Documents should stick to the basics: identifier, key material, endpoint.
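As a rough illustration of "stick to the basics", here is a Python sketch of a DID document holding only an identifier, key material, and an endpoint, plus a check for anything beyond that. The `BASICS` allow-list and the `extra_properties` helper are assumptions made for this example, not something DID Core defines, and the key value and endpoint URL are placeholders.

```python
# Assumed allow-list for this example only; DID Core does not define such a list.
BASICS = {"@context", "id", "verificationMethod", "authentication", "service"}

minimal_doc = {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": "did:example:123456789abcdefghi",
    "verificationMethod": [
        {
            "id": "did:example:123456789abcdefghi#key-1",
            "type": "Ed25519VerificationKey2020",
            "controller": "did:example:123456789abcdefghi",
            "publicKeyMultibase": "zPLACEHOLDER",  # placeholder, not a real key
        }
    ],
    "service": [
        {
            "id": "did:example:123456789abcdefghi#endpoint-1",
            "type": "ExampleService",  # hypothetical service type
            "serviceEndpoint": "https://example.com/endpoint",
        }
    ],
}


def extra_properties(doc: dict) -> set:
    """Return top-level properties that go beyond the assumed basics."""
    return set(doc) - BASICS
```

For example, `extra_properties({**minimal_doc, "type": "Person"})` returns `{"type"}`, flagging the kind of descriptive addition this comment argues against.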

Take everything else to a DID method. You even have an open world model to play with, so your application can do whatever it wants within existing DID methods. Stop making this about sorting me.