w3c-ccg / did-spec

Please see README.md for latest version being developed by W3C DID WG.
https://w3c.github.io/did-core/

make publicKey section more explicit for understanding of DIDs reliance on public keys #166

Closed kdenhartog closed 4 years ago

kdenhartog commented 5 years ago

Section 4.3 states "it MUST be assumed the key has been revoked or is invalid", but this isn't outlined very well in the rules. I believe it leads someone reading only the rules, which state "1. A DID Document MAY include a publicKey property", to assume that a DID can be created without a key.

My suggestion would be to change rule 1 of section 4.3 to be:

"A DID Document MUST include a publicKey property or it MUST be assumed all past associated public keys have been revoked or are invalid"

Changing this to a MUST from a MAY makes it clear that creating a DID without a public key value is invalid and focuses the spec on entities that can cryptographically express control of a DID, while still allowing a DID to exist at a future state with no public keys because they've all been revoked.

Would removing the ability to create a DID without a public key remove any use cases which we expressly want the DID spec to address that other specs wouldn't be better suited for?

dlongley commented 5 years ago

Would removing the ability to create a DID without a public key remove any use cases which we expressly want the DID spec to address that other specs wouldn't be better suited for?

Yes. We have several use cases for DID Documents where control is delegated. We want to handle this use case via a controller property in the DID Document (#153). Such a DID Document would not contain its own keys.

Furthermore, we usually want to restrict our use of public keys (verification methods) to specific proof purposes rather than putting them all into a single publicKey bucket -- which the spec currently allows for. Requiring the use of publicKey would create a more error prone system under this design.

Lastly, there may be other mechanisms by which control over a DID could be proved without the use of a public key. We don't currently have any use cases here, but I know others have expressed interest in this area. Some have suggested that perhaps the definition of "public key" could be stretched in some way to account for other verification methods that aren't really traditional public keys -- but I think we'll be better future proofed if we don't lock anything to the absence of that property alone.

kdenhartog commented 5 years ago

I'm going to try to group your points together to explain the potential dangers I see. Hopefully I'm not conflating or making bad assumptions based on your points by doing this.

I disagree with this classification that it creates a more error prone system. I'd claim that not providing any "cryptographic proving material" (I agree with your third point) to verify control of the subject DID (the DID delegating control to the controlling DID) actually creates a more error prone system. If I have to resolve the controlling DID to verify control of a subject DID, it adds resolution to the attack surface. Rather, by explicitly listing the entire superset of "cryptographic proving material" that can possibly operate on the subject DID, we prevent hazards that are created by attacking the resolution of the controlling DID.

For example, if the controlling DID (DID A) has been granted controller status of the subject DID (DID B), and DID B grants controller status to DID A, this would cause the verifier (likely the ledger node) to enter into a recursive infinite loop and essentially DoS the node. This is just an example of a concern developers now have to account for when not listing any keys in the publicKey field because they're relying on resolution. I'd guess a clever hacker could find another more serious attack on the resolution layer allowing the attacker to take control of the subject DID.

In the case of a controller field I would suggest doing this to allow for your use cases while still explicitly including keys in the publicKey field:

Subject DID Doc:

{
  "@context": "https://w3id.org/did/v1",
  "id": "did:example:567890",
  "controller": {
    "controller": "did:example:123456#keys-1",
    "key_id": "did:example:567890#keys-1"
  },
  "publicKey": [{
    "id": "did:example:567890#keys-1",
    "type": "RsaVerificationKey2018",
    "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n"
  }],
  .... (rest of DID Doc unchanged from examples in spec)
}

Controller DID Doc:

{
  "@context": "https://w3id.org/did/v1",
  "id": "did:example:123456",
  "publicKey": [{
    "id": "did:example:123456#keys-1",
    "type": "RsaVerificationKey2018",
    "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n"
  }],
  .... (rest of DID Doc unchanged from examples in spec)
}

In this case, the verifier doesn't need to resolve the controller DID Doc AND the subject DID Doc explicitly lists the superset of possible keys directly in the publicKey field.

Worth noting these examples assume controller will become a top level field. Thoughts?
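
As a rough illustration of the proposal above, the lookup a verifier would do under this layout can be sketched in Python. This is only a sketch under assumptions: the nested `controller` object with a `key_id` member is the shape proposed in this comment, not anything in the spec, and the helper names `find_local_key` and `controller_key` are made up for the example. No signature checking is shown.

```python
# Subject DID Doc as proposed above (the elided parts of the doc are left out).
subject_doc = {
    "@context": "https://w3id.org/did/v1",
    "id": "did:example:567890",
    "controller": {
        "controller": "did:example:123456#keys-1",
        "key_id": "did:example:567890#keys-1",
    },
    "publicKey": [
        {
            "id": "did:example:567890#keys-1",
            "type": "RsaVerificationKey2018",
            "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n",
        }
    ],
}


def find_local_key(doc, key_id):
    """Return the publicKey entry with this id, or None if it is not listed."""
    for key in doc.get("publicKey", []):
        if key.get("id") == key_id:
            return key
    return None


def controller_key(doc):
    """Look up the key named by controller.key_id in the subject doc itself.

    Nothing is resolved here: only the subject DID Doc is consulted.
    """
    controller = doc.get("controller")
    if not isinstance(controller, dict):
        return None
    return find_local_key(doc, controller.get("key_id"))


key = controller_key(subject_doc)
print(key["id"] if key else "no locally listed controller key")
```

The point of the layout is visible in `controller_key`: it never touches the controller DID Doc, so a verifier following this pattern has no resolution step to attack.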

dlongley commented 5 years ago

I disagree with this classification that it creates a more error prone system.

I'm referring to allowing a public key to be used for any particular purpose rather than binding it to a specific one.

If I have to resolve the controlling DID to verify control of a subject DID it adds resolution to the attack surface. Rather by explicitly listing the entire superset of "cryptographic proving material" that can possibly operate on the subject DID, we are preventing hazards that are created by attacking the resolution of the controlling DID.

For example, If the controlling DID (DID A) which has been granted controller status of the subject DID (DID B), and DID B grants controller status to DID A, this would cause the verifier (likely the ledger node) to enter into a recursive infinite loop and essentially DoS the node. This is just an example of a concern developers now have to account for when not listing any keys in the publicKey field because they're relying on resolution. I'd guess a clever hacker could find another more serious attack on the resolution layer allowing the attacker to take control of the subject DID.

This scenario does not arise from using a DID as the value for the controller property but rather from a particular verification protocol. It doesn't need to work that way. For example, the controlling DID can be used as the object capability that is referenced in a capability invocation proof -- where the ocap enables the modification of the controlled DID Document. In short, when checking the capability invocation proof, the expected invocation target is taken to be the value of the controller field in the controlled DID Document, which is checked against the controller DID. There are no cycles introduced via this top-level controller property, rather a simple equality check is performed.

That being said -- if another ledger or verification protocol wanted to allow for deeper links in some way, cycle detection is generally straightforward. Perhaps there are viable use cases for that sort of approach, but I don't think we're using it. Regardless, having a top-level controller property is added complexity (I don't think this point was ever in contention), but there are good use cases for it. I do believe it should be optional -- not every DID method would need to support it -- it is just a common way to allow for DID Documents to be controlled by other DIDs ... and it would be compatible with other potentially emerging standards such as ocap-ld.
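
For a protocol that does follow controller links more than one level deep, the cycle detection mentioned above could look something like the following Python sketch. It assumes a top-level `controller` property holding a single DID string, and a `resolve` callable supplied by the caller; both are assumptions for the example, as are the `controller_chain` name and the `max_depth` limit.

```python
def controller_chain(did, resolve, max_depth=16):
    """Follow controller links starting at did.

    Returns the list of DIDs visited. Raises ValueError if a DID repeats
    (a cycle) or if the chain is longer than max_depth.
    """
    seen = []
    current = did
    while current is not None:
        if current in seen:
            raise ValueError("controller cycle: " + " -> ".join(seen + [current]))
        if len(seen) >= max_depth:
            raise ValueError("controller chain too deep")
        seen.append(current)
        doc = resolve(current)
        current = doc.get("controller")  # None ends the chain
    return seen


# Hypothetical in-memory "ledger": A and B name each other, C is controlled by D.
docs = {
    "did:example:a": {"id": "did:example:a", "controller": "did:example:b"},
    "did:example:b": {"id": "did:example:b", "controller": "did:example:a"},
    "did:example:c": {"id": "did:example:c", "controller": "did:example:d"},
    "did:example:d": {"id": "did:example:d"},
}

print(controller_chain("did:example:c", docs.__getitem__))
try:
    controller_chain("did:example:a", docs.__getitem__)
except ValueError as err:
    print(err)
```

The A/B case from earlier in the thread terminates with an error after two lookups instead of looping, which is the property a ledger node would need.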

To be clear, my comment about making a system more error prone had to do with requiring publicKey to be present when, for example, just embedding a key in the authentication field would suffice. If I only want my key to be used for authentication, this is what I would do -- and our implementation generally discourages the use of publicKey as it does not establish an authorized proof purpose relationship.
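
To make the distinction concrete, here is a small Python sketch of a DID Document that has no publicKey property at all and embeds its only key under authentication. The exact shape of an authentication entry (an embedded key object, or a string reference to an entry in publicKey) is an assumption for the example, and `keys_for_purpose` is a made-up helper name.

```python
def keys_for_purpose(doc, purpose):
    """Return the keys authorized for one proof purpose, e.g. "authentication".

    Entries may be embedded key objects or string references to publicKey ids.
    """
    by_id = {key["id"]: key for key in doc.get("publicKey", [])}
    keys = []
    for entry in doc.get(purpose, []):
        if isinstance(entry, str):
            # A reference only counts if the key is actually listed.
            if entry in by_id:
                keys.append(by_id[entry])
        else:
            keys.append(entry)
    return keys


# No publicKey property: the key is bound to authentication only.
doc = {
    "@context": "https://w3id.org/did/v1",
    "id": "did:example:123456",
    "authentication": [
        {
            "id": "did:example:123456#keys-1",
            "type": "RsaVerificationKey2018",
            "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n",
        }
    ],
}

print(len(keys_for_purpose(doc, "authentication")))
print(len(keys_for_purpose(doc, "capabilityInvocation")))
```

A verifier written this way only accepts a key for the purpose it was listed under, which is the binding that a bare publicKey entry does not give you.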

Worth noting these examples assume controller will become a top level field. Thoughts?

It is another interesting pattern that may be worth looking into -- but as noted above I think the problem you highlighted is more dependent on verification protocol and could potentially be avoided in a variety of ways.

peacekeeper commented 5 years ago

-1 to "A DID Document MUST include a publicKey property", for two reasons:

  1. While AFAIK we currently don't have DID methods that allow creation of a DID without keys, I don't believe we should make this general assumption. Perhaps in the future we will have DIDs that can be created using verification methods other than public keys.
  2. Even if a key is used to create a DID using a certain method, we should not assume that this key would actually be published in the DID Document. The key(s) used to create/update/revoke a DID may not be the same as the one(s) listed in the DID Document.

mwherman2000 commented 5 years ago

@peacekeeper Please ...let's stop overloading the term "DID" in our conversations (e.g. above points). We're converging on a very specific definition of a DID in the recent ABNF discussions:

did = "did:" method ":" method-specific-idstring
method = 1*methodchar
methodchar = %x61-7A / DIGIT
method-specific-idstring = idstring *( ":" idstring )

As such, a DID cannot have keys based on the above because a DID is an identifier. It's confuding to imply that a DID "can contain keys".

What is the name/term we want to associate with the "thing"/entity that is associated with a DID? [I've been calling this a DID Entity.]

TODO: The above comments need to be reworded/corrected.

peacekeeper commented 5 years ago

@mwherman2000 I agree it's incorrect to say "a DID can contain keys". But I don't recall actually saying that?

BTW why do you use the word "confuding" instead of "confusing"?

msporny commented 5 years ago

What is the name/term we want to associate with the "thing"/entity that is associated with a DID?

When you dereference a DID you get a DID Document.

The thing that a DID identifies is the "DID Subject".

It is not an Entity, as a DID can refer to a concept (cities, clouds, planets) as well. It's important that this is not confused w/ instances of those concepts, such as Philadelphia, the cloud over my house right now, or Jupiter. In all cases, though, DIDs refer to subjects.

mwherman2000 commented 5 years ago

@mwherman2000 I agree it's incorrect to say "a DID can contain keys". But I don't recall actually saying that?

@peacekeepr: It was implied in phrases in https://github.com/w3c-ccg/did-spec/issues/166#issuecomment-463109027:

  1. we currently don't have DID methods that allow creation of a DID without keys

and here:

  2. Even if a key is used to create a DID using a certain method...

mwherman2000 commented 5 years ago

What is the name/term we want to associate with the "thing"/entity that is associated with a DID?

@msporny Rephrasing: what is the thing/term/concept that we want to use when we want to say or ask:

  • "__ contains keys (public keys)", and
  • "does __ have keys (public keys)?", respectively

...because we can't use the term DID.

talltree commented 5 years ago

On Sun, Feb 17, 2019 at 10:42 AM Michael Herman (Toronto) < notifications@github.com> wrote:

What is the name/term we want to associate with the "thing"/entity that is associated with a DID?

@msporny Rephrasing: what is the thing/term/concept that we want to use when we want to say or ask:

  • "__ contains keys (public keys)", and
  • "does __ have keys (public keys)?", respectively

...because we can't use the term DID.

The term is "DID document". A DID resolves to a DID document and it is the DID document that contains public keys, service endpoints, authentication methods, and any other metadata describing the DID subject.

peacekeeper commented 5 years ago

@mwherman2000 I agree it's incorrect to say "a DID can contain keys". But I don't recall actually saying that?

@peacekeepr: It was implied in phrases in #166 (comment):

  1. we currently don't have DID methods that allow creation of a DID without keys

The "without keys" part of my sentence was meant to refer to the word "creation", not the word "DID". Sorry if I wasn't clear enough.

and here:

  1. Even if a key is used to create a DID using a certain method...

Sorry but I don't see how this sentence implies a statement that a DID itself "can contain keys".

As a side note, remember that there are proposed DID methods where the DID itself (the identifier) DOES contain a key, but that's a separate issue. E.g. see here and here.

jandrieu commented 5 years ago

@msporny wrote

When you dereference a DID you get a DID Document.

This is not entirely correct. When you resolve a DID you get a DID Document.

When you dereference a DID, you may get the DID Document, a portion of a DID document (by current proposals), or the resource at the end of a service endpoint (if the DID contains a service component). The first step in dereferencing is to resolve the DID, which then gives you the meta-data you need to actually dereference to get the intended resource. I do think that the default dereference (assuming no additional relative component like a fragment or service) should return the full DID document. That feels like it has the smell of consensus (most of us seem to talk as if this is the case). What we don't have clarity on is when you might return a subset of the document (imo, when there is a path part or a fragment and NO query/service component) and when dereferencing the DID means following the service endpoint (in my proposal that's what you do when the DID has a ?query component).

There were some big "ah-hah!" moments for @burnburn and I when reading RFC3986 in the context of the DID explainer. Resolution is not dereferencing and yet we regularly conflate those two in this community.
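
A toy Python sketch of the two-step relationship described above: dereferencing starts by resolving, then uses the DID Document to pick out the intended resource. Everything here is an assumption for illustration: the in-memory `REGISTRY`, the function names, and the choice to handle only the bare DID and the fragment case. Path, query and service handling are left out because, as noted above, there is no consensus on them yet.

```python
# Hypothetical in-memory stand-in for a ledger or network.
REGISTRY = {
    "did:example:123456": {
        "@context": "https://w3id.org/did/v1",
        "id": "did:example:123456",
        "publicKey": [
            {"id": "did:example:123456#keys-1", "type": "RsaVerificationKey2018"}
        ],
    }
}


def resolve(did):
    """Resolution: a DID in, its DID Document out."""
    return REGISTRY[did]


def dereference(did_reference):
    """Dereferencing: resolve first, then select the referenced resource."""
    did, _, fragment = did_reference.partition("#")
    doc = resolve(did)
    if not fragment:
        # Default dereference: the full DID Document.
        return doc
    for section in ("publicKey", "service"):
        for item in doc.get(section, []):
            if item.get("id") == did_reference:
                return item
    raise KeyError(did_reference)


print(dereference("did:example:123456")["id"])
print(dereference("did:example:123456#keys-1")["type"])
```

Note that `resolve` never sees a fragment: it only accepts the DID itself, while `dereference` accepts the longer reference and calls `resolve` as its first step.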

talltree commented 5 years ago

+1, Joe. I think in the revised DID spec we should explain the difference between "resolving" and "dereferencing" right up front.

I too have been spending a bunch of time with RFC 3986 as we work on revising the ABNF for DIDs and DID references. The ABNF below (slightly revised from what I posted last night) is now more closely aligned with the ABNF in Appendix A of RFC 3986 (https://www.ietf.org/rfc/rfc3986.txt). In fact, all the ABNF rules not defined below are defined in RFC 3986, which is why this can be so short.

Based on this alignment, the terminology lines up nicely: you can only resolve a DID ("did" rule below) to return a DID document, and you can only dereference a DID reference ("did-reference" rule below) to return whatever resource is referenced.

did = "did:" method ":" method-specific-idstring

method = 1*methodchar

methodchar = %x61-7A / DIGIT

method-specific-idstring = idstring *( ":" idstring )

idstring = 1*idchar

idchar = ALPHA / DIGIT / "." / "-"

absolute-did = did [ did-relative-ref ]

did-relative-ref = did-fragment-ref / did-service-ref

did-fragment-ref = "#" fragment

did-service-ref = *( ";" service-id ) [ path-abempty ] [ "?" query ] [ "#" fragment ]

service-id = 1*( ALPHA / DIGIT / "." / "-" / "_" / pct-encoded )

did-reference = absolute-did / did-relative-ref
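
For anyone wanting to sanity-check strings against the "did" rule above, a regular expression transcription in Python might look like this. It covers only the `did` rule (not `did-reference`), and the names `DID_RE` and `is_did` are made up for the example.

```python
import re

# method                   = 1*methodchar, methodchar = %x61-7A / DIGIT
# method-specific-idstring = idstring *( ":" idstring )
# idstring                 = 1*idchar, idchar = ALPHA / DIGIT / "." / "-"
DID_RE = re.compile(r"did:[a-z0-9]+:[A-Za-z0-9.\-]+(?::[A-Za-z0-9.\-]+)*")


def is_did(value):
    """True if value matches the "did" rule in full (no fragment, path or query)."""
    return DID_RE.fullmatch(value) is not None


for candidate in (
    "did:example:123456",
    "did:example:abc:def",
    "DID:example:123456",
    "did:example:123456#keys-1",
):
    print(candidate, is_did(candidate))
```

Under this rule the scheme and method name must be lower case, and anything carrying a fragment is a DID reference rather than a DID, so the last two candidates do not match.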

dhh1128 commented 5 years ago

The thing that a DID identifies is the "DID Subject".

It is not an Entity, as a DID can refer to a concept (cities, clouds, planets) as well. It's important that this is not confused w/ instances of those concepts, such as Philadelphia, the cloud over my house right now, or Jupiter. In all cases, though, DIDs refer to subjects.

@msporny How does this jibe with section 6.2 of the current spec, which says:

"The DID subject is the identifier that the DID Document is about, i.e., it is the DID described by DID Document. The rules for a DID subject are:

  1. A DID Document MUST have exactly one DID subject.
  2. The key for this property MUST be id.
  3. The value of this key MUST be a valid DID.
  4. When this DID Document is registered with the target distributed ledger or network, the registered DID MUST match this DID subject value."

msporny commented 5 years ago

@dhh1128 taking it line by line as some of the statements are problematic:

A DID Document MUST have exactly one DID subject.

This is correct. Doing anything else would be problematic for a variety of reasons that have been rehashed over the past 18+ years (lack of syntax support for talking about multiple subjects simultaneously across almost all Linked Data syntaxes -- that particular feature has been considered multiple times and struck down multiple times).

The key for this property MUST be id.

Yep, also good. Linked Data syntaxes don't require this, but the fact that we have people that can't agree on JSON vs. JSON-LD forces us to pick the name of the property.

The value of this key MUST be a valid DID.

Yep, good. This is good spec hygiene -- the value must match the ABNF, basically.

When this DID Document is registered with the target distributed ledger or network, the registered DID MUST match this DID subject value.

The only thing that's a bit confusing here is "DID Subject value". Not quite clear what that means in this context. We could probably strike the entire sentence as I don't think it's necessary.

Did that provide clarity @dhh1128, or were you concerned about something else?

dhh1128 commented 5 years ago

@msporny I'm concerned about the following line from the spec: "The DID subject is the identifier that the DID Document is about, i.e., it is the DID described by DID Document" -- which seems directly contradictory to your statement in this comment stream that "The thing that a DID identifies is the DID Subject".

Assume the following inter-related items: Alice, the DID "did:example:abcxyz" (controlled by Alice), and a DID doc for "did:example:abcxyz".

The first sentence says that the DID subject = "did:example:abcxyz". The second sentence says that the DID subject = Alice. (You gave examples of a DID referencing both instances and concepts.)

So which is it?

talltree commented 5 years ago

Daniel, I think I see what you are getting at, which is this exact set of words: "The DID subject is the identifier that the DID Document is about, i.e., it is the DID described by DID Document."

I think that sentence should be revised to, "The DID Subject is the unique resource identified by the DID and described by the DID Document."

So, given Alice, the DID "did:example:abcxyz" identifying Alice, and a DID doc for "did:example:abcxyz" that describes Alice, then Alice is the DID Subject.

msporny commented 5 years ago

To language lawyer a bit more, let's strike "unique" as using that word may take us into httpRange-14 territory (a debate that has been going on for 18+ years at W3C, with no end in sight). :P

The DID Subject is the resource identified by the DID and described by the DID Document.

Does the above wording work for you @dhh1128 and @talltree? If so, we can make the change in the spec.

dhh1128 commented 5 years ago

Either proposed sentence (from @talltree or @msporny) is IMO an improvement in clarity.

However, I am still not sure that Alice is what's being described by a DID document. A DID Doc is metadata about an identifier; it doesn't really describe Alice herself. It's not a "DID Subject Doc." N'est-ce pas?

I also don't like the web-ism of describing a person, organization, or thing as a "resource". URIs do fetch "resources" in traditional web lingo--especially in RESTful http. And de-referencing a DID reference is likely to access a resource. But that resource isn't the DID Subject; it's a resource that the DID Subject controls. The word "resource" describes a passive thing that is accessed or acted upon, not a thing that can take independent action in contexts beyond DID-land. Not all DID Subjects are guaranteed to be active, but the most prominent examples of DID Subjects certainly are not just things-to-be-acted-upon.

What about this verbiage:

The DID Subject is the thing identified by the DID. In DIDs used for self-sovereign identity, this might mean the DID Subject is a person, organization, or IoT thing.

mwherman2000 commented 5 years ago

IoT Thing

@dhh1128 I don't believe Thing needs to be qualified as an IoT Thing. I don't believe the term "IoT" needs to appear anywhere in the did-spec.

For example, in the Sovrin Glossary, "People, Organizations, and Things" is used consistently ...without ever being qualified with "IoT". For example,

Identity The capability to distinguish a specific Entity from all others in a specific context. Identity may apply to any type of Entity, including Individuals, Organizations, and Things. Note that Legal Identity is only one form of Identity. Many technologies can provide Identity capabilities; the Sovrin Governance Framework defines one such system, the Sovrin Network.

peacekeeper commented 5 years ago

However, I am still not sure that Alice is what's being described by a DID document. A DID Doc is metadata about an identifier; it doesn't really describe Alice herself. It's not a "DID Subject Doc." N'est-ce pas?

@dhh1128 , I don't think it's possible in the RDF data model to describe metadata about an identifier. You always describe the resource/thing that the identifier identifies (after all, that's the whole purpose of identifiers). So, the DID Document does in fact describe Alice.

As for the relationship between real-world things and their digital representations, and regarding the question of what it means to "dereference" an identifier, I think it's helpful to read about:

talltree commented 5 years ago

Great conversation. To quickly respond to Manu's original question directed at me, he asked if I was satisfied with:

The DID Subject is the resource identified by the DID and described by the DID Document.

Answer = yes. I'd propose that we go ahead and make that editorial change in the spec.

=D

jandrieu commented 5 years ago

I think this is only half correct, and the error, I believe, boils down to the question of whether DIDs are URIs or URLs.

The DID is an identifier which refers to a subject. The DID Document contains the metadata for accessing services related to the subject.

The question is whether or not the Subject is the Resource.

Per RFC3986:

URI "resolution" is the process of determining an access mechanism and the appropriate parameters necessary to dereference a URI; this resolution may require several iterations. To use that access mechanism to perform an action on the URI's resource is to "dereference" the URI.

Resolving a DID to the appropriate DID Document is how you acquire the meta-data of "appropriate parameters necessary to dereference" the DID.

If DIDs are URLs, then the resource is the thing dereferenced, either the DID Document itself or the resource found at a service endpoint (when a service name is contained in the DID-URI).

From RFC3986:

A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").

HOWEVER, if DIDs are URIs, then the resource is the thing identified, from RFC3986:

Resource This specification does not limit the scope of what might be a resource; rather, the term "resource" is used in a general sense for whatever might be identified by a URI. Familiar examples include an electronic document, an image, a source of information with a consistent purpose (e.g., "today's weather report for Los Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a collection of other resources. A resource is not necessarily accessible via the Internet; e.g., human beings, corporations, and bound books in a library can also be resources. Likewise, abstract concepts can be resources, such as the operators and operands of a mathematical equation, the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).

I agree that what a DID identifies can reasonably be described as the Subject, but the DID Document doesn't describe the Subject, it describes various means of interacting with the Subject, either through use of keys or service endpoints.

Which means, at best, we have three different interpretations of what the resource is.

When we treat the DID as a URI, the Resource is the subject. If we treat it as a URL, the Resource is either the DID Document found when resolving the DID (returned by default), or the resource found at a dereferenced service endpoint.

Or more succinctly, the Subject is the resource referred to by a DID. The DID Document is the resource resolved by a DID and dereferenced by default. A service is the Resource dereferenced by a DID with a service component.

In any case, the DID Document does not describe the subject. Framing it that way will lead to compromised privacy when people start putting PII in publicly retrievable DID Documents.

Also... when you dereference a DID, you never get the Subject, you get some resource associated with the Subject, further confusing the question of what the "Resource" is.

talltree commented 5 years ago

Joe, I think the DID Resolution folks would disagree with the specific words you are using about resolution and dereferencing. But with regard to the phrase "the DID document describes the DID Subject", I meant "description" very generally, in the RDF (Resource Description Framework) sense. I agree it should be defined more narrowly, i.e., in the sense that a DNS record obtained by resolving a DNS name describes the resource identified by that DNS name.

I'd go deeper but I have to leave for a conference.

On Thu, Mar 14, 2019 at 7:46 AM Joe Andrieu notifications@github.com wrote:

I think this is only half correct. and the error, I believe boils down to the question of whether or not DIDs are URIs or URLs.

The DID is an identifier which refers to a subject. The DID Document contains the metadata for accessing services related to the subject.

The question is whether or not the Subject is the Resource.

Per RFC3986:

URI "resolution" is the process of determining an access mechanism and the appropriate parameters necessary to dereference a URI; this resolution may require several iterations. To use that access mechanism to perform an action on the URI's resource is to "dereference" the URI.

Resolving a DID to the appropriate DID Document is how you acquire the meta-data of "appropriate parameters necessary to dereference" the DID.

If DIDs are URLs, then the resource is the thing dereferenced, either the DID Document itself or the resource found at a service endpoint (when a service name is contained in the DID-URI).

From RFC3986:

A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").

HOWEVER, if DIDs are URIs, then the resource is the thing identified, from RFC3986:

Resource This specification does not limit the scope of what might be a resource; rather, the term "resource" is used in a general sense for whatever might be identified by a URI. Familiar examples include an electronic document, an image, a source of information with a consistent purpose (e.g., "today's weather report for Los Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a collection of other resources. A resource is not necessarily accessible via the Internet; e.g., human beings, corporations, and bound books in a library can also be resources. Likewise, abstract concepts can be resources, such as the operators and operands of a mathematical equation, the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).

I agree that what a DID identifies can reasonably be described as the Subject, but the DID Document doesn't describe the Subject, it describes various means of interacting with the Subject, either through use of keys or service endpoints.

Which means, at best, we have three different interpretations of what the resource is.

When we treat the DID as a URI, the Resource is the subject. If we treat it as a URL, the Resource is either the DID Document found when resolving the DID (returned by default), or the resource found at a dereferenced service endpoint.

Or more succinctly, the Subject is the resource referred to by a DID. The DID Document is the resource resolved by a DID and dereferenced by default. A service is the Resource dereferenced by a DID with a service component.
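To make the three readings concrete, here is a minimal sketch, not an implementation of any DID method. The in-memory `REGISTRY`, the `DidDocument` class, the `did:example:123` identifier, and the `;service=` separator are all assumptions made up for illustration. The Subject never appears in the code, because it is only referred to, never retrieved.

```python
from dataclasses import dataclass, field


@dataclass
class DidDocument:
    """Hypothetical stand-in for a DID Document: interaction metadata only."""
    id: str
    services: dict = field(default_factory=dict)  # service name -> endpoint URL


# Hypothetical registry standing in for whatever a DID method resolves against.
REGISTRY = {
    "did:example:123": DidDocument(
        id="did:example:123",
        services={"hub": "https://hub.example.com/123"},
    )
}


def resolve(did: str) -> DidDocument:
    """Resolution: obtain the metadata needed to dereference the DID."""
    return REGISTRY[did]


def dereference(did_url: str):
    """Dereference: return the DID Document by default, or the named service endpoint.

    The ';service=' separator is an assumed syntax for this sketch only.
    """
    did, _, service_name = did_url.partition(";service=")
    doc = resolve(did)
    if not service_name:
        return doc
    return doc.services[service_name]
```

Calling `dereference("did:example:123")` hands back the DID Document, while `dereference("did:example:123;service=hub")` hands back the service endpoint. Neither call returns the Subject.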

In any case, the DID Document does not describe the subject. Framing it that way will lead to compromised privacy when people start putting PII in publicly retrievable DID Documents.


jandrieu commented 5 years ago

@talltree Please unpack it here. This spec is going to a vote asap. My use is based directly on RFC3986. If I'm misunderstanding it, I doubt I'll be the only one.

My point remains: the DID Document is not describing the subject, but rather describing ways to interact with services related to that subject, including the inferred services of cryptographic verification and authentication using the keys.

There are implementations putting information about subjects in the DID Document, often with the argument that this information is already public, so there's no privacy impact. Regardless of the argument for conflating the meta-data for dereferencing with descriptive data about the subject, we should use language that makes it clear the DID Document is about how you interact with the subject, and not about the Subject. In particular, it should NOT be advocated as a place for generic RDF statements about the subject.

ChristopherA commented 5 years ago

On Thu, Mar 14, 2019 at 7:46 AM Joe Andrieu notifications@github.com wrote:

Or more succinctly, the Subject is the resource referred to by a DID. The DID Document is the resource resolved by a DID and dereferenced by default. A service is the Resource dereferenced by a DID with a service component.

In any case, the DID Document does not describe the subject. Framing it that way will lead to compromised privacy when people start putting PII in publicly retrievable DID Documents.

On Thu, Mar 14, 2019 at 9:06 AM Joe Andrieu notifications@github.com wrote:

My point remains: the DID Document is not describing the subject, but rather describing ways to interact with services related to that subject, including the inferred services of cryptographic verification and authentication using the keys.

There are implementations putting information about subjects in the DID Document, often with the argument that this information is already public, so there's no privacy impact. Regardless of the argument for conflating the meta-data for dereferencing with descriptive data about the subject, we should use language that makes it clear the DID Document is about how you interact with the subject, and not about the Subject. In particular, it should NOT be advocated as a place for generic RDF statements about the subject.

+1 to this framing.

I think it is important here that we be MUCH MORE CLEAR that you should not put information about the entity into the DID document that facilitates access to it. The DID document should only be used to get proof materials and services that allow you to get secure access to information about that entity.

This will also help address the constant complaint that we get from the larger community that we are putting personal information on the blockchain. If we are more clear that both the DID, and the DID Document it resolves to, should never contain personal information, that will address some of the problem.

-- Christopher Allen

talltree commented 5 years ago

Ok, since we have strong agreement about the purpose of a DID document, how about this language:

The DID Subject is the resource identified by the DID. The DID can be resolved to a DID Document. The purpose of a DID document is to provide metadata describing how to securely and privately interact with the DID Subject. In the cases where the DID document is in publicly available verifiable data registry, it is very important that the DID document not contain any data that would compromise the privacy of the DID Subject.

Thoughts?


mwherman2000 commented 5 years ago

In the cases where the DID document is in publicly available verifiable data registry, it is very important that the DID document not contain any data that would compromise the privacy of the DID Subject.

can be simplified by removing the reference to a VDR...

In the cases where the DID document is publicly available, it is very important that the DID document not contain any data that would compromise the privacy of the DID Subject.

talltree commented 5 years ago

+1


jandrieu commented 5 years ago

+1 This last rev looks good.

ChristopherA commented 5 years ago

Drummond,

Could you share again the latest version here? I'm not sure what I'm +1'ing anymore.

-- Christopher Allen

jandrieu commented 5 years ago

Here's what I +1'd:

The DID Subject is the resource identified by the DID. The DID can be resolved to a DID Document. The purpose of a DID document is to provide metadata describing how to securely and privately interact with the DID Subject. In the cases where the DID document is publicly available, it is important that the DID document not contain any data that would compromise the privacy of the DID Subject.
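As a rough illustration of that last sentence, the sketch below flags top-level properties that describe the Subject rather than how to interact with it. The allow-list `INTERACTION_PROPERTIES` is an assumption built from property names mentioned in this thread, not a list taken from the spec, and the example document and its `name` and `birthDate` fields are hypothetical.

```python
# Assumed allow-list for illustration only; the spec does not define this set.
INTERACTION_PROPERTIES = {
    "@context",
    "id",
    "publicKey",
    "authentication",
    "service",
    "controller",
}


def descriptive_properties(did_document: dict) -> list:
    """Return top-level property names outside the assumed allow-list, sorted."""
    return sorted(key for key in did_document if key not in INTERACTION_PROPERTIES)


# Hypothetical document that mixes interaction metadata with data about the Subject.
doc = {
    "id": "did:example:123",
    "publicKey": [],
    "service": [],
    "name": "Alice Example",
    "birthDate": "1990-01-01",
}

flagged = descriptive_properties(doc)
```

Here `flagged` holds the two descriptive fields, which is the kind of content the agreed language says a publicly available DID document should not carry. A real check would need to look inside nested values as well.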

talltree commented 5 years ago

+1 to what Joe +1'd


mwherman2000 commented 5 years ago

+1 ^ 3 ...d*mn ;-) ...that's still a +1 :-)

ChristopherA commented 5 years ago

+1

My only bikeshed is to point out this is in fact the DID's rdf:subject ("rdf:subject used to state the subject of a statement"), not an individual, as some people may not understand that Subject is a very specific keyword in the RDF/JSON-LD space. Are we presuming that anyone who gets to this point in the document understands the special meaning of Subject?

jandrieu commented 5 years ago

I brought this up long ago in the VC conversation. We do not mean subject in the RDF sense, that is, as the first element in an RDF triple. We mean it as the referent to which we refer. I think more people will be confused by referring to people as resources, since you can't directly interact with a human resource through a DID. You can refer to a human, using it as a URI, but used as a URL, you are interacting with a resolver or a service endpoint, not the human directly.

kdenhartog commented 5 years ago

Given the comments that have streamed from this discussion, I better understand why requiring keys is too restrictive: in many use cases key management is not necessary. I'm satisfied with my understanding of how we're moving forward and would like to close out this issue. What are the next steps we need to take, given what appears to be consensus on this subject?

jandrieu commented 4 years ago

Closing as this has been moved to the DIDWG repo.