w3c / vc-data-model

W3C Verifiable Credentials v2.0 Specification
https://w3c.github.io/vc-data-model/

Profile ID is ambiguous and potentially improper #133

Closed jandrieu closed 6 years ago

jandrieu commented 6 years ago

In @dlongley 's comment on #55, he asserts that the profile ID is the ID of the Holder: https://github.com/w3c/vc-data-model/issues/55#issuecomment-368687003

The "profile" is just some statements about the holder. We expect profiles to be fairly ephemeral documents that just make use of the "default graph". But I suppose if someone wanted to reference a particular profile they'd use the "graph" position in an RDF quad, i.e. you'd say "this is what was said about the holder in a particular document/graph named profile id".

The problem with this approach is that for essentially all other web data models I'm familiar with, the ID field within a data entity defines the identifier for that data entity, NOT an identifier for some other entity (such as a holder).

Not only does this create problems by preventing anyone from explicitly referring to a profile (it has no unique identifier), it ALSO invites incorrect interpretations of what the ID field means. I'd bet good money that if you put the following JSON in front of 100 web developers:

EXAMPLE 17: A simple verifiable profile in JSON-LD Format
{
  "@context": [
    "http://schema.org",
    "https://w3id.org/credentials/v1"
  ],
  "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
  "credential": [{
    "id": "http://dmv.example.gov/credentials/3732",
    "type": ["Credential", "ProofOfAgeCredential"],
    "issuer": "https://dmv.example.gov/issuers/14",
    "issued": "2010-01-01T19:73:24Z",
    "claim": {
      "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
      "ageOver": 21
    },
    "proof": {
      "type": "RsaSignature2018",
      "created": "2017-06-17T10:03:48Z",
      "creator": "https://dmv.example.gov/issuers/14/keys/234",
      "nonce": "d61c4599-0cc2-4479-9efc-c63add3a43b2",
      "signatureValue": "pYw8XNi1bgVg/sCneO4BavEll0/I1zJugez8RwDg/+
      ibcC1wpsMCRVpjOboDoe4SxxKjkCOvKiCHGDvc4krqi6Z1n0UfqzxGfmatCuF
      zvueMWmFPRdW+gGsutPTLhwYmfIFpbBu95t501+rSLHIEuujM/+PXr+W3JT24
      9Cky6Ed="
    }
  }],
  "proof": [{
    "type": "RsaSignature2018",
    "created": "2017-06-18T21:19:10Z",
    "creator": "did:example:ebfeb1f712ebc6f1c276e12ec21/keys/2",
    "nonce": "c0ae1c8e-c7e7-469f-b252-86e6a0e7387e",
    "signatureValue": "BavEll0/I1zpYw8XNi1bgVg/sCneO4Jugez8RwDg/+
    MCRVpjOboDoe4SxxKjkCOvKiCHGDvc4krqi6Z1n0UfqzxGfmatCuFibcC1wps
    PRdW+gGsutPTLzvueMWmFhwYmfIFpbBu95t501+rSLHIEuujM/+PXr9Cky6Ed
    +W3JT24="
  }]
}

Then ask those developers what the identifier of the profile is, and an overwhelming majority will say "did:example:ebfeb1f712ebc6f1c276e12ec21". Unfortunately, THAT is the id of the holder (by current usage). The conclusion that "did:example:ebfeb1f712ebc6f1c276e12ec21" is the identifier of the profile is WRONG. The correct answer is that the profile has no identifier; rather, its "id" property is the identifier of the holder.

This is broken.

My recommendation is to adjust the data model to the following by adding a holder predicate:

EXAMPLE 17: A simple verifiable profile in JSON-LD Format
{
  "@context": [
    "http://schema.org",
    "https://w3id.org/credentials/v1"
  ],
  "id": "did:example:abc31f31aecc6611278e1a44d7",
  "holder": "did:example:ebfeb1f712ebc6f1c276e12ec21",
  "credential": [{
    "id": "http://dmv.example.gov/credentials/3732",
    "type": ["Credential", "ProofOfAgeCredential"],
    "issuer": "https://dmv.example.gov/issuers/14",
    "issued": "2010-01-01T19:73:24Z",
    "claim": {
      "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
      "ageOver": 21
    },
    "proof": {
      "type": "RsaSignature2018",
      "created": "2017-06-17T10:03:48Z",
      "creator": "https://dmv.example.gov/issuers/14/keys/234",
      "nonce": "d61c4599-0cc2-4479-9efc-c63add3a43b2",
      "signatureValue": "pYw8XNi1bgVg/sCneO4BavEll0/I1zJugez8RwDg/+
      ibcC1wpsMCRVpjOboDoe4SxxKjkCOvKiCHGDvc4krqi6Z1n0UfqzxGfmatCuF
      zvueMWmFPRdW+gGsutPTLhwYmfIFpbBu95t501+rSLHIEuujM/+PXr+W3JT24
      9Cky6Ed="
    }
  }],
  "proof": [{
    "type": "RsaSignature2018",
    "created": "2017-06-18T21:19:10Z",
    "creator": "did:example:ebfeb1f712ebc6f1c276e12ec21/keys/2",
    "nonce": "c0ae1c8e-c7e7-469f-b252-86e6a0e7387e",
    "signatureValue": "BavEll0/I1zpYw8XNi1bgVg/sCneO4Jugez8RwDg/+
    MCRVpjOboDoe4SxxKjkCOvKiCHGDvc4krqi6Z1n0UfqzxGfmatCuFibcC1wps
    PRdW+gGsutPTLzvueMWmFhwYmfIFpbBu95t501+rSLHIEuujM/+PXr9Cky6Ed
    +W3JT24="
  }]
}

The semantic statement this adds can be expressed in Turtle as did:example:abc31f31aecc6611278e1a44d7 didns:holder did:example:ebfeb1f712ebc6f1c276e12ec21 . (using didns: as the namespace for DID predicates)

Or "did:example:abc31f31aecc6611278e1a44d7" has a "holder" referred to as "did:example:ebfeb1f712ebc6f1c276e12ec21".

This modest change gives the profile a clear and unambiguous identifier, "did:example:abc31f31aecc6611278e1a44d7", and clearly and unambiguously states that the holder of this profile (who, by definition, is expected to be the presenter) is "did:example:ebfeb1f712ebc6f1c276e12ec21". On inspection, you can also see, without ambiguity, that the holder is the subject of the claim.
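
As a rough illustration of that inspection, here is a minimal Python sketch. It assumes a trimmed-down copy of the proposed profile (proofs and most credential fields omitted), and the helper name `describe_profile` is hypothetical, not part of any specification or library.

```python
# Trimmed-down copy of the proposed profile; proofs are omitted for brevity.
proposed_profile = {
    "id": "did:example:abc31f31aecc6611278e1a44d7",
    "holder": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "credential": [{
        "id": "http://dmv.example.gov/credentials/3732",
        "claim": {
            "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
            "ageOver": 21,
        },
    }],
}


def describe_profile(profile):
    """Hypothetical helper: report the profile id, the holder id, and whether
    the holder is the subject of every claim carried by the profile."""
    holder = profile["holder"]
    subjects = [cred["claim"]["id"] for cred in profile["credential"]]
    return {
        "profile_id": profile["id"],
        "holder_id": holder,
        "holder_is_subject": all(subject == holder for subject in subjects),
    }


summary = describe_profile(proposed_profile)
print(summary)
```

With a separate "holder" property, the profile's own identifier and the holder's identifier are read from two different keys, so a consumer never has to guess which one "id" means.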

This also supports the lost&found profile use case where the digital profile is "found" in transit or storage by someone who is NOT the holder. Now, the profile can be unambiguously discussed, transmitted, even submitted without implying that the current presenter (who found the original profile) claims to be the "holder". Instead, it can be attached to an email or whatever and used/referred to/reasoned over/submitted to a court of law, etc., WITHOUT the implied assertion that the current presenter is the holder.
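
The lost&found case can be sketched in Python as follows. This is only an illustration under assumptions: the finder's DID, the `found_report` helper, and the report field names ("about", "reportedBy", "reporterIsHolder") are all hypothetical and do not come from the data model.

```python
# Trimmed-down copy of the proposed profile, with an explicit holder.
found_profile = {
    "id": "did:example:abc31f31aecc6611278e1a44d7",
    "holder": "did:example:ebfeb1f712ebc6f1c276e12ec21",
}


def found_report(profile, finder_id):
    """Hypothetical helper: refer to a profile by its own id without the
    reporter asserting that they are its holder."""
    return {
        "about": profile["id"],            # the profile itself
        "reportedBy": finder_id,           # whoever found or forwarded it
        "reporterIsHolder": profile["holder"] == finder_id,
    }


# "did:example:finder123" is a made-up identifier for the person who found it.
report = found_report(found_profile, "did:example:finder123")
print(report)
```

Because the report points at the profile's identifier rather than the holder's, passing the profile along does not read as a claim to be the holder.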

coderintherye commented 6 years ago

Just want to add agreement here for being explicit about holder vs. id. I agree that we would likely mistake "id" for the unique identifier of the claim rather than of the holder. I might suggest "holder_id" as being more explicit, but either way having another predicate for this makes sense.

David-Chadwick commented 6 years ago

Joe, I am tending to agree with you, but if you ask any developer what "id": "did:example:ebfeb1f712ebc6f1c276e12ec21" is, they will say it is the claim id, not the subject id. So this issue is related to issue #120. As I am producing a PR for the latter I can include this in the PR as well.

dlongley commented 6 years ago

@David-Chadwick,

As I am producing a PR for the latter I can include this in the PR as well

I recommend we keep the issues separate. We don't have consensus here yet whereas in the other issue, I suspect we do. It would be better not to hold up the other PR.

dlongley commented 6 years ago

@jandrieu,

The problem with this approach is that for essentially all other web data models I'm familiar with, the ID field within a data entity defines the identifier for that data entity, NOT an identifier for some other entity (such as a holder).

The current design follows an open world assumption. Presently, a Profile is just some set of statements about a holder -- which is why it makes perfect sense for the ID to refer to the holder. One of those statements is essentially "This holder holds credentials X, Y, and Z". This is what makes a Profile the way to bundle credentials -- they are "bundled" according to the fact that they are all held by the holder.
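
To make the "statements about a holder" reading concrete, here is a small Python sketch. It is a deliberate simplification and not real JSON-LD processing: the helper `holder_statements` is hypothetical, and it only emits one (subject, predicate, object) tuple per credential.

```python
# Trimmed-down copy of the current profile: "id" is the holder's identifier.
current_profile = {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "credential": [{"id": "http://dmv.example.gov/credentials/3732"}],
}


def holder_statements(profile):
    """Hypothetical helper: list the statements the current profile makes,
    each with the holder (the profile's "id") as the subject."""
    subject = profile["id"]
    return [(subject, "credential", cred["id"]) for cred in profile["credential"]]


statements = holder_statements(current_profile)
for statement in statements:
    print(statement)
```

Every statement has the holder as its subject, and nothing in the output names the profile document itself, which is the gap the issue describes.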

We can explore deviating from this design in a number of ways -- but I do think it is currently consistent with the rest of the data model. We could, for example, turn a Profile from just several open world statements about a particular holder into its own "data entity". It was not this previously -- which is where the confusion seems to stem from. In order to do this and to be consistent, we should take a similar approach to what we did with credentials:

{
   "id": "<id of the Profile>",
   "holder": {
     "id": "<id of the holder>",
     "verifiableCredential": [/*... verifiable credentials held by the holder ...*/],
     // ... any other statements about the holder
   },
   // ... any other statements about the Profile itself
}

As you can see, currently a "Profile" is merely the object that "holder" points to. We can change its meaning to be some "data entity" that contains this relationship predicate of "holder" instead, if it's desirable to have such a data entity and predicate. The alternative to this is for systems to store individual statements about holders as separate graphs -- which has the same effect -- and keep things the same. The only real change here is:

  1. Whether or not we introduce a predicate like "holder" to refer to the graph that is presently understood to be "Profile".
  2. Change the meaning of "Profile" to be a graph with the predicate "holder" that refers to another graph that would now have no name in the data model.
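
The second option can be sketched in Python as a conversion from the current shape to the nested one shown above. This is a sketch under assumptions: `to_profile_entity` is a hypothetical helper, the new profile id is supplied by the caller, and mapping "credential" to "verifiableCredential" is assumed from the outline above rather than taken from any specification text.

```python
def to_profile_entity(current_profile, profile_id):
    """Hypothetical helper: wrap the current profile (statements about the
    holder) in a new entity that has its own id and a "holder" predicate."""
    holder_node = {
        "id": current_profile["id"],  # today this "id" is the holder's id
        "verifiableCredential": list(current_profile.get("credential", [])),
    }
    return {"id": profile_id, "holder": holder_node}


# Trimmed-down copy of the current profile, where "id" names the holder.
flat_profile = {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "credential": [{"id": "http://dmv.example.gov/credentials/3732"}],
}

entity = to_profile_entity(flat_profile, "did:example:abc31f31aecc6611278e1a44d7")
print(entity)
```

The object that "holder" points to carries the same information as today's Profile, which matches the observation that the current Profile is merely the object of "holder".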

gkellogg commented 6 years ago

There seems to be enough ambiguity over what the subject of a Profile is, that it probably requires further description.

@jandrieu made the following assertion:

The problem with this approach is that for essentially all other web data models I'm familiar with, the ID field within a data entity defines the identifier for that data entity, NOT an identifier for some other entity (such as a holder).

I agree with the sentiment that the id field (RDF subject) of the node object should identify the entity about which properties relate, not something else.

@dlongley said:

Presently, a Profile is just some set of statements about a holder -- which is why it makes perfect sense for the ID to refer to the holder.

Absolutely correct, IMHO. If the statements are about the holder then the RDF subject (id) should identify the holder.

David-Chadwick commented 6 years ago

On 21/03/2018 15:13, Dave Longley wrote:

@David-Chadwick,

As I am producing a PR for the latter I can include this in the PR as well

I recommend we keep the issues separate. We don't have consensus here yet whereas in the other issue, I suspect we do. It would be better not to hold up the other PR.

As I am on a skiing holiday this week, you have a few more days to resolve this issue before I produce the PR at the start of next week.

regards

David

stonematt commented 6 years ago

@David-Chadwick in which PR is this addressed?

msporny commented 6 years ago

I assert that we fixed this issue in PR https://github.com/w3c/vc-data-model/pull/191 ... @jandrieu do you agree, because if so, we should close this issue.

msporny commented 6 years ago

ACTION: @msporny to bring this up with @jandrieu at RWoT7. We have addressed the issue by adding a specific profile/presentation identifier.