w3c / vc-data-model

W3C Verifiable Credentials v2.0 Specification
https://w3c.github.io/vc-data-model/

more clarity around the `id` field in the VC data model #973

Closed andorsk closed 1 year ago

andorsk commented 1 year ago

The example https://www.w3.org/TR/vc-data-model/#identifiers gives the following description for id:

The value of the id property MUST be a single URI. It is RECOMMENDED that the URI in the id be one which, if dereferenced, results in a document containing machine-readable information about the id.

with an example http://example.edu/credentials/3732 which doesn't dereference into anything meaningful.

As an implementer, I'm struggling to figure out exactly what type of data the ID field should dereference to and what the recommendation is on it.

It would be helpful to provide more clarity on the data model about:

  1. What type of data should the dereferenced id generally contain.
  2. A working example of a dereferenced id

I would be happy to raise a PR on this, given some direction.
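A minimal sketch of the "MUST be a single URI" half of the quoted text, which is the only part that can be checked mechanically. It assumes that "a single URI" can be approximated as one whitespace-free string with a scheme; `is_single_uri` is a hypothetical helper, not something the spec defines, and the UUID below is made up for illustration.

```python
from urllib.parse import urlparse


def is_single_uri(value):
    """Return True if value is one string that parses as an absolute URI.

    Assumption for this sketch: a whitespace-free string with a non-empty
    scheme counts as a URI. This is not a full RFC 3986 validator.
    """
    if not isinstance(value, str) or not value:
        return False
    if any(ch.isspace() for ch in value):
        return False
    parsed = urlparse(value)
    # Require a scheme plus something after the colon (authority or path).
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


candidates = [
    "http://example.edu/credentials/3732",            # HTTP URL from the spec example
    "urn:uuid:3978344f-8596-4c3a-a978-8fcaba3903c5",  # URN form (made-up UUID)
    "did:example:ebfeb1f712ebc6f1c276e12ec21",        # DID form
    ["http://example.edu/credentials/3732"],          # a list is not a single URI
    "not a uri",
]
results = [is_single_uri(c) for c in candidates]
print(results)
```

Nothing here says what the URI should dereference to, which is the open question in this issue.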

bumblefudge commented 1 year ago

Note also this example of a URN (in this case a UUID) rather than a URL id prop. If you really wanted to get 🌶️ spicy 🌶️ you could even use an IPFS CID for that identifier, whether as a URN or as an ipfs:// URL, although that wouldn't really answer your question of what that [content-addressed] id should dereference to.

andorsk commented 1 year ago

yea..thanks @bumblefudge. To your point, if it just said: provide an ID for the VC, I wouldn't have raised this issue.

My 🌶️ take is that an id field makes sense, but the id field being used as a descriptor of the document to me makes less sense 🙇 . I would almost think you need to break this out into two fields:

  1. an id which is just a unique identifier. with a recommendation that the id be a did. ( after all, they have known referable methods ). But it could be a UUID for example.
  2. an optional description field: which contains information about the VC itself. With the option to either reference the description ( as a did or url ), or put the description directly as a string ( why not allow that? ).

I don't know. Would love to hear if this is a reasonable position or I'm over thinking this.

melvincarvalho commented 1 year ago

I may be wrong here, but from a simple reading of the text: In the example given there is some JSON and an HTTP URI

HTTP URI: http://example.edu/credentials/3732

json:

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://www.w3.org/2018/credentials/examples/v1"
  ],
  "id": "http://example.edu/credentials/3732",
  "type": ["VerifiableCredential", "UniversityDegreeCredential"],
  "issuer": "https://example.edu/issuers/565049",
  "issuanceDate": "2010-01-01T00:00:00Z",
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "degree": {
      "type": "BachelorDegree",
      "name": "Bachelor of Science and Arts"
    }
  }
}

Would it make sense to just return the JSON above from that HTTP URI?
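A minimal sketch of that reading, assuming the issuer simply publishes the credential JSON at the URI used as its id. `PUBLISHED` and `dereference` are hypothetical names, and the dictionary is an in-memory stand-in for a web server, so no network access is involved. Whether the id should dereference to the credential itself is exactly what is being debated in this thread.

```python
import json

# Hypothetical stand-in for the documents a server at example.edu might publish.
PUBLISHED = {
    "http://example.edu/credentials/3732": {
        "@context": [
            "https://www.w3.org/2018/credentials/v1",
            "https://www.w3.org/2018/credentials/examples/v1",
        ],
        "id": "http://example.edu/credentials/3732",
        "type": ["VerifiableCredential", "UniversityDegreeCredential"],
        "issuer": "https://example.edu/issuers/565049",
        "issuanceDate": "2010-01-01T00:00:00Z",
        "credentialSubject": {
            "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
            "degree": {
                "type": "BachelorDegree",
                "name": "Bachelor of Science and Arts",
            },
        },
    }
}


def dereference(uri):
    """Return the JSON text published for uri, or None if nothing is published."""
    document = PUBLISHED.get(uri)
    return None if document is None else json.dumps(document, indent=2)


body = dereference("http://example.edu/credentials/3732")
round_trip = json.loads(body)
print(round_trip["id"])
```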

aljones15 commented 1 year ago

For revocation lists the id must return the list itself. Quite often the id of a VC should be a did that when resolved returns the block the VC is in on a ledger.

andorsk commented 1 year ago

@melvincarvalho thanks for the thoughts, but I agree with @aljones15 that it shouldn't be dereferencing to the actual document. I would suggest a change to the language of the document, and still possibly a new description field.

The language to me should be probably something like this:

Instead of:

The value of the id property MUST be a single URI. It is RECOMMENDED that the URI in the id be one which, if dereferenced, results in a document containing machine-readable information about the id.

a suggestion would be something along the lines of this:

The value of the id property MUST be a single URI. It is RECOMMENDED that the URI should represent a globally unique identifier specific to the credential.

Unless there's some enforceable property, I'm not sure about the block location ( in the case of a ledger VC ) being an appropriate recommendation. I have a few reasons, but the main one is that it drives an inconsistency in the language when you're talking about what an ID means on ledger vs. off ledger.

Either way, even if it SHOULD reference to a block on a ledger, the current state of the specifications does not make it clear that's the preference and it should be updated IMO.

RieksJ commented 1 year ago

@andorsk nice question. The standard says that an id property "is intended to unambiguously refer to an object, such as a person, product, or organization." The standard does not provide any ground whatsoever for recommending that it should be dereferenceable to some document/description, nor does it provide an example of why it might be useful.

I propose to remove this sentence in its entirety.

There is something else that would be useful though. Considering that there is a difference between 'dereferencing' and 'resolving' (converting an identifier to a descriptive document vs. using that same identifier to learn which entity it is actually referring to), there is a need for guidance on the latter which is currently not provided.

I consider this a serious omission, specifically when an id is being used as the subject identifier in one of the VC's claims, as it leaves verifiers (or anyone else for that matter) clueless about which entity a particular claim was made about. Assuming that whoever controls the subject identifier (which in the case of DIDs is easy to establish) is in fact also the subject of that identifier (and hence of the claim) comes with serious problems.

Perhaps we should specify ways that would enable verifiers to identify (and possibly authenticate) the entities to which identifiers refer, e.g., as proposed in w3c/vc-data-model#760.

andorsk commented 1 year ago

@RieksJ I think this is a good point. I will need to think about w3c/vc-data-model#760 and w3c/vc-data-model#959 in more detail, but for the scope of this issue, I agree with removing the dereferencing language w.r.t. the id field and swapping it out for something like "is intended to unambiguously refer to an object, such as a person, product, or organization" or something else of the like.

I also think a description or purpose field still could be useful. Any thoughts there?

RieksJ commented 1 year ago

I would say that every property in a VC should serve a specifically stated purpose, i.e., serve an explicitly stated objective.

For description, I can see there is merit, but not a generic purpose for having it. Adding it might induce a risk that different people will use it for different purposes, which might result in weird behaviours (as the issuers had something different in mind than the verifiers assumed).

For purpose, that's pretty much the same. As VCs are merely a set of signed claims, I do not see how an issuer might state a purpose in a generic way that actually has some effects in practice. So I'm not in favor of that.

A specifically stated purpose for which there is currently no support is the identification (and authentication) of the entity that is the subject of identifiers specified in the id fields of claims.

mwherman2000 commented 1 year ago

UPDATED: I believe (the value of) an id field should be interpreted as a unique reference or identifier to a concrete something (aka subject): a person, an organization, a business document (purchase order, invoice, etc.), an education credential, a car, a boat, a house, a software module, a deployed instance of a software module, etc.

Everything else in a decentralized identifier-based software system is addressed by dereferencing or resolving the decentralized identifier to obtain something else (e.g. a DID Document, Revocation List entry, Service Endpoint addresses (via a second level of indirection through the DID Document)).

There are 2 id fields (typically) in a VC. From the VC spec https://www.w3.org/TR/vc-data-model/#identifiers ...

The first identifier is for the verifiable credential and uses an HTTP-based URL. The second identifier is for the subject of the verifiable credential (the thing the claims are about) and uses a decentralized identifier, also known as a DID.

RieksJ commented 1 year ago

@mwherman2000: I do not think that the id field in a VC exists. There are multiple ones, of which only one represents the VC. Others represent, e.g., subjects of claims in the VC, or different entities. I also do not think that a VC is an instantiation of a real-life object, because a VC is not a representation of a class (or: another abstraction) by a concrete individual that is an element of (or: illustrative of) that class.

What I do think is that there is not only a link between (the value of) any id field and its subject (i.e., the entity to which this value refers), but also that there is an equally important link between this value and its author (i.e., the party that has put the value into the id field), because it is that party that has assigned the value of the id-field to that entity. The DID spec recognizes this by saying that the controller of a DID gets to decide which entity is the subject of that DID.

I also think that parties other than the author SHOULD NOT assume that they know how to dereference the value of an id field, unless they have put some effort in finding out how (and/or verifying that) the author governs the semantics of the id-fields that it authors. For DIDs, there's currently no guidance whatsoever (see issue w3c/did-core#837).

For an id-field that is meant to identify the subject of a claim in a VC, there is also no guidance that other parties can rely on. There is lots of talk about 'holder binding', see e.g., w3c/vc-data-model#789, w3c/vc-data-model#882, w3c/vc-data-model#923, w3c/vc-imp-guide#70, w3c/vc-data-model#959, w3c/vc-data-model#960, w3c/vc-imp-guide#69. There are also discussions on adding roles such as issuee (#942), all of which can be resolved by defining proper mechanisms that verifiers can use to determine which entity is the subject that the author of an id field meant to refer to by the value of that id field, and of which authors can state which is appropriate for a verifier to use in a particular case.

mwherman2000 commented 1 year ago

I also do not think that a VC is an instantiation of a real-life object, because a VC is not a representation of a class (or: another abstraction) by a concrete individual that is an element of (or: illustrative of) that class.

Perhaps, "real-life" object is too strong an adjective. Perhaps "concrete" object would be better. The main point is that (the value of) an id field is associated with or names the actual "thing" (aka subject) ...not its agent, not the service endpoint of its agent, not a VC, etc.

TallTed commented 1 year ago

Perhaps, "real-life" object is too strong an adjective. Perhaps "concrete" object would be better. The main point is that (the value of) an id field is associated with or names the actual "thing" (aka subject) ...not its agent, not the service endpoint of its agent, not a VC, etc.

The word most often used (at least, in the W3C ecosphere) for this actual "thing" or "concrete" object or "real-life" object is entity. Some do use concept or thing for the same purpose. There are years' worth of reading on the philosophical underpinnings of how and why these different words came to be used for the same (or very similar) things.

The value of an id field identifies a specific entity, not an abstraction nor relative of that entity, as intended by the, cough, entity that populated that field — though the type of their chosen specific entity may itself be conceptual, an agent, a service endpoint, a VC, or any other (sub-)class of identifiable "thing" in the universe.

David-Chadwick commented 1 year ago

There is a bug in the current DM in that the id field in the VC is not actually the id of the verifiable credential, but is the id of the credential. This was brought home to me during the JFF Plugfest. I had always believed that the id was equivalent to the serial number of a PKC and was unique for each VC (which should be true if it was the id of the VC). But in the plugfest people were issuing multiple verifiable credentials for the same credential and keeping the id constant, because the only difference was the validity time of the cryptographic proof. The credential remained the same and therefore kept the same id. If we want to have an id for a verifiable credential, then it must be part of the proof property or part of the JWT, and not part of the credential.

jandrieu commented 1 year ago

There is a bug in the current DM in that the id field in the VC is not actually the id of the verifiable credential,

This is incorrect, but probably because there are multiple identifiers.

{
  "@context" : "https://w3id.org/credentials/v1",
  "id" : "did:ex:id1",
  "credentialSubject" : {
    "id" : "did:ex:id2",
    "hasCredential" : {
        "id" : "did:ex:id3",
        "credentialType" : "bachelor of science"
    }
  }
}

In that example, did:ex:id1 is the identifier of the verifiable credential. Full stop.

The problem is one of understanding the data model of claims, not that of the VC.

We need to better explain this, for sure, but bad data modeling is going to always be a thing. Better data modeling resolves this problem, without needing to change the VCDM at all.

I think @David-Chadwick you are imagining that a VC (with its ID) contains a fully formed credential (with a separate id). However, if you look at Figure 5 in Section 3 https://www.w3.org/TR/vc-data-model/#credentials, a VC contains metadata, claims, and proofs. It doesn't contain a credential. Rather the credential becomes verifiable because there is a proof. So to the notion of "credential" as defined by VCs, it is not a separate thing with its own identifier. (Although you can model such a separate credential as in my example.)

The problem, of course, is that many communities, including educators, see "credential" as a well-known and specific thing. By which they mean the degree or certification earned. That "credential" is not the same as "credential" as defined in the VCDM. It's an unavoidable name collision that we have to overcome by better examples and explanations of that distinction.
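To make the "multiple identifiers" point concrete, here is a small sketch that walks the example above and reports where each "id" sits. `find_ids` is a hypothetical helper written for this illustration, and the `$`-style path notation is just a labelling choice.

```python
def find_ids(node, path="$"):
    """Yield (path, id) pairs for every "id" property in a nested JSON value."""
    if isinstance(node, dict):
        if "id" in node:
            yield path, node["id"]
        for key, value in node.items():
            if key != "id":
                yield from find_ids(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_ids(item, f"{path}[{index}]")


vc = {
    "@context": "https://w3id.org/credentials/v1",
    "id": "did:ex:id1",
    "credentialSubject": {
        "id": "did:ex:id2",
        "hasCredential": {
            "id": "did:ex:id3",
            "credentialType": "bachelor of science",
        },
    },
}

ids = list(find_ids(vc))
for where, value in ids:
    print(where, value)
```

Each "id" names the object it appears in, so the path tells you what is being identified: the top-level object, the credential subject, and the object under the (example-specific) hasCredential property.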

David-Chadwick commented 1 year ago

@jandrieu

"did:ex:id1 is the identifier of the verifiable credential. Full stop."

No, it's not. Full stop. It's the identifier of the credential. It is part of the metadata of the credential, as figure 5 clearly shows. Thank you for pointing that out. It is not metadata of the verifiable credential as you wrongly assert.

Each proof that makes the credential into a verifiable credential will have its own parameters and metadata, such as validity time (of the signature - which is very different from validity time of the credential) and an id (as in the serial number of a PKI). When the same credential is turned into a verifiable credential at different times, then the VCs will have different ids and different validity times as they are different objects. But the embedded credential will have the same validity time and identifier.

jandrieu commented 1 year ago

@David-Chadwick Why do you think there is a "credential" separate from the "verifiable credential"? I'm honestly curious where that notion is based.

If you look at example 4 https://www.w3.org/TR/vc-data-model/#example-usage-of-the-id-property, you'll see that the Credential, Verifiable Credential (with proof), and the Verifiable Credential (As JWT) all use the same identifier. Because there is no separate credential inside a VC. A credential is transformed into a verifiable credential by adding a proof.

This construct was created in recognition of the usefulness of JSON-LD credentials that don't have proofs. It was not intended, and has never been expressed or documented, to my knowledge, as a credential being a separate thing within the Verifiable Credential.

We did enable exactly that pattern in the Learning and Employment Record (LER) Wrapper. https://www.t3networkhub.org/resources/public-specification-for-learning-and-employment-record-ler-wrapper-and-wallet The id of the VC is most definitely NOT the id of the wrapped credential.

You say

But the embedded credential will have the same validity time and identifier.

There is no embedded credential; there are only claims expressed in JSON-LD. In those claims you may state that there is a credential that has an identifier and has been granted to the subject. You don't have to have such an identifier, but you can.

If you follow the JSON-LD data model, you might discern that the "id" of an object is the identifier of the object in which the property appears. The top-level identifier of a VC is always the identifier of that data object, i.e., of the VC.

If you want to state the identifier for a credential expressed in a VC, that is done in the claims, in the "credentialSubject" property, probably using something like the pattern I already described.

If you treat the top-level "id" property as anything other than the identifier of the VC, you would break JSON-LD semantics.

David-Chadwick commented 1 year ago

@jandrieu

"Why do you think there is a "credential" separate from the "verifiable credential"? I'm honestly curious where that notion is based."

Because we agreed during DM1.1 that if you take a credential, put different proofs on it, JWT or JSON-LD, then verify each VC and remove its proof, you will end up with the same credential that you started with. Therefore it follows that credentials and verifiable credentials are different entities, and have their own lifetimes and metadata. Thus the id of the VC must be different from the id of the C. Consequently some of the existing mapping rules for JWT are wrong (including the date/time mappings). I think this is a major discussion item that we should have at a VC WG meeting (very soon!). I don't think this is a JSON-LD issue, but rather a conceptual one. Namely is the VC object a new and separate object from a C object (even though the former was created from the latter, they are not identical).
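The round trip described here can be sketched as follows, assuming embedded proofs only and treating the proof as an opaque placeholder (nothing is signed or verified). `add_proof`, `strip_proof`, and the proof type names are hypothetical; a JWT-secured credential would need a different, external wrapping.

```python
import copy


def add_proof(credential, proof):
    """Return a new dict: the credential plus an embedded proof."""
    vc = copy.deepcopy(credential)
    vc["proof"] = copy.deepcopy(proof)
    return vc


def strip_proof(vc):
    """Return a new dict with the proof property removed."""
    credential = copy.deepcopy(vc)
    credential.pop("proof", None)
    return credential


credential = {
    "id": "http://example.edu/credentials/3732",
    "type": ["VerifiableCredential", "UniversityDegreeCredential"],
    "issuer": "https://example.edu/issuers/565049",
    "issuanceDate": "2010-01-01T00:00:00Z",
    "credentialSubject": {"id": "did:example:ebfeb1f712ebc6f1c276e12ec21"},
}

# Placeholder proofs: the type names and values are made up.
vc_a = add_proof(credential, {"type": "ExampleProofA", "proofValue": "placeholder-a"})
vc_b = add_proof(credential, {"type": "ExampleProofB", "proofValue": "placeholder-b"})

same_after_strip = strip_proof(vc_a) == credential == strip_proof(vc_b)
same_top_level_id = vc_a["id"] == vc_b["id"]
print(same_after_strip, same_top_level_id)
```

Both objects carry the same top-level id while differing in their proofs. Whether that shared id names "the credential" or "the verifiable credential" is the disagreement; the code only shows the mechanics both sides agree on.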

mwherman2000 commented 1 year ago

I believe I concur with @jandrieu. If we cast the credentialSubject as the "inner credential" or "business credential", then the entire credential (if it contains a proof) is the verifiable credential. If we consider the verifiable credential minus the "inner credential", this is the envelope containing the "inner credential".

Further, two envelopes can be used to encase 2 copies (one each) of the same "inner credential" (e.g. a purchase order or invoice). This results in two different verifiable credentials (with 2 different "outer ids") but the same "inner credential" - each copy with the same "inner id".

Here's an interesting tutorial that illustrates this concept: https://www.youtube.com/watch?v=kM30pd3w8qE&list=PLU-rWqHm5p445PbGKoc9dnlsYcWZ8X9VX&index=1
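A rough sketch of the two-envelope idea, assuming the envelope is an unsigned dict and the outer id is a freshly generated UUID URN. `wrap`, the issuer DID, and the purchase-order payload are all made up for illustration.

```python
import copy
import uuid


def wrap(inner, issuer):
    """Return an unsigned envelope around a copy of inner, with a fresh outer id."""
    return {
        "id": f"urn:uuid:{uuid.uuid4()}",  # outer id, unique per envelope
        "type": ["VerifiableCredential"],
        "issuer": issuer,
        "credentialSubject": copy.deepcopy(inner),
    }


# Hypothetical purchase-order payload; the identifier and field are made up.
inner = {"id": "urn:example:purchase-order:42", "total": "100.00"}

vc_a = wrap(inner, "did:example:issuer")
vc_b = wrap(inner, "did:example:issuer")

print(vc_a["id"] != vc_b["id"])
print(vc_a["credentialSubject"]["id"] == vc_b["credentialSubject"]["id"])
```

In this model the two envelopes get different "outer ids" while both copies of the payload keep the same "inner id".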

David-Chadwick commented 1 year ago

@mwherman2000 Your argument applies equally well to whatever construct is the inner credential. You appear to want the credential subject property to be the inner credential, whereas I want the credential object to be the inner credential. The difference of course lies in whether the credential metadata properties issuanceDate, type, @context etc are part of the inner credential or not. Jo appears to be saying they are not, whilst I am saying they are.

mwherman2000 commented 1 year ago

@David-Chadwick I'm not exactly following your terminology. Can you elaborate? ...perhaps with an example? Here's one example I can offer as a starting point...

Sample Verifiable Credential

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {
    "id": "did:color:red",
    "claims": {
        "red": "255",
        "green": "0",
        "blue": "0"
    }
  },
  "proof": {
    "type": "RsaSignature2018",
    "created": "2017-01-12T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.com/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-BuQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
  }
}

Using my terminology, the "inner credential" or "business credential" or "payload" is...

{
    "id": "did:color:red",
    "claims": {
        "red": "255",
        "green": "0",
        "blue": "0"
    }
}

The Verifiable Credential Envelope is...

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {

  },
  "proof": {
    "type": "RsaSignature2018",
    "created": "2017-01-12T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.com/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-BuQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
  }
}

mwherman2000 commented 1 year ago

Alternatively, following the Structured Credential model more closely (https://www.youtube.com/watch?v=kM30pd3w8qE&list=PLU-rWqHm5p445PbGKoc9dnlsYcWZ8X9VX&index=1), the envelope and proof can be separated:

Verifiable Credential Envelope

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {

  },
  "proof": {

  }
}

Verifiable Credential Proof

{
    "type": "RsaSignature2018",
    "created": "2017-01-12T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.com/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-BuQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
}

David-Chadwick commented 1 year ago

@mwherman2000 Thank you for your example. Here are my comments

  1. Your model only works for JSON-LD proofs and not for JWT proofs as there is no proof property in the latter
  2. issuanceDate is not the date of issuing of the VC, but is the date that the credential was issued. So it is metadata of the credential and not of the verifiableCredential/proof
  3. the VC (proof) needs to have its own date of issuance, which you have in your example with "created"
  4. the same applies to the expiry date of the verifiableCredential, since no crypto lasts for ever. So your proof must have an "expires" property as well as a "created" property, which currently it does not have. The actual credential may or may not have an expiry date (e.g. degree certificate)

So now we come to the tricky bits:

  i) Is type the type of the credential or verifiable credential? I would argue it is the credential. The type of verifiable credential may be JWT proofed credential, or JSON-LD proofed credential.
  ii) Is id the id of the credential or verifiable credential? Given that all the other properties have been shown to be properties of the credential and not the verifiable credential, then the id must also be the id of the credential. The id of the verifiable credential must be added to the proof object as an extra parameter. (Note that the JFF plugfest has already used this interpretation, since multiple sequential VCs created from the same underlying credential have all been given the same id.)

mwherman2000 commented 1 year ago

@David-Chadwick can you mark up some of the examples to more precisely illustrate your points? For now, let's limit the scope to JSON/JSON-LD based VCs. Some responses:

  i) re: type. I agree that it is the type of the "inner credential" (let's agree to stay with one set of terminology ...the terms used in the examples).
  i) re: "The type of verifiable credential may be JWT proofed credential, or JSON-LD proofed credential". I have not seen the "type" used to specify "JWT proofed credential, or JSON-LD proofed credential" in any existing examples. Usually, I've interpreted the following pattern to say this JSON thing is a VC ...with a (verifiable credential) subtype (e.g. Color): "type": [ "VerifiableCredential", "Color" ] ...putting the above 2 points together: the subtype is the type of "inner credential" and the "type" property says "this is the type designation for the VC that embeds the "inner credential"".
  ii) which "id" are you referring to? ...the "id" in the "inner credential" is the "id" for the "inner credential". ...the "id" in the "envelope" is the "id" for the verifiable credential.
  ii) the "proof" is the proof for the VC (i.e. the envelope and embedded "inner credential") because the "id" is embedded inside the VC envelope. See the examples above.

David-Chadwick commented 1 year ago

The type of proofing is already specified (for the JSON-LD proofs). It's "RsaSignature2018" in your example. So I am happy that this aspect is already covered.

There is only one 'id' in your examples, so it should be obvious which one I am referring to. What you are calling the envelope I am referring to as the metadata. Either way, this is information about the credential and not about the proof.

"the "proof" is the proof for the VC". No, the proof is for the credential. credential + Proof = verifiable credential

mwherman2000 commented 1 year ago

[Image: Screenshot_20221201-114124]

Here's a screenshot of my post from above @David-Chadwick. There are two "id" properties in my example.

David-Chadwick commented 1 year ago

That is weird. Because this is what I see in git and have copied below

{
  "id": "did:color:verifiable:red",
  "@context": [
    "https://www.w3.org/2018/credentials/v1"
  ],
  "type": [ "VerifiableCredential", "Color" ],
  "issuer": "did:org:111-222-333",
  "issuanceDate": "2017-01-12T00:00:00Z",
  "expires": "2017-04-22T00:00:00Z",
  "credentialSubject": {

  },
  "proof": {

  }
}

As you can see there is only one "id" above. In your screenshot you also have credentialSubject.id - this is clearly the id of the credentialSubject, whereas the first "id" is the one we are debating. Is it the id of the credential or of the verifiable credential?

mwherman2000 commented 1 year ago

That is weird. Because this is what I see in git and have copied below

@David-Chadwick It's best to read my entire posts from the beginning to the end; else you will lose/miss the context. What you've shown above is an example of a VC Envelope ...it is also correct.

Reread:

mwherman2000 commented 1 year ago

In your screenshot you also have credentialSubject.id - this is clearly the id of the credentialSubject, whereas the first "id" is the one we are debating. Is it the id of the credential or of the verifiable credential?

The value of the first "id" is the identifier for the particular VC (shown in its entirety at the top of this post: https://github.com/w3c/vc-data-model/issues/973#issuecomment-1333097171).

The second "id" (in the same VC definition in the referenced post) is the identifier for the "inner credential" or "business credential" or "payload" ...also known as the "credentialSubject" "id".

NOTE: I think it's worth mentioning that the JSON text we're talking about is a textual serialization of a VC ...a technical JSON-based textual serialization of a particular VC. So terms (property names) like "credentialSubject" are technical/implementation terms chosen at the time that the JSON serialization for a VC was agreed upon. "credentialSubject" is not the term I would use when I'm talking to someone (an architect or developer) using the King's English ;-) ...I use terms like "inner credential" or "business credential" or "payload". I hope this finally clarifies things.

David-Chadwick commented 1 year ago

This is where we disagree. I assert it is the id of the credential, and has been used as such by the JFF Plugfest. Multiple VCs have been created from this credential, all with the same "id", but clearly each VC is different and a separate object. Its "id" is equivalent to the serial number of an X.509 PKC and should be in the proof section.

dlongley commented 1 year ago

@David-Chadwick,

This is where we disagree. I assert it is the id of the credential, and has been used as such by the JFF Plugfest. Multiple VCs have been created from this credential, all with the same "id", but clearly each VC is different and a separate object.

To me, that just sounds like a bug in some software in the plugfest.

David-Chadwick commented 1 year ago

@dlongley In which case, what is the "id" of the credential?

dlongley commented 1 year ago

@David-Chadwick,

In which case, what is the "id" of the credential?

I more or less agree with @jandrieu's comments here:

https://github.com/w3c/vc-data-model/issues/973#issuecomment-1331011296 https://github.com/w3c/vc-data-model/issues/973#issuecomment-1331467747

A credential is just a verifiable credential without any proof. A credential can optionally have an id property -- and if you add a proof property to the credential, it becomes a verifiable credential. That all meshes and seems reasonable to me.

I think there may just be a simpler explanation behind what happened in the JFF plugfest: a bug in someone's software or use of it. I don't think it makes any sense to issue "the same" (having the same ID) credential more than once ... so that's a strong signal to me that there was a bug somewhere.

I can't imagine such a thing working properly once you add things like credential status to a credential and you hand "the same" credential to two different "issuees". If you revoke one, you revoke it for both of them? What is the use case there? It just sounds like an error to me that people didn't pay attention to -- likely because the focus was only on the goal of interop. Some measure of quality control on the issuer's end should stop that sort of thing from happening in a production system.
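The kind of issuer-side quality control mentioned above could be as simple as the following sketch, which flags top-level ids that occur more than once in a batch of issued credentials. `duplicate_ids` is a hypothetical helper, and the second credential URL (ending in 3733) is made up for illustration.

```python
from collections import Counter


def duplicate_ids(credentials):
    """Return the sorted top-level ids that occur more than once in the batch."""
    counts = Counter(c["id"] for c in credentials if "id" in c)
    return sorted(value for value, n in counts.items() if n > 1)


issuer = "https://example.edu/issuers/565049"
issued = [
    {"id": "http://example.edu/credentials/3732", "issuer": issuer},
    {"id": "http://example.edu/credentials/3733", "issuer": issuer},
    {"id": "http://example.edu/credentials/3732", "issuer": issuer},
    {"issuer": issuer},  # id is optional, so credentials without one are skipped
]

duplicates = duplicate_ids(issued)
print(duplicates)
```

Under the competing model, where the id names the unproofed credential, the repeated id would be intentional and this check would not apply.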

David-Chadwick commented 1 year ago

But if you have a different mental model in mind, one in which a credential is a valid standalone object in its own right, i.e. it's a statement made by an issuer that cannot be verified, then it has its own metadata (as figure 5 shows). The credential metadata comprises the validity period of the credential, the id of the credential, the @context of the credential, the schema of the credential, and the type of the credential. When this credential is proofed, say with a JWT, then a new validity time and id are added to the proof. These are the metadata of the proofed credential. Obviously the crypto has a different validity time to that of the credential, and the VC should have its own id as well. All this metadata should be in the proof property. Now with this mental model in mind there is no bug in JFF. The credential can be sequentially turned into VCs, each with different validity times and ids, whilst the credential has the same id in all cases. The revocation information is metadata of the VC, not of the C. I think this discussion is a fundamental issue that needs discussing in the WG.
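A sketch of this mental model, assuming each proof carries its own "id", "created" and "expires". Those proof fields, the proof type name, the `urn:example:proof:*` identifiers and all the dates are assumptions made for illustration of the argument, not a claim about what the VCDM defines.

```python
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse a timestamp such as 2023-01-01T00:00:00Z into an aware datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def proof_is_current(proof, now):
    """True if now falls inside the proof's own validity window."""
    return parse_utc(proof["created"]) <= now < parse_utc(proof["expires"])


credential = {
    "id": "http://example.edu/credentials/3732",  # stays constant in this model
    "issuanceDate": "2010-01-01T00:00:00Z",        # no expiry: a degree certificate
}

# The same credential proofed twice, one year apart (hypothetical proof metadata).
vc_2022 = dict(credential, proof={
    "id": "urn:example:proof:1", "type": "ExampleProof",
    "created": "2022-01-01T00:00:00Z", "expires": "2023-01-01T00:00:00Z",
})
vc_2023 = dict(credential, proof={
    "id": "urn:example:proof:2", "type": "ExampleProof",
    "created": "2023-01-01T00:00:00Z", "expires": "2024-01-01T00:00:00Z",
})

now = datetime(2023, 6, 1, tzinfo=timezone.utc)
current = [proof_is_current(vc["proof"], now) for vc in (vc_2022, vc_2023)]
print(current)
```

Here the credential id is shared, the proof ids differ, and only the later proof is still within its validity window at the chosen date.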

andrewhughes3000 commented 1 year ago

Although not precisely the same, ISO 18013-5 has a similar concept to what David described. There are data elements and a separate Mobile Security Object comprised of salted hashes, keys and other data integrity stuff. The "driving license" credential is the data elements including DL# and issuance/expiry dates. The MSO has its own issuance/expiry dates and is re-issued as needed independent of the DL expiry and DL#.

David-Chadwick commented 1 year ago

@andrewhughes3000 Glad I am not on my own with my mental model. The current VCDM half supports this model and half doesn't as it takes metadata about the credential and then treats some of it as metadata about the VC.

dlongley commented 1 year ago

I do think that's a useful model and concept for some issuers to employ -- but I think it should be done internally. In other words, it seems like that model would be better served via some internal ID / reference ID rather than exposing that information in the VC itself. It seems like a leakage of implementation details to me.

I imagine it would further complicate the VCDM when considering any number of metadata items that might then need duplication, e.g., credential status -- do we now need both credential status and verifiable credential status tracking? I think that whether or not certain elements of the VC are present or not (proof 1, proof 2, selectively disclosed fields, etc.) should not each result in different identifiers being assigned to the "new object" (each possible combination constituting a "new object").

Rather, I think we're always talking about the same object from the perspective of any party outside of the issuer (i.e., they cannot know the details of how the issuer is implemented). It's just a question of whether or not that object is verifiable, verifiable with proof type 1, or proof type N, whether fields X are revealed or not and so on. I think if there is important metadata for external (or internal) proofs, it should be present in the proof section -- and I'd expect there to be different metadata for each proof for VCs with multiple proofs (which is already true today).

dlongley commented 1 year ago

As another point of reference -- there's a discussion in the W3C CCG VC-API group about having internal credential references of a slightly different sort: https://github.com/w3c-ccg/vc-api/issues/126 -- particularly for cases where credential IDs should not be used at all (potentially for increased privacy cases). The point is that there are other needs for "reference IDs" to credentials that aren't expressed in the credential itself (for various reasons).

David-Chadwick commented 1 year ago

@dlongley Sorry but I disagree with you. The issue comes down to this:

bumblefudge commented 1 year ago
  • can or does the credential exist without a proof? (e.g. could an issuer issue a W3C credential i.e. a VC without a proof?

I think many in this thread are operating on different definitions of this "credential", so maybe we should tease out those differences. Maybe instead of debating the existence we should step back a little and state what we think a "Credential" does or can do. Can a credential be "issued" or does the issuer/holder/verifier model only make sense for a specific verifiable credential? If the latter, what are the differences between how a C and how a VC "exist" 😅

or could a wallet (trusted by the verifier) strip off the proof and just give the credential to a verifier?)

Here I was assuming a credential is a kind of platonic ideal that only exists inside the issuer's records or "before" the VC, and that the closest anyone else could come to "recreating" that pre-existence was the verifier. I was thinking of a credential as fundamentally local and non-portable, as it were (a record in a system), and a VC a kind of "export format"-- that might just be a bias from the use-cases I think of as the "real" use-cases for VCs?

To put it more bluntly, if the wallet (or a 4th party) can reconstitute the Credential, I feel like we're entering a different definition of the Credential than I thought I had before! I thought the whole point of VCs was that verifiers and 4th parties can recreate something that is almost the credential, but an approximation or guess. My humanities-brain is buzzing with Proustian Madeleines and Wellesian Rosebuds.

  • if so, what is the metadata of the credential? This metadata must be independent of the proof.

Going back to @dlongley 's example of credentialStatus, I feel like we get into spicy waters. Maybe some properties can exist in both credential and VC alike, but StatusList2022 feels like it was written specifically for the statuses of VCs only, and using StatusList2022 to track the status of the credential instead breaks that whole mental model. So that seems like a piece of metadata that the credential can't have, to me at least-- two VCs sharing one credentialStatus feels wrong to me.

Sidenote, maybe a tracking issue on StatusList2022 should be opened to make this assumption more explicit if the concept of "credential" is here to stay and survives the process of discernment I seem to be plus-oneing

David-Chadwick commented 1 year ago

credentialStatus is clearly mis-named as it is the verifiableCredentialStatus and appertains only to VCs that can be revoked. Short lived ephemeral VCs do not have this property, neither do credentials. I think it is very easy to differentiate between credential metadata and verifiable credential metadata. VC metadata applies to everything to do with the cryptographic proofing (and revoking). Credential metadata applies to everything to do with the credential. Now it may be possible to carry some of the credential's metadata over into the VC's metadata, but in so doing the VC metadata properties should have different names to the C's metadata and there should be a bi-directional algorithm that defines this conversion (similar to what is done when a JWK proof is created).

As to what a credential is - these are the statements made by some entity (the issuer) about an entity (the subject) to an entity (the issuee). When they have been cryptographically protected they become a verifiable credential.

bumblefudge commented 1 year ago

Hehe, stating clearly that you're confident of your mental model isn't going to get us to the differences :D

Here's one question that can hopefully tease some out: is it a design goal (or a requirement) that Credentials be 100% reconstructable/roundtrippable from VCs if all the necessary additional resources like @context files and schemas have been dereferenced? My jokes about Proust and Citizen Kane were roundabout ways of asking this. Your reference to "bidirectional" made me think you are taking this as self-evident but I'm not sure everyone agrees to that requirement, necessarily?

David-Chadwick commented 1 year ago

We already have one example of where the bi-directional transformation isn't possible - converting issuanceDate into JWT nbf claim and back again. Therefore the rules need to state how this should be handled. e.g. convert issuanceDate into the nearest nbf, keep the issuanceDate in the credential, then throw the nbf away after validating the signature. Once we have all agreed on the model, then producing the rules should be relatively straightforward. But currently we have not agreed on the (mental) model.
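To make the round-trip loss concrete, here is a minimal sketch in Python. It assumes issuanceDate is an XML Schema dateTime string (which may carry fractional seconds and a timezone offset) and that nbf is a JWT NumericDate in whole seconds. The function names `issuance_date_to_nbf` and `nbf_to_issuance_date` and the sample timestamp are made up for illustration; they are not taken from any specification or library.

```python
import math
from datetime import datetime, timezone


def issuance_date_to_nbf(issuance_date: str) -> int:
    """Convert an issuanceDate string to whole seconds since the Unix epoch."""
    # Older Python versions do not accept a trailing "Z", so normalise it first.
    parsed = datetime.fromisoformat(issuance_date.replace("Z", "+00:00"))
    # Fractional seconds and the original offset are dropped at this point.
    return math.floor(parsed.timestamp())


def nbf_to_issuance_date(nbf: int) -> str:
    """Convert whole seconds since the epoch back to a UTC dateTime string."""
    as_utc = datetime.fromtimestamp(nbf, tz=timezone.utc)
    return as_utc.strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical issuanceDate with milliseconds and a non-UTC offset.
original = "2023-01-01T12:00:00.750+05:30"
nbf = issuance_date_to_nbf(original)
round_tripped = nbf_to_issuance_date(nbf)

print(original)
print(round_tripped)
```

The two printed strings denote nearly the same instant but are not the same string, which is why the suggestion above is to keep the original issuanceDate in the credential and discard nbf once the signature has been validated.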

jandrieu commented 1 year ago

credentialStatus is clearly mis-named as it is the verifiableCredentialStatus and appertains only to VCs that can be revoked. Short lived ephemeral VCs do not have this property, neither do credentials. I think it is very easy to differentiate between credential metadata and verifiable credential metadata. VC metadata applies to everything to do with the cryptographic proofing (and revoking). Credential metadata applies to everything to do with the credential. Now it may be possible to carry some of the credential's metadata over into the VC's metadata, but in so doing the VC metadata properties should have different names to the C's metadata and there should be a bi-directional algorithm that defines this conversion (similar to what is done when a JWK proof is created).

As to what a credential is - these are the statements made by some entity (the issuer) about an entity (the subject) to an entity (the issuee). When they have been cryptographically protected they become a verifiable credential.

David, I think credentialStatus is not misnamed as much as you have settled into a different conceptual model than was used to write the specification. I think yours is a coherent model. That is, you're not wrong when discussing your sense of what a credential is. It's just that notion of a credential isn't the same as that which drove the VC work.

There is only a single "credential" (as defined in the spec) in a Verifiable Credential. It is not a separate digital object, it is the part of the verifiable object that contains the claims.

That is, VCs are not a composition of a credential and a verifiable credential; VCs don't wrap wholly formed credentials (as defined by the spec). Rather, VCs are a transmutation of a credential as expressed by a set of claims. Credentials become VCs when proofs are added. They do not retain a separate existence, identifier, or other metadata. The credential BECOMES the VC. As such, property names like credentialSubject and credentialStatus refer to the exact same credential: the set of claims. There's no separate verifiableCredentialSubject or verifiableCredentialStatus because there is only one credential and they have the same value and meaning with or without the proof. In fact, in most issuance pipelines the credential is generated in a separate step from signing. In that step, the credential is the "proto-" VC, which is passed to the signing software to actually issue the VC.
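A minimal sketch of that two-step pipeline in Python, under loud assumptions: `build_credential`, `add_placeholder_proof` and the proof type `HypotheticalHmacProof` are invented for illustration, the issuer and subject values are placeholders, and the HMAC stands in for a real proof suite (it is not a Data Integrity or JWT proof). The only point is that the signing step adds a `proof` property and leaves the claims, including `id`, untouched.

```python
import copy
import hashlib
import hmac
import json


def build_credential() -> dict:
    """Step 1: assemble the unsigned credential (the "proto-" VC)."""
    return {
        "@context": ["https://www.w3.org/2018/credentials/v1"],
        "id": "http://example.edu/credentials/3732",
        "type": ["VerifiableCredential"],
        "issuer": "https://example.edu/issuers/14",  # placeholder issuer
        "issuanceDate": "2010-01-01T19:23:24Z",
        "credentialSubject": {"id": "did:example:subject-123"},  # placeholder subject
    }


def add_placeholder_proof(credential: dict, secret: bytes) -> dict:
    """Step 2: return a copy of the credential with a stand-in proof attached."""
    # Serialise deterministically so the same claims always yield the same value.
    payload = json.dumps(credential, sort_keys=True, separators=(",", ":")).encode()
    vc = copy.deepcopy(credential)
    vc["proof"] = {
        "type": "HypotheticalHmacProof",  # not a registered proof type
        "proofValue": hmac.new(secret, payload, hashlib.sha256).hexdigest(),
    }
    return vc


credential = build_credential()
vc = add_placeholder_proof(credential, secret=b"demo-only-secret")

# Removing the proof gives back exactly the claims that were signed.
claims_only = {key: value for key, value in vc.items() if key != "proof"}
print(claims_only == credential)
print(vc["id"] == credential["id"])
```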

You can wrap credentials as we do in the LER Wrapper, where the credential is an exogenously created thing (like a transcript, degree, etc., in any format) that is literally wrapped by a VC. So, it's possible to memorialize the kind of credential you want, within the claims of a VC. But that is not making the VC that other credential; it is making a VC that represents that credential.

I think this is where the fundamental disconnect is.

In the mental model behind VCs, "credentials" are sets of claims that become "Verifiable Credentials" when a proof is added. The credential is the Verifiable Credential. As such, credentials and VCs are simply statements by an issuer about a subject. Without the proof, they are not verifiable, so they are called "credentials". With a proof, they are verifiable and, hence, called "Verifiable Credentials".

Domains that produce credentials (such as education) already have (and will continue to innovate about) domain-defined notions of what a credential is. In many cases, the "credential" is the physical object, in others, the "credential" is the attainment earned, like a Bachelor's degree. There's a rich and complicated semantic dance about whether the credential is the abstract thing represented, e.g., by the sheepskin, or whether the sheepskin itself is the credential.

With VCs, there is no technical ambiguity: VCs are a serialized set of claims with some form of integrity proof. What is in the VC is a set of statements, represented as claims. Those statements are the credential whose authenticity, authorship, and timeliness is "verifiable".

We can't get away from the essential ambiguity with real-world credentials when people attempt to model them as VCs, but at least within the specification and our own work, there is a clear definition. Perhaps misunderstood by many, but still a definition within the context of Verifiable Credentials.

mwherman2000 commented 1 year ago

Credentials become VCs when proofs are added.

@jandrieu If I read this statement literally, it is indeed unfortunate for regular people because it says that any particular set of claims (only) is not considered a Credential. It implies a Credential is only a Credential if it has a bunch of VC decorations surrounding the set of claims.

It says the following is not a credential (which is contrary to everyday usage).

image

David-Chadwick commented 1 year ago

@jandrieu Thank you for pointing out the differences in our mental models. But I am having difficulty understanding your mental model when you write:

Credentials become VCs when proofs are added. They do not retain a separate existence, identifier, or other metadata. The credential BECOMES the VC

Your first sentence is exactly the same as my mental model. Your second sentence is where we diverge and I have difficulty comprehending it. In our implementation the credential does have its own existence. The RP receives the VP from the wallet, passes it to our backend verifier service, which verifies the VP and VC and returns the credential to the RP for it to process at the application level. The proof has gone. The RP is not interested in it. (If the verification fails the RP is given nothing except an error code because we cannot believe anything that the VP/VC says.) So I would be interested to learn how your implementation implements your mental model.
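The flow described above can be sketched roughly as follows in Python. This is not the actual backend; `verify_and_extract_credential`, the `check_proof` callback and the `verification_failed` error code are hypothetical names used only to show the shape of the behaviour: the relying party gets the credential without its proof on success, and nothing but an error code on failure.

```python
from typing import Callable


def verify_and_extract_credential(
    vc: dict, check_proof: Callable[[dict], bool]
) -> dict:
    """Return the proof-less credential if the VC verifies, else only an error code."""
    if "proof" not in vc or not check_proof(vc):
        # Nothing from an unverified VC is passed on to the relying party.
        return {"error": "verification_failed"}
    credential = {key: value for key, value in vc.items() if key != "proof"}
    return {"credential": credential}


# Stand-in VC and proof checkers; a real verifier would run a proof suite here.
sample_vc = {
    "id": "http://example.edu/credentials/3732",
    "credentialSubject": {"degree": "example"},
    "proof": {"type": "HypotheticalProof", "proofValue": "placeholder"},
}

accepted = verify_and_extract_credential(sample_vc, check_proof=lambda vc: True)
rejected = verify_and_extract_credential(sample_vc, check_proof=lambda vc: False)

print(accepted)
print(rejected)
```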

TallTed commented 1 year ago

[@jandrieu] Credentials become VCs when proofs are added.

[@mwherman2000] If I read this statement literally, it is indeed unfortunate for regular people because it says that any particular set of claims (only) is not considered a Credential. It implies a Credential is only a Credential if it has a bunch of VC decorations surrounding the set of claims.

Apparently you read a different English than I do, one that is neither US, UK, nor CA-localized, and one which I am fairly confident no-one else on this thread shares.

Fortunately for you, the rest of us, and all sorts of "regular people", @jandrieu said "[Non-Verifiable] Credentials become [Verifiable Credentials] when proofs are added."

In other words, [Non-Verifiable] Credentials are [Non-Verifiable] Credentials before/until they get a Proof (the "Verifiable Credential decorations", as you call them), at which point they become Verifiable Credentials.

AFlowOfCode commented 1 year ago

I was also hoping for some clarity on the top-level verifiableCredential.id property as mentioned in the first post. In my case it seems to contradict the herd privacy provided by a status list's bit array approach.

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.
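For context, here is a rough Python sketch of why a status list check on its own reveals little. It assumes the list is a GZIP-compressed, base64url-encoded bitstring in which index 0 is the left-most bit; the bit order, the 16 KB list size and the index value are assumptions for illustration, and `encode_list` and `is_set` are made-up helper names. The verifier downloads the whole list and looks up the bit locally, so the hosting server does not see which index was of interest. A request to a per-credential `id` URL, by contrast, names the credential directly.

```python
import base64
import gzip


def encode_list(bits: bytes) -> str:
    """Compress and base64url-encode a raw bitstring, as an issuer might publish it."""
    return base64.urlsafe_b64encode(gzip.compress(bits)).decode("ascii")


def is_set(encoded_list: str, index: int) -> bool:
    """Decode the whole list locally and test a single bit (index 0 = left-most bit)."""
    raw = gzip.decompress(base64.urlsafe_b64decode(encoded_list))
    byte_position, bit_offset = divmod(index, 8)
    return bool(raw[byte_position] & (0x80 >> bit_offset))


# Issuer side: a 16 KB list (131,072 entries) with one hypothetical revoked index.
revoked_index = 94567
bitstring = bytearray(16 * 1024)
bitstring[revoked_index // 8] |= 0x80 >> (revoked_index % 8)
published = encode_list(bytes(bitstring))

# Verifier side: one download covers every credential in the list.
print(is_set(published, revoked_index))
print(is_set(published, revoked_index - 1))
```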

After reading through this issue I still do not understand why I would want a URI ID for a specific VC that was issued. It would make more sense to me if it identified a type of credential of which multiple holder-specific copies could be issued, but then again that seems to be covered by credentialSchema.

I can see why an issuer may want to assign a unique ID if it is maintaining knowledge of VCs it has issued, just like it would do with any database record, but I don't quite understand why that needs to be a public URI.

brentzundel commented 1 year ago

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.

This concern is precisely the reason the verifiableCredential.id property is not required.

AFlowOfCode commented 1 year ago

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.

this concern is precisely the reason the verifiableCredential.id property is not required.

It's certainly reasoning that argues against using it, but not reasoning that contributes to understanding a proper use case. Yet if it's indeed precisely the reason, may I humbly suggest describing said reasoning explicitly in the VC Data Model spec?

Nowhere does it mention that a URI verifiableCredential.id should be omitted if implementing a status list since they are inherently in conflict (by increasing correlatability) when simultaneously implemented in this way. Only general correlation maxims are stated as a reason against its use, but this is a concrete example that could potentially be overlooked or at minimum increase confusion about a URI ID's use case (as it has for me). I did actually consider using the StatusList approach for this ID property as well, yet in the end I saw no useful reason to bother.

Nevertheless, it is beside the point of my contribution to this issue. Though that specific reasoning may be obvious to implementers and spec authors (even if it remains unstated in the spec), it does not clarify the questions brought up in this issue. It rather adds fuel to the logic behind the necessity of an issue which attempts to clarify appropriate usage of this property in the first place.

Personally I still have no understanding of why it would ever need to be a URI identifying a unique credential. But since one can easily exercise the "not required" nature of the property, it becomes something of a throwaway subject.

RieksJ commented 1 year ago

If the id is a URI specific to one particular VC and a server sees a call to that URI followed by a call to a status list it seems to me those 2 calls could justifiably be correlated to a verifier checking on a particular holder. I was under the impression that was intended to be avoided by the use of the status list format. Therefore since the ID property is optional I have removed it.

That's only if you assume that every claim in the VC is about the same entity and that entity is holding the VC. When I read the VCDM, it (explicitly) states that neither of these assumptions is generally true. That's why a credential-id is not the same as a subject-id.