openid / OpenID4VCI


Supporting Credential Versioning #278

Open tplooker opened 6 months ago

tplooker commented 6 months ago

Problem Statement

Within OpenID4VCI, there are multiple situations which may result in a wallet needing to request more than one copy of a specific credential, for example:

The challenge to solve, I would posit, is this: how does a wallet determine whether two copies of a credential are equivalent, e.g. that they have consistent claims / claim values?

Why? Because without such a mechanism defined, wallets will find it difficult to adequately manage credentials in situations where they have multiple copies.

One Possible Solution - Leave it to the wallet

A wallet, having made two separate credential requests to an issuer and having obtained two copies of the same credential, could simply inspect each credential and try to determine whether they are equivalent. This would involve something like:

The Problems with this approach

Another possible solution - Issuer manages the credential version (Preferred Solution)

Instead of the wallet having to determine whether two copies of a credential are of “the same version”.

  1. Changes to credential endpoint response - Add the ability for issuers to return a version identifier in the credential response from the credential endpoint.
  2. New credential updates endpoint - Add a new “.../credential/updates” endpoint for issuers to implement that enables a wallet to query an issuer to check if the version of credential it has is the latest.

This solution means the wallet can simply store the “version_id” alongside an obtained credential and use it to compare with other copies to determine whether they are equivalent.

Changes to the credential endpoint response

Add a new “version_id” attribute with a string value, returned alongside the issued credential in the credential response. The issuer is free to manage this identifier however it sees fit (e.g. compute it on the fly or store it). The wallet should treat the identifier as opaque, meaning it shouldn’t assume any particular structure or form to its value beyond it being a string.

From

{
   "format": "jwt_vc",
   "credential" : "LUpixVCWJk0eOt4CXQe1NXK....WZwmhmn9OQp6YxX0a2"
}

To

{
   "version_id": "9d389vc3u4948vj4",
   "format": "jwt_vc",
   "credential" : "LUpixVCWJk0eOt4CXQe1NXK....WZwmhmn9OQp6YxX0a2"
}
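A minimal sketch of how a wallet might keep the proposed "version_id" next to a stored credential and compare two copies. The StoredCredential class and the helper names are hypothetical and not part of OpenID4VCI; the credential strings are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCredential:
    """A credential as a wallet might store it (hypothetical structure)."""
    credential: str   # the issued credential, opaque to this sketch
    format: str       # e.g. "jwt_vc"
    version_id: str   # opaque string taken from the credential response


def store_from_response(response: dict) -> StoredCredential:
    """Keep the version_id alongside the credential rather than inside it."""
    return StoredCredential(
        credential=response["credential"],
        format=response["format"],
        version_id=response["version_id"],
    )


def same_version(a: StoredCredential, b: StoredCredential) -> bool:
    """Compare by plain string equality only; no parsing, no ordering."""
    return a.version_id == b.version_id


first = store_from_response({
    "version_id": "9d389vc3u4948vj4",
    "format": "jwt_vc",
    "credential": "copy-one-placeholder",
})
second = store_from_response({
    "version_id": "9d389vc3u4948vj4",
    "format": "jwt_vc",
    "credential": "copy-two-placeholder",
})
print(same_version(first, second))
```

The two copies differ as credentials but compare as the same version, which is the only question the identifier is meant to answer.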

New credential updates endpoint

Add a new credential issuer based endpoint which is an OAuth2 protected resource, that allows a wallet to check if the version of a particular credential they have is the latest.

Example Request

POST /credential/updates HTTP/1.1
Host: server.example.com
Content-Type: application/json
Authorization: Bearer czZCaGRSa3F0MzpnWDFmQmF0M2JW

{
   "version_id": "9d389vc3u4948vj4",
   "format":"mso_mdoc",
   "doctype":"org.iso.18013.5.1.mDL"
}

Example Successful Response

HTTP/1.1 200 OK
Content-Type: application/json

{
   "version_id": "9d389vc3u4948vj4"
}

Example Error Response

HTTP/1.1 400 Bad Request
Content-Type: application/json

{
   "version_id": "ae48vh2184tghvan"
}

Note - I'm not convinced about the use of the 400 status code here.
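A rough sketch of the wallet side of this check, assuming the response body always carries the issuer's current "version_id" as in the two examples above. Because the status code is still undecided, the sketch decides from the body alone. The function names are hypothetical, and no network call is made.

```python
import json


def build_update_check(version_id: str, format_: str, doctype: str) -> str:
    """Serialize a request body shaped like the example request."""
    return json.dumps(
        {"version_id": version_id, "format": format_, "doctype": doctype}
    )


def is_latest(held_version_id: str, response_body: str) -> bool:
    """True when the issuer reports the same version the wallet holds."""
    latest = json.loads(response_body)["version_id"]
    return latest == held_version_id


held = "9d389vc3u4948vj4"
body = build_update_check(held, "mso_mdoc", "org.iso.18013.5.1.mDL")

# Bodies copied from the two example responses.
print(is_latest(held, '{"version_id": "9d389vc3u4948vj4"}'))
print(is_latest(held, '{"version_id": "ae48vh2184tghvan"}'))
```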

tplooker commented 6 months ago

For awareness this issue is tangentially related to #91, #108, #131

tplooker commented 6 months ago

Also, here is a Google Slides version of this proposal, which was presented to the DCP WG a couple of weeks ago: https://docs.google.com/presentation/d/1SI9W7JpxpG4pDemagXCP4Ba4phy1o-XneaAEfpmSKjQ/edit#slide=id.g2b74097bcd0_0_28

David-Chadwick commented 6 months ago

So why cannot the issuer insert the unique versionID into the issued credential, then the wallet can check this value to ensure that two credentials are identical (i.e. have same issuer/versionID combination). Then no new endpoint is needed. And checking by the wallet is trivial.

tplooker commented 6 months ago

So why cannot the issuer insert the unique versionID into the issued credential.

Because if the ID forms part of the credential, then it is likely disclosable to a relying party, which creates a potentially harmful source of correlation.

then the wallet can check this value to ensure that two credentials are identical (i.e. have same issuer/versionID combination)

Just checking, why can't the wallet perform this checking using the approach I proposed? The only difference between our approaches is that in my proposal the ID is returned alongside the credential in the credential response, rather than in the credential itself, for the reason I identified above. Therefore this comparison check is the same for both of our proposals?

Then no new endpoint is needed.

The version ID for a credential allows a wallet to determine if two credentials are of equivalent versions. It does not solve how the wallet checks with an issuer whether the version it currently has is the latest version available.

I believe you are proposing that when the wallet goes to check for updates for a credential with an issuer, instead of my approach, they can simply send a new credential request, get the credential in response, along with the version ID and compare it to a copy they have obtained previously to determine if the version has changed. Is that accurate?

If so, the biggest drawback with that approach is that it requires the issuer to issue a credential every time the wallet wants to check for updates, and where there were no updates, the credential that was issued might be redundant. To avoid this, I believe we need a way for a wallet to say to an issuer "hey, for this credential I've got this version, are there any updates?" and get a simple yes/no response instead of a credential being issued.

David-Chadwick commented 6 months ago

If it's an SD credential then the versionID won't have to be disclosed. If it's a plain credential then surely the signature will form a correlating handle between two identical credentials. So correlation cannot be an argument against inserting the versionID in the credential. (Note there might also be a credential ID in the credential as well, which is another correlating handle.)

I believe you are proposing that when the wallet goes to check for updates for a credential with an issuer, instead of my approach, they can simply send a new credential request, get the credential in response, along with the version ID and compare it to a copy they have obtained previously to determine if the version has changed. Is that accurate?

Yes. Most credentials will only rarely or never be updated, e.g. change of name, employer, role etc. In these cases the user will know that they have been updated, and can initiate the issue new credential command when required. For other cases of rapidly changing data, e.g. bank account balance, the user will always ask for a new credential. So I do not buy your argument for a new endpoint. I think requesting a new credential to be issued is good enough.

tplooker commented 6 months ago

If it's an SD credential then the versionID won't have to be disclosed

Can we agree that the versionID is a piece of information about a credential that is only ever intended to be used and available to a wallet and not a relying party? If so why put this information in a credential? It just puts it at risk of being accidentally disclosed to a relying party.

As you highlight too, your solution relies upon the credential where the version ID is present needing to support selective disclosure, which isn't reliable or universally true for the credential formats supported by OpenID4VCI.

If it's a plain credential then surely the signature will form a correlating handle between two identical credentials. So correlation cannot be an argument against inserting the versionID in the credential. (Note there might also be a credential ID in the credential as well, which is another correlating handle.)

IMO this isn't correct: there are multiple ecosystems now that issue multiple copies of the same credential with a different issuer signature on each copy, so that each copy can be used uniquely with a relying party without fear of correlation via the issuer's signature. If you re-introduce the versionID into the credential here, you destroy this privacy feature.

Yes. Most credentials will only rarely or never be updated, e.g. change of name, employer, role etc. In these cases the user will know that they have been updated, and can initiate the issue new credential command when required. For other cases of rapidly changing data, e.g. bank account balance, the user will always ask for a new credential. So I do not buy your argument for a new endpoint. I think requesting a new credential to be issued is good enough.

There appear to be a huge number of assumptions being made here about behaviour. Attempting to bucket credentials into two categories with regard to updates, 1) credentials with changes that users "just know about" and 2) credentials that change so often that updates are always going to exist, seems overly reductive. I believe a majority of credentials won't fall into either category, and users will expect their wallet to simply keep their credentials up to date. For this not to be an overly expensive function to perform, there needs to be a way for a wallet to ask an issuer whether there are any updates without possibly getting a redundant copy of a credential that has no updates.

nemqe commented 6 months ago

One Possible Solution - Leave it to the wallet

I am of the opinion that Issuers cannot be trusted, and that a Wallet should in all cases ensure that the Issuer is returning correct credentials that can be used for presentation.

There could be cases where issuers have bugs in their systems that lead to credentials with wrong data. A simple case would be one where we receive a credential that was meant for somebody else; another would be a batch issuance where an Issuer had cross-contamination that led to the Wallet receiving 8 identical credentials with correct data and 2 identical credentials with incorrect data.

I am also afraid of cases where Issuers might act in bad faith and issue credentials with bad data, which in a batch issuance case might translate to issuing non-identical credentials under the same version identifier.

This is why I think the Wallet cannot really avoid doing some of the logic described in this section. But this is not entirely related to the ability to group credentials, which might be the bigger focus here.
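A sketch of the kind of check a Wallet could still run on a batch: flag any version identifier whose credentials do not all carry the same claims. It assumes the claims have already been decoded into flat dictionaries of strings; the "version_id" and "claims" keys and the function name are hypothetical.

```python
from collections import defaultdict


def inconsistent_versions(batch: list) -> set:
    """Return the version_ids whose credentials disagree on claims."""
    claims_by_version = defaultdict(set)
    for item in batch:
        # Sorted (name, value) pairs give a hashable, order-independent key.
        fingerprint = tuple(sorted(item["claims"].items()))
        claims_by_version[item["version_id"]].add(fingerprint)
    return {v for v, seen in claims_by_version.items() if len(seen) > 1}


batch = [
    {"version_id": "v1", "claims": {"given_name": "Ada", "family_name": "Smith"}},
    {"version_id": "v1", "claims": {"family_name": "Smith", "given_name": "Ada"}},
    {"version_id": "v1", "claims": {"given_name": "Bob", "family_name": "Jones"}},
    {"version_id": "v2", "claims": {"given_name": "Ada", "family_name": "Smith"}},
]
print(inconsistent_versions(batch))
```

Here "v1" is flagged because one of its three credentials carries different claim values, while "v2" is not.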

Another possible solution - Issuer manages the credential version (Preferred Solution)

Instead of the wallet having to determine whether two copies of a credential are of “the same version”.

1. Changes to credential endpoint response - Add the ability for issuers to return a version identifier in the credential response from the credential endpoint.

Seems to me that with this approach the logic to normalize and compare credentials is moved to the Issuer, as it now needs to know if a credential is being re-issued or if data in the credential has changed so that an appropriate version_id can be re-used or generated.

2. New credential updates endpoint - Add a new “.../credential/updates” endpoint for issuers to implement that enables a wallet to query an issuer to check if the version of credential it has is the latest.

One thing that is unclear to me here is the definition of latest.

Are we talking about the version of the credential that IS issued, or that COULD be issued? And if it is the COULD variant, the question I have is: compared to what? Would there be expected semantics if, for example, I have IDCardV1 and IDCardV2 that are both valid documents?

David-Chadwick commented 6 months ago

I believe a majority of credentials won't fall into either category

Can you give me lots of examples please. You obviously must have them if they are in the majority.

tlodderstedt commented 6 months ago

I'm missing another option to cope with the multiple credential instances option, which is the explicit differentiation between the credential data set (the data the issuer maintains) and the credential instances (different credentials based on the same data set with different device binding keys and so on).

See #91 and #155 for a discussion of the concepts. I'm in favor of evolving VCI in that direction for managing credential instances.

The proposal made in this issue could (on top or on its own) be used to manage updates for credential instances.

babisRoutis commented 6 months ago

I'm missing another option to cope with the multiple credential instances option, which is the explicit differentiation between the credential data set (the data the issuer maintains) and the credential instances (different credentials based on the same data set with different device binding keys and so on).

See #91 and #155 for a discussion of the concepts. I'm in favor of evolving VCI in that direction for managing credential instances.

The proposal made in this issue could (on top or on its own) be used to manage updates for credential instances.

Although VCI provides a solid definition of a credential type (credential_configuration_id) I think that the term credential instance can have multiple interpretations.

Two credential instances share the same credential_configuration_id and are being issued to the same holder; that's the easy part. We can have multiple such instances because those instances:

Out of these three options, to my understanding VCI

Perhaps, a new endpoint provided by the issuer could have a broader use:

David-Chadwick commented 6 months ago

I believe that case 3 should be covered by the credential schema. The schema describes the exact contents, whereas the type does not. Multiple credentials of the same type can have different schemas. If you remember I raised this as an issue over a year ago.

babisRoutis commented 6 months ago

I believe that case 3 should be covered by the credential schema. The schema describes the exact contents, whereas the type does not. Multiple credentials of the same type can have different schemas. If you remember I raised this as an issue over a year ago.

Perhaps I cannot express correctly the case due to my English. Apologies.

In the context of #279 I tried to provide some examples. There are cases where there is a single credential_configuration_id, that is, the same format, the same credential type, the same set of possible credential attributes and hence the same underlying schema.

Let me provide you another example: let's assume that there is a VC representing that I am the owner of a bank account (single IBAN). VCI doesn't cover the case of having an offer for my bank accounts. Offering just a bank account owner credential_configuration_id is not enough, because then the wallet doesn't know how many issuance requests it needs to place.

For me this is an example of having credential instances representing the same credential_configuration_id and the same holder, but different claim values (the difference is in the IBAN).

Changing the VC to represent a list of claims (a list of IBANs in the example) is not always desirable.

David-Chadwick commented 6 months ago

So case 3 should be described as "contain the same claim types but different claim values". Is that correct? This could also apply to different credit cards (e.g. type=CC, brand=Visa or Mastercard, number=)

Then case 4 would become, "contain different claims"

babisRoutis commented 6 months ago

So case 3 should be described as "contain the same claim types but different claim values". Is that correct? This could also apply to different credit cards (e.g. type=CC, brand=Visa or Mastercard, number=)

Then case 4 would become, "contain different claims"

Yes David.

tlodderstedt commented 6 months ago

What is basically missing as of now is a differentiation between the data an issuer has about a user and the different credential instances (with different device keys) with this data. The user needs to understand the first concept (called credential data set in PR #115) and the credential instances. The credential data set is the level the user consent operates on, the instances are just a technical construct for unlinkability.

So the hierarchy in my mind is "credential schema/type" -> "credential configuration" -> "credential data set" -> "credential instance".
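One way to picture this hierarchy as a data model, purely as an illustrative sketch. All class and field names other than credential_configuration_id are hypothetical, and nothing here is defined by OpenID4VCI.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CredentialInstance:
    """One issued credential bound to one device key (for unlinkability)."""
    device_key_id: str
    credential: str


@dataclass
class CredentialDataSet:
    """The data the issuer holds about the user; consent operates here."""
    data_set_id: str  # hypothetical identifier
    claims: Dict[str, str]
    instances: List[CredentialInstance] = field(default_factory=list)


@dataclass
class CredentialConfiguration:
    credential_configuration_id: str
    data_sets: List[CredentialDataSet] = field(default_factory=list)


@dataclass
class CredentialType:
    name: str
    configurations: List[CredentialConfiguration] = field(default_factory=list)


data_set = CredentialDataSet(
    data_set_id="ds-1",
    claims={"family_name": "Doe"},
    instances=[
        CredentialInstance(device_key_id="key-1", credential="placeholder-1"),
        CredentialInstance(device_key_id="key-2", credential="placeholder-2"),
    ],
)
config = CredentialConfiguration("example_config", data_sets=[data_set])
cred_type = CredentialType("ExampleCredential", configurations=[config])

# The user sees one data set; its two instances are a technical detail.
print(len(config.data_sets), len(data_set.instances))
```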

tplooker commented 6 months ago

Can you give me lots of examples please. You obviously must have them if they are in the majority.

To be honest I'm struggling to think of any examples which I believe fit your categories. For the examples you gave, "change of name, employer, role etc", I think it's highly unlikely the user will "just know" when the credential update is available. Most name changes, for example, require an administrative process where the user has no idea when it will complete, and is instead notified by the issuer when the update is processed. The check updates endpoint I propose is just one such way to actually do this inside the protocol. I think leaving this out of band and assuming the user will "just magically know" when an update is available is limiting and leads to an unhelpful UX.

tplooker commented 6 months ago

I'm missing another option to cope with the multiple credential instances option, which is the explicit differentiation between the credential data set (the data the issuer maintains) and the credential instances (different credentials based on the same data set with different device binding keys and so on).

I fully agree this data set distinction is missing and likely requires a separate identifier.

tplooker commented 6 months ago

Out of these three options, to my understanding VCI

@babisRoutis I agree with your assessment here and I think your case 3 is related to what Torsten is proposing a solution for by introducing the notion of a credential data set.

tplooker commented 6 months ago

I believe that case 3 should be covered by the credential schema. The schema describes the exact contents, whereas the type does not. Multiple credentials of the same type can have different schemas. If you remember I raised this as an issue over a year ago.

I disagree. I think case 3 is talking about, say, issuing a birth certificate for two different people: same claim names, different subjects and therefore different claim values. In this case the schema is identical, because schemas generally describe the shape of data, not its precise contents.

jogu commented 6 months ago

There's a divergence arising because we need to be more precise in the language; "claims" is ambiguous, we need to be clear if we mean claim names or claim values - hence I believe the cases are (switching to letters to avoid any confusion with the numbers already in use above):

A. Contain the same claim names & values but different device binding keys or different issued at values/other metadata
B. Contain the same claim names but different claim values because a claim value has changed (e.g. name change due to getting married, or frequent flyer status level has changed, or driving entitlements have been removed/added)
C. Contain the same claim names but different claim values because the credential is about a different person/thing (e.g. a parent holding birth certificates for a parent & son, or someone holding multiple degrees from the same university)
D. Contain different claim names because they use different credential schemas (currently primarily handled by using different credential_configuration_id)

I believe any changes for "D" is out of scope for this issue. (I'm not making any judgement on the value of making changes there, but we should keep that discussion separate.)
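The four cases can be sketched as a small classifier over two decoded credentials. This assumes each credential is reduced to a "subject" value and a flat "claims" dictionary; both keys are hypothetical, and in particular a reliable subject identifier is an assumption made here for illustration.

```python
def classify(a: dict, b: dict) -> str:
    """Return "A", "B", "C" or "D" for a pair of decoded credentials."""
    if set(a["claims"]) != set(b["claims"]):
        return "D"  # different claim names
    if a["claims"] == b["claims"]:
        return "A"  # same names and values; only keys/metadata may differ
    if a["subject"] != b["subject"]:
        return "C"  # same names, different values, different person/thing
    return "B"      # same names, same subject, a claim value has changed


base = {"subject": "alice", "claims": {"family_name": "Smith"}}
renamed = {"subject": "alice", "claims": {"family_name": "Jones"}}
other = {"subject": "bob", "claims": {"family_name": "Brown"}}
extra = {"subject": "alice", "claims": {"family_name": "Smith", "age": "30"}}

for candidate in (dict(base), renamed, other, extra):
    print(classify(base, candidate))
```

The order of the checks matters: claim names are compared first, so a pair is only ever judged B or C once D and A have been ruled out.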

David-Chadwick commented 6 months ago

To be honest I'm struggling to think of any examples which I believe fit your categories.

I gave you one: a credential containing bank account details that included the current balance.

For the examples you gave, "change of name, employer, role etc", I think it's highly unlikely the user will "just know" when the credential update is available. Most name changes, for example, require an administrative process where the user has no idea when it will complete, and is instead notified by the issuer when the update is processed. The check updates endpoint I propose is just one such way to actually do this inside the protocol. I think leaving this out of band and assuming the user will "just magically know" when an update is available is limiting and leads to an unhelpful UX.

Polling by the wallet is not a sensible way to check when an administrative process has finished. It's much better that the administrator notifies the user out of band when this has been completed, so that the wallet can fetch the revised credential. So the new endpoint is not needed.

David-Chadwick commented 6 months ago

Unless of course you can provide at least one example of a credential that fits your category of "the majority of credentials" that really need this new endpoint.

jogu commented 6 months ago

It might be sensible to treat the 'New credential updates endpoint' part as a separate issue that we tackle after we've solved versioning in general.

(I think we discussed in a call somewhere that some issuers won't be happy with having an OAuth protected API that's frequently called by wallets, and there are other possible mechanisms that might be more scalable, like using a bit in a status list to indicate that an update is available.)
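To make the status list idea concrete, here is a minimal sketch of reading one "update available" bit per credential from a byte string. The layout is an assumption for illustration only (one bit per credential index, least significant bit first within each byte) and is not taken from any status list specification.

```python
def update_available(status_bits: bytes, index: int) -> bool:
    """Read the bit at position `index` (LSB-first within each byte)."""
    byte_index, bit_offset = divmod(index, 8)
    return bool((status_bits[byte_index] >> bit_offset) & 1)


# A two-byte list covering 16 credentials; bits 2 and 8 are set.
status_bits = bytes([0b00000100, 0b00000001])

print(update_available(status_bits, 2))
print(update_available(status_bits, 3))
print(update_available(status_bits, 8))
```

A list like this is a static file, so the wallet would fetch it without presenting an access token and look up only its own index.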

paulbastian commented 6 months ago

There's a divergence arising because we need to be more precise in the language; "claims" is ambiguous, we need to be clear if we mean claim names or claim values - hence I believe the cases are (switching to letters to avoid any confusion with the numbers already in use above):

A. Contain the same claim names & values but different device binding keys or different issued at values/other metadata
B. Contain the same claim names but different claim values because a claim value has changed (e.g. name change due to getting married, or frequent flyer status level has changed, or driving entitlements have been removed/added)
C. Contain the same claim names but different claim values because the credential is about a different person/thing (e.g. a parent holding birth certificates for a parent & son, or someone holding multiple degrees from the same university)
D. Contain different claim names because they use different credential schemas (currently primarily handled by using different credential_configuration_id)

I believe any changes for "D" is out of scope for this issue. (I'm not making any judgement on the value of making changes there, but we should keep that discussion separate.)

I would not differentiate between B and C, or at least I think there is not enough value to justify the efforts. Having the distinction between the remaining three levels is what is described in issue #91.

tplooker commented 6 months ago

I gave you one: a credential containing bank account details that included the current balance.

Right, I believe the logic here is also flawed: you are assuming all bank accounts have significant enough transactional activity that every single time a wallet goes to fetch an update, which could be multiple times a day, something has changed. Most people have multiple bank accounts, many of which, such as savings accounts or a backup credit card with another bank, see no transactions for days, potentially weeks, on end.

What I'm trying to say in general is making the case that credentials are either always updating OR that some out of band channel will always exist to communicate that an update is available, is simply not a reliable assumption.

Polling by the wallet is not a sensible way to check when an administrative process has finished. It's much better that the administrator notifies the user out of band when this has been completed, so that the wallet can fetch the revised credential. So the new endpoint is not needed.

Apps commonly check with a server about whether any state has changed that they should know about, how is this any different? This often happens on app startup, when the user foregrounds the app after a period of inactivity or when the user performs some form of action in the wallet, such as pull down to refresh on a home screen where all the credentials are displayed. There is also nothing stopping us from adding a push notification channel in the future to signal to a wallet when an update is available, making this check even more efficient.

Unless of course you can provide at least one example of a credential that fits your category of "the majority of credentials" that really need this new endpoint.

What I'm saying is I believe all credentials, including the ones you gave as examples actually fit better into the updates model I proposed.

tplooker commented 6 months ago

It might be sensible to treat the 'New credential updates endpoint' part as a separate issue that we tackle after we've solved versioning in general.

I agree. Although I feel it was important to raise these concepts together, as they have a strong relationship to one another, I'm OK with splitting the issue now that we have had the conversation.

(I think we discussed in a call somewhere that some issuers won't be happy with having an OAuth protected API that's frequently called by wallets, and there are other possible mechanisms that might be more scalable, like using a bit in a status list to indicate that an update is available.)

I think the issue here is that even if we don't support this check updates endpoint, the credential endpoint is still a protected API which wallets could elect to call frequently. I'm of the opinion that the issuer would be able to make the check updates endpoint highly scalable, as they could be using a status list under the hood to answer queries about a credential's current version. I don't think we should opt for a status list style checking mechanism for versions, as it is more complex for wallets and the herd privacy benefits it provides don't offer anything in this case.

tplooker commented 6 months ago

I would not differentiate between B and C, or at least I think there is not enough value to justify the efforts. Having the distinction between the remaining three levels is what is described in issue https://github.com/openid/OpenID4VCI/issues/91.

I disagree: not distinguishing makes delegation-style use cases, or any use case where holder != subject, impossible.

jogu commented 6 months ago

@paulbastian

I would not differentiate between B and C, or at least I think there is not enough value to justify the efforts. Having the distinction between the remaining three levels is what is described in issue #91.

If we take "B" as the case where a credential refresh with existing tokens results in a credential where (say) the user's name has changed due to getting married, then we already differentiate between B & C, as in C the credentials have different credential_identifiers.

We don't currently have a clear differentiation between 'A' and 'B', I think that's primarily what this issue is about.

jogu commented 6 months ago

@tplooker

(I think we discussed in a call somewhere that some issuers won't be happy with having an OAuth protected API that's frequently called by wallets, and there are other possible mechanisms that might be more scalable, like using a bit in a status list to indicate that an update is available.)

I think the issue here is that even if we don't support this check updates endpoint, the credential endpoint is still a protected API which wallets could elect to call frequently. I'm of the opinion that the issuer would be able to make the check updates endpoint highly scalable, as they could be using a status list under the hood to answer queries about a credential's current version. I don't think we should opt for a status list style checking mechanism for versions, as it is more complex for wallets and the herd privacy benefits it provides don't offer anything in this case.

We probably shouldn't have this full conversation on this issue, but I would observe that there are still big differences between an OAuth protected endpoint and a static file that can be hosted on a CDN, and I think some wallets may check revocation status in status lists anyway. It may be that we need multiple options.

babisRoutis commented 6 months ago

Is W3C VC Refresh relevant to the version support?

tplooker commented 5 months ago

Is W3C VC Refresh relevant to the version support?

IMO not really. It's quite a different approach to credential refresh, which requires presenting back an issued credential to obtain a new refreshed copy. AFAIK it doesn't deal with version management either.

tplooker commented 5 months ago

If we take "B" as the case where a credential refresh with existing tokens results in a credential where (say) the user's name has changed due to getting married, then we already differentiate between B & C, as in C the credentials have different credential_identifiers.

I'm not sure we adequately distinguish between B & C currently. If we look at the current definition for credential_identifiers:

OPTIONAL. Array of strings, each uniquely identifying a Credential that can be issued using the Access Token returned in this response. Each of these Credentials corresponds to the same entry in the credential_configurations_supported Credential Issuer metadata but can contain different claim values or a different subset of claims within the claims set identified by that Credential type. This parameter can be used to simplify the Credential Request, as defined in Section 7.2, where the credential_identifier parameter replaces the format parameter and any other Credential format-specific parameters in the Credential Request. When received, the Wallet MUST use these values together with an Access Token in subsequent Credential Requests.

In short, we don't make it clear that a different credential identifier uniquely identifies a different credential subject: it may, but it also may not.

IMO we either need a credential_subject_id style identifier to accompany a credential OR we modify the definition for a credential_identifier (I know that would be a broader change with other impacts).

tplooker commented 5 months ago

We probably shouldn't have this full conversation on this issue, but I would observe that there are still big differences between an OAuth protected endpoint and a static file that can be hosted on a CDN, and I think some wallets may check revocation status in status lists anyway. It may be that we need multiple options.

I agree it's probably a wider conversation, but I would note that the check updates endpoint as I've proposed it could utilise a CDN too, to cache responses to requests where things haven't changed. This would make the solution scalable to a similar degree without requiring wallets to process a status list containing thousands if not millions of entries pertaining to credential version changes which are entirely irrelevant to them.