batch issuing SD-JWTs without duplicating plain text claim values....?

oauth-wg / oauth-selective-disclosure-jwt

https://datatracker.ietf.org/doc/draft-ietf-oauth-selective-disclosure-jwt/

batch issuing SD-JWTs without duplicating plain text claim values....? #329

Closed Sakurann closed 6 months ago

Sakurann commented 1 year ago

The issue comes from the mso_mdoc world. The issued credential contains claim values that are huge in size, like portrait images, so when the issuer issues many copies of the same credential to enable verifier unlinkability, it has to duplicate something big, which also increases the size of the credentials the wallet needs to store.

I think this is possible with SD-JWTs only if the issuer defines a version of not-full "disclosures" that does not contain the plaintext claim value, so that when the wallet sends the full SD-JWT to the verifier, it has to reconstruct the full disclosures from the not-full disclosures and the plaintext claim value it has received from elsewhere...
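A minimal sketch of what that reconstruction could look like on the wallet side. The `partial` structure is hypothetical (nothing like it is defined in the draft), the compact JSON serialization is an assumption that issuer and wallet would have to share, and `"abc...123"` is just the placeholder value used elsewhere in this thread. The only parts taken from SD-JWT itself are the disclosure shape (base64url of the JSON array `[salt, name, value]`) and the SHA-256 digest over the disclosure string.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as used for SD-JWT disclosures and digests."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_disclosure(salt: str, name: str, value) -> str:
    # A disclosure is the base64url encoding of the JSON array [salt, name, value].
    # The exact bytes matter, so both sides must agree on the serialization;
    # the compact separators used here are an assumption of this sketch.
    payload = json.dumps([salt, name, value], separators=(",", ":"))
    return b64url(payload.encode("utf-8"))


def disclosure_digest(disclosure: str) -> str:
    # The digest placed in the signed SD-JWT is computed over the disclosure string.
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


# Issuer side: one shared portrait, a fresh salt per credential copy.
portrait = "abc...123"  # placeholder value
full = make_disclosure("salt-for-copy-1", "portrait", portrait)
signed_digest = disclosure_digest(full)  # this is what ends up under the signature

# Hypothetical "not-full" disclosure: everything except the plaintext value.
partial = {"salt": "salt-for-copy-1", "name": "portrait"}

# Wallet side: rebuild the full disclosure from the partial one plus the value
# received separately, then check it against the signed digest.
rebuilt = make_disclosure(partial["salt"], partial["name"], portrait)
assert disclosure_digest(rebuilt) == signed_digest
```

The check only passes if the wallet reproduces the issuer's serialization byte for byte, which is the part that would need to be specified.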

related to this OID4VCI issue in Bitbucket: https://bitbucket.org/openid/connect/issues/2003/de-duplication-of-unsigned-data-when

alenhorvat commented 1 year ago

This is related to how the issued credentials are structured. E.g., if name, surname, and photo are plaintext and you issue 10 SD-JWTs, I guess the model could be:

{
  "plain": {
    "name": "Alice",
    "surname": "Wonderland",
    "photo": "abc...123"
  },
  "hidden": [
    {
      here credentials contain the hidden values + json path to the plain text claim
    }, ...
  ]
}

The issue is that JSON serialization is not deterministic (key order and whitespace can vary between encoders), so it will be very hard to reproduce the exact structure you need to verify the signature without using JCS (RFC 8785) or the approach introduced in JAdES, where an additional JSON object defines which elements are used to reproduce the signature.
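To illustrate the serialization problem, here is a small sketch using only the Python standard library. The `canonical` helper is a rough stand-in for canonicalization and is explicitly not a full JCS (RFC 8785) implementation.

```python
import hashlib
import json

claims = {"name": "Alice", "surname": "Wonderland"}

# Two serializations of the same JSON value: different key order and whitespace.
a = json.dumps(claims)
b = json.dumps({"surname": "Wonderland", "name": "Alice"}, separators=(",", ":"))

same_data = json.loads(a) == json.loads(b)
same_hash = hashlib.sha256(a.encode()).digest() == hashlib.sha256(b.encode()).digest()
assert same_data and not same_hash  # equal data, different bytes to sign over


def canonical(value) -> bytes:
    # Sorted keys and no insignificant whitespace. This is NOT full JCS:
    # its number formatting and string escaping rules are not handled here.
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


# After canonicalization both inputs yield the same bytes.
assert canonical(json.loads(a)) == canonical(json.loads(b))
```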

Another option would be to split the VC into 2 parts, plain + hidden, and sign over something like b64 plain + b64 hidden, or any other deterministic structure.
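A sketch of the split idea, under several assumptions: the `"."` separator, the `copy` field in the hidden part, and the dictionary layout of each batch item are all hypothetical, and HMAC-SHA256 is used only as a dependency-free stand-in for the issuer's signature (a real issuer would use an asymmetric algorithm).

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(key: bytes, signing_input: str) -> str:
    # HMAC-SHA256 stands in for the issuer's signature in this sketch.
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256)
    return b64url(mac.digest())


def verify(key: bytes, plain_b64: str, item: dict) -> bool:
    # Verification never re-serializes JSON: it only concatenates received strings.
    expected = sign(key, plain_b64 + "." + item["hidden"])
    return hmac.compare_digest(expected, item["sig"])


key = b"demo-key-not-for-production"

# The plain part is encoded once and shared by every copy in the batch.
plain = {"name": "Alice", "surname": "Wonderland", "photo": "abc...123"}
plain_b64 = b64url(json.dumps(plain).encode("utf-8"))

batch = []
for i in range(3):
    # Hypothetical per-copy hidden part.
    hidden_b64 = b64url(json.dumps({"copy": i}).encode("utf-8"))
    batch.append({"hidden": hidden_b64, "sig": sign(key, plain_b64 + "." + hidden_b64)})

assert all(verify(key, plain_b64, item) for item in batch)
```

Because the signature covers the already-encoded strings, the wallet can store `plain_b64` once and pair it with each item in `batch` at presentation time.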

danielfett commented 1 year ago

I assume that the fields in the issued SD-JWTs that contain (for example) a portrait image should compress quite well when multiple SD-JWTs are compressed together in one data stream, as the base64 representation will often be the same. (When the bytes don't align, there may be two or three different representations of the portrait image in the compressed data, but it should not be more than that. Issuers can even optimize for this.)

So both for issuance as well as for storing the credentials, compression may be a good option to reduce size.
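A rough way to check this claim with the standard library. The 8 KiB random "portrait", the fixed-length salts, and the newline-joined stream are assumptions of the sketch, not anything taken from the draft.

```python
import base64
import json
import random
import zlib

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def random_bytes(n: int) -> bytes:
    return bytes(rng.getrandbits(8) for _ in range(n))


# Stand-in for a portrait: 8 KiB of incompressible bytes, base64-encoded as a claim value.
portrait = base64.b64encode(random_bytes(8192)).decode("ascii")


def disclosure(value: str) -> str:
    # Fixed-length salt, so the portrait lands at the same base64 alignment in every copy.
    salt = b64url(random_bytes(16))
    return b64url(json.dumps([salt, "portrait", value], separators=(",", ":")).encode("utf-8"))


# Ten copies of the same claim with different salts, compressed as one stream.
copies = [disclosure(portrait) for _ in range(10)]
raw = "\n".join(copies).encode("ascii")
packed = zlib.compress(raw, 9)

print(len(raw), len(packed))
assert zlib.decompress(packed) == raw
```

One caveat when trying this with real data: DEFLATE (zlib, gzip) only looks back 32 KiB, so repeats of a portrait whose encoded form is larger than that will not be found. A compressor with a larger window or dictionary (for example Python's `lzma` module) may be a better fit in that case.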

Sakurann commented 1 year ago

I have been thinking about this a little, and I think the simplest way to do this with SD-JWT, without any structural changes, is to define a string ("dehydrated", for example) that takes the place of a plaintext claim value in a dehydrated credential ("dehydrated" is an ISO term; it's basically a credential that does not contain the full plaintext value). We could add one paragraph on this in the privacy considerations section, where we talk about unlinkability....
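A minimal sketch of how such a placeholder could work on decoded disclosures, assuming each one is a `[salt, name, value]` array. The `dehydrate` and `hydrate` helpers and the way the shared values are stored are hypothetical, and `"dehydrated"` is only the example string from this comment.

```python
PLACEHOLDER = "dehydrated"  # example string only; not a defined value


def dehydrate(disclosures, big_claims):
    """Replace the values of the named claims with the placeholder, keeping one copy aside."""
    shared = {}
    slim = []
    for salt, name, value in disclosures:
        if name in big_claims:
            shared[name] = value
            value = PLACEHOLDER
        slim.append([salt, name, value])
    return slim, shared


def hydrate(slim, shared):
    """Put the shared values back before presenting to a verifier."""
    return [
        [salt, name, shared[name] if value == PLACEHOLDER and name in shared else value]
        for salt, name, value in slim
    ]


photo = "abc...123"  # placeholder value
batch = [
    [[f"salt-{i}-a", "name", "Alice"], [f"salt-{i}-b", "photo", photo]]
    for i in range(3)
]

shared = {}
slim_batch = []
for copy in batch:
    slim, values = dehydrate(copy, {"photo"})
    shared.update(values)  # the large value is kept once for the whole batch
    slim_batch.append(slim)

assert [hydrate(slim, shared) for slim in slim_batch] == batch
```

The digests in the signed SD-JWT are computed over the full disclosures, so the wallet would still have to re-encode each hydrated disclosure exactly as the issuer did.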

danielfett commented 1 year ago

This would effectively be a highly specialized compression algorithm that needs to be specified and implemented specifically for this credential format. The readily available alternative is to just use gzip (or whatever your preferred compression algorithm is) which will not only do this for you, but also compress your other data.

bc-pi commented 11 months ago

Re "related to this OID4VCI issue in Bitbucket: https://bitbucket.org/openid/connect/issues/2003/de-duplication-of-unsigned-data-when":

That issue is now https://github.com/openid/OpenID4VCI/issues/54, after the migration of some of the OID4VC work to GitHub.

bc-pi commented 11 months ago

But I agree with @danielfett that spec'ing something here would likely "effectively be a highly specialized compression algorithm that needs to be specified and implemented specifically for this credential format". It would also need to define something across SD-JWT instances (where the draft currently defines just a single instance) to refer to the same plaintext claim value, account for the integrity of that value, consider the linkability implications of how it's referenced, etc. I think this is a potential optimization that SD-JWT should not attempt to tackle.

Sakurann commented 6 months ago

@cobward, is this still important to the issuance of mdocs and/or ISO/IEC 23220-3..?

cobward commented 6 months ago

Currently, 23220-3 uses a CBOR representation of the credential data with optionally included plain text claims in the credential response. No specific work is needed here to support that.