eu-digital-identity-wallet / eudi-doc-architecture-and-reference-framework

The European Digital Identity Wallet
https://eu-digital-identity-wallet.github.io/eudi-doc-architecture-and-reference-framework/

Adapt PID rulebook to latest SD-JWT VC #160

Open danielfett opened 3 months ago

danielfett commented 3 months ago

The current description of the usage of SD-JWT in the PID rulebook is not sufficient to result in PID attestations that are compatible with existing conventions and standards such as JWTs and, in particular, SD-JWT VC. The changes proposed here adapt the PID rulebook to be compatible with SD-JWT VC and to move in the direction proposed in SD-JWT VC DM.

The goal is also to make the best use of SD-JWT VC Type Metadata, even though not all parts of it have made it into SD-JWT VC yet. This allows for automated discovery and processing of type metadata, such as schemas (e.g., for relying parties) and rendering information.

joelposti commented 1 month ago

The proposed claim set does not seem to include a claim that would specify the namespace/credential type eu.europa.ec.eudi.pid.1 in an issuer-agnostic way. Lack of such a claim makes it difficult for an RP to make a request where they say that they want a presentation of certain claims from a PID of any member state.

Consider the following presentation definition with which an RP requests given_name from a PID of any member state. The following request will not work without a credential_type or similar claim.

{
  "id": "foobar",
  "vp_formats": {
    "vc+sd-jwt": {
      "sd-jwt_alg_values": [
        "ES256"
      ],
      "kb-jwt_alg_values": [
        "ES256"
      ]
    }
  },
  "input_descriptors": [
    {
      "id": "given_name",
      "constraints": {
        "fields": [
          {
            "path": [
              "$.credential_type"
            ],
            "filter": {
              "const": "eu.europa.ec.eudi.pid.1"
            }
          },
          {
            "path": [
              "$.given_name"
            ],
            "filter": {
              "type": "string"
            }
          }
        ]
      }
    }
  ]
}

Also, the claim set does not include issuing_country, which prevents an RP from requesting a PID from a specific member state. The vct claim might not work for this purpose because a PID Provider might need to add a version number to the vct value. Consider the following scenario:

  1. A PID Provider issues PIDs with vct value https://memberstate.example/credential/pid and vct#integrity claim having the SHA-384 digest of the metadata document.
  2. The metadata document contains a typo in one claims[].display['de-DE'].label field. The PID Provider cannot fix this typo by modifying the metadata document, because the SHA-384 digest of the metadata document would then change, breaking the metadata integrity for all PIDs the Provider has issued so far.
  3. To fix the typo the PID Provider needs to create a new metadata document and needs to start issuing PIDs with a new different vct value https://memberstate.example/credential/pid/1.1 and a new vct#integrity value.
  4. The new PIDs issued in step 3 are not accepted by RPs because in their presentation definitions they select PIDs having vct value https://memberstate.example/credential/pid specifically.
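The four steps above can be reproduced in a few lines. This is only a minimal sketch under stated assumptions: the metadata documents and the given_name value are invented, the digest is shown as plain hex over a sorted JSON serialisation (the real vct#integrity encoding may differ), and `matches_const` is a hypothetical stand-in for an RP's `const` filter.

```python
import hashlib
import json


def sha384_digest(metadata: dict) -> str:
    # Hash a deterministic serialisation; hex output is a simplification.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha384(canonical.encode("utf-8")).hexdigest()


def matches_const(claims: dict, claim_name: str, expected: str) -> bool:
    # Stand-in for a "const" filter: plain string equality on one claim.
    return claims.get(claim_name) == expected


OLD_VCT = "https://memberstate.example/credential/pid"
NEW_VCT = "https://memberstate.example/credential/pid/1.1"

# Hypothetical metadata documents; only the label differs in content.
metadata_with_typo = {"vct": OLD_VCT, "claims": [{"display": {"de-DE": {"label": "Vornmae"}}}]}
metadata_fixed = {"vct": NEW_VCT, "claims": [{"display": {"de-DE": {"label": "Vorname"}}}]}

# Steps 1 and 3: PIDs issued before and after the fix.
old_pid = {"vct": OLD_VCT, "vct#integrity": sha384_digest(metadata_with_typo), "given_name": "Erika"}
new_pid = {"vct": NEW_VCT, "vct#integrity": sha384_digest(metadata_fixed), "given_name": "Erika"}

# Step 2: fixing the typo in place changes the digest referenced by old PIDs.
edited_in_place = {**metadata_with_typo, "claims": metadata_fixed["claims"]}
digest_changed = sha384_digest(edited_in_place) != old_pid["vct#integrity"]

# Step 4: an RP that selects on the old vct value rejects the new PIDs.
old_accepted = matches_const(old_pid, "vct", OLD_VCT)
new_accepted = matches_const(new_pid, "vct", OLD_VCT)
```

Here `digest_changed` and `old_accepted` come out true while `new_accepted` is false, which is the rejection described in step 4.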

If the RPs instead selected using credential_type and issuing_country, which do not change between metadata document versions, the old and new PIDs would both be accepted by RPs in step 4. Below is a presentation definition with which an RP requests given_name from a PID of a specific member state. The request will not work without credential_type and issuing_country claims.

{
  "id": "foobar",
  "vp_formats": {
    "vc+sd-jwt": {
      "sd-jwt_alg_values": [
        "ES256"
      ],
      "kb-jwt_alg_values": [
        "ES256"
      ]
    }
  },
  "input_descriptors": [
    {
      "id": "given_name",
      "constraints": {
        "fields": [
          {
            "path": [
              "$.credential_type"
            ],
            "filter": {
              "const": "eu.europa.ec.eudi.pid.1"
            }
          },
          {
            "path": [
              "$.issuing_country"
            ],
            "filter": {
              "const": "DE"
            }
          },
          {
            "path": [
              "$.given_name"
            ],
            "filter": {
              "type": "string"
            }
          }
        ]
      }
    }
  ]
}

danielfett commented 2 weeks ago

@joelposti You're raising a couple of very good points here!

The proposed claim set does not seem to include a claim that would specify the namespace/credential type eu.europa.ec.eudi.pid.1 in an issuer-agnostic way. Lack of such a claim makes it difficult for an RP to make a request where they say that they want a presentation of certain claims from a PID of any member state.

Consider the following presentation definition with which an RP requests given_name from a PID of any member state. The following request will not work without a credential_type or similar claim. (...)

Just as for ISO formats, the query language needs to accommodate the specifics of the credential format. Here, the logic is that a query for a base type may result in a response containing a credential of a type that extends the base type. Therefore, the query can ask for vct == urn:eu.europa.ec.eudi:pid:1 and may receive a credential with the vct https://example.bmi.bund.de/credential/pid/1.0.

Also, the claim set does not include issuing_country which prevents an RP from requesting a PID from a specific member state.

This was a mistake; I have added issuing_country to the example.

Vct might not work for this purpose because a PID Provider might need to add a version number to the vct value. Consider the following scenario:

This is another good point. I added the following updated text to the PR in order to address the problem of versioning using vct:

Domestic PID types for national attributes SHALL be defined using URLs
and extend the EU-wide PID type. It is RECOMMENDED to implement a
national base type and an extension for each version of the type. More
than one domestic PID type MAY be defined per Member State. Domestic PID
types SHALL specify in their Type Metadata any additional fields/claims
and MAY define display information.

EXAMPLE: For Germany, two Verifiable Credential Types for PIDs could be
defined initially:

 * `https://example.bmi.bund.de/credential/pid/` as the national base
   type, where in the metadata of the type, the `extends` field would
   reference the EU-wide type `urn:eu.europa.ec.eudi:pid:1`. This base
   type would not define schema or display information, as these are
   defined in the concrete versions of the type.
 * `https://example.bmi.bund.de/credential/pid/1.0` as the first version
   of the national credential type, defining in its metadata schema and
   display information. The `extends` field would reference the base
   type `https://example.bmi.bund.de/credential/pid/`.
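To make the intended matching concrete, here is a rough sketch of how an evaluating party might walk the `extends` references from the example. It assumes the Type Metadata documents are already available in memory; the `TYPE_METADATA` registry and both function names are hypothetical, and fetching and integrity checking are left out.

```python
from typing import Dict, List, Optional

EU_PID = "urn:eu.europa.ec.eudi:pid:1"
DE_BASE = "https://example.bmi.bund.de/credential/pid/"
DE_V1 = "https://example.bmi.bund.de/credential/pid/1.0"

# In-memory stand-in for Type Metadata documents that would normally be fetched.
TYPE_METADATA: Dict[str, dict] = {
    EU_PID: {"vct": EU_PID},
    DE_BASE: {"vct": DE_BASE, "extends": EU_PID},
    DE_V1: {"vct": DE_V1, "extends": DE_BASE},
}


def type_chain(vct: str, registry: Dict[str, dict]) -> List[str]:
    """Follow `extends` from vct up to the root type."""
    chain: List[str] = []
    current: Optional[str] = vct
    # Stop at the root, or if a type repeats (guards against cycles).
    while current is not None and current not in chain:
        chain.append(current)
        current = registry.get(current, {}).get("extends")
    return chain


def is_subtype_of(vct: str, base: str, registry: Dict[str, dict]) -> bool:
    """True if vct equals base or extends it, directly or transitively."""
    return base in type_chain(vct, registry)
```

With this registry, `is_subtype_of(DE_V1, EU_PID, TYPE_METADATA)` is true, so a query for the EU-wide type could be answered with the versioned German type.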

I think we should avoid trying to manage a non-native designator for the credential type (here, the ISO namespace) together with a native one (here, the SD-JWT VCT). This will lead to confusion and various problems.

joelposti commented 2 weeks ago

Thank you very much @danielfett for your replies to my comments and for the changes you made!

The proposed claim set does not seem to include a claim that would specify the namespace/credential type eu.europa.ec.eudi.pid.1 in an issuer-agnostic way. Lack of such a claim makes it difficult for an RP to make a request where they say that they want a presentation of certain claims from a PID of any member state. Consider the following presentation definition with which an RP requests given_name from a PID of any member state. The following request will not work without a credential_type or similar claim. (...)

Just as for ISO formats, the query language needs to accommodate the specifics of the credential format. Here, the logic is that a query for a base type may result in a response containing a credential of a type that extends the base type. Therefore, the query can ask for vct == urn:eu.europa.ec.eudi:pid:1 and may receive a credential with the vct https://example.bmi.bund.de/credential/pid/1.0.

So essentially, vct: urn:eu.europa.ec.eudi:pid:1 in the presentation definition would map to vct: https://example.bmi.bund.de/credential/pid/1.0 in the credential. I would like to contest that idea: the mapping is custom, domain-specific logic that deviates from the Presentation Exchange and JSON Schema specifications. Presentation Exchange 2.0.0 says in section 5.1.1 (Input Descriptor Object) that the filter field in an input descriptor should be a JSON Schema descriptor:

The fields object MAY contain a filter property, and if present its value MUST be a JSON Schema descriptor used to filter against the values returned from evaluation of the JSONPath string expressions in the path array.

The wallet and the RP would not be able to rely on existing JSON Schema implementations, because "filter": { "const": "urn:eu.europa.ec.eudi:pid:1" } would not match any claim in the credential by string comparison. Custom mapping logic would be required, which in my opinion needlessly complicates presentation definition and filter evaluation. Both are already complex, and I don't think it is advisable to make them more complex with additional custom logic.
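The mismatch can be shown with a deliberately small evaluator. This is a sketch, not a Presentation Exchange implementation: `evaluate_field` is a hypothetical helper that understands only top-level `$.name` paths and the `const` and `type: string` keywords, and the given_name value is invented.

```python
def evaluate_field(claims: dict, field: dict) -> bool:
    """Evaluate one `fields` entry against a flat claim set."""
    flt = field.get("filter", {})
    for path in field["path"]:
        if not path.startswith("$."):
            continue  # only top-level paths are supported in this sketch
        name = path[2:]
        if name not in claims:
            continue
        value = claims[name]
        # "const" is plain equality, exactly as a JSON Schema validator treats it.
        if "const" in flt and value != flt["const"]:
            continue
        if flt.get("type") == "string" and not isinstance(value, str):
            continue
        return True
    return False


credential_claims = {
    "vct": "https://example.bmi.bund.de/credential/pid/1.0",
    "given_name": "Erika",
}

base_type_field = {"path": ["$.vct"], "filter": {"const": "urn:eu.europa.ec.eudi:pid:1"}}
exact_type_field = {"path": ["$.vct"], "filter": {"const": "https://example.bmi.bund.de/credential/pid/1.0"}}
name_field = {"path": ["$.given_name"], "filter": {"type": "string"}}

base_matches = evaluate_field(credential_claims, base_type_field)
exact_matches = evaluate_field(credential_claims, exact_type_field)
name_matches = evaluate_field(credential_claims, name_field)
```

Only the exact vct and the given_name field match; the base-type query fails unless extra mapping logic is layered on top.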

Such custom mapping logic could require each party that evaluates presentation definitions¹ to maintain, or download from somewhere, a list of vct values that map to urn:eu.europa.ec.eudi:pid:1. Keeping such a list up to date quickly becomes untenable if PID Providers keep changing their vct values or the version numbers in them.

An alternative vct value mapping solution, which I think the SD-JWT VC specification is suggesting, would be that the evaluating party downloads a chain of extending metadata documents until they finally arrive at the one whose data can be used to determine that https://example.bmi.bund.de/credential/pid/1.0 == urn:eu.europa.ec.eudi:pid:1. The performance of synchronously downloading multiple metadata documents from different places would likely be pretty bad. Caching the documents partially solves the performance problem but increases the complexity even further.²

Either way custom mapping logic of vct values makes it difficult to determine whether the credential's type matches with eu.europa.ec.eudi.pid.1. That check is one of the basic checks the evaluating party does every time. It should be easy and quick.

¹ The evaluating party could be a wallet (in order to respond to the RP's request) or an RP (in order to validate that what they received from the wallet is what they requested).

² Downloading chains of metadata documents also raises the question how the authenticity and authoritativeness of these metadata documents are established. Digest checks at each step of the way?

c2bo commented 2 weeks ago

Either way custom mapping logic of vct values makes it difficult to determine whether the credential's type matches with eu.europa.ec.eudi.pid.1. That check is one of the basic checks the evaluating party does every time. It should be easy and quick.

We will likely have custom attributes in similar credentials - even in the PID if I understand correctly. Imho, it is important to convey that information clearly - what type of PID the presented credential exactly is. Given that mental model, some kind of type hierarchy/derivation seems to be necessary. I do believe it would be dangerous to ignore that aspect and it might also limit the usability of the technology outside of the initial use-cases/scope. That being said, I also agree with you that we need to make sure this is still reasonably easy to implement.

An alternative vct value mapping solution, which I think the SD-JWT VC specification is suggesting, would be that the evaluating party downloads a chain of extending metadata documents until they finally arrive to the one whose data can be used to determine that https://example.bmi.bund.de/credential/pid/1.0 == urn:eu.europa.ec.eudi:pid:1. The performance of synchronously downloading multiple metadata documents from different places would likely be pretty bad. Caching the documents partially solves the performance problem but greatly increases the complexity even further.²

If the vct chain is critical to the functionality of the credential (as it would be in the case of the PID), it could be included in the presentation that is transmitted. The sd-jwt-vc draft proposes to use the unprotected header to optionally transport that kind of information (https://drafts.oauth.net/oauth-sd-jwt-vc/draft-ietf-oauth-sd-jwt-vc.html#name-from-type-metadata-glue-doc). That way for credentials like the PID, the wallet could directly convey that information.

² Downloading chains of metadata documents also raises the question how the authenticity and authoritativeness of these metadata documents are established. Digest checks at each step of the way?

That would be my current understanding of the mechanism and the proposed integrity checks.
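A sketch of what "digest checks at each step" could look like follows. It rests on several assumptions: the documents are already at hand (fetched, cached, or carried with the presentation), each document names the digest of its parent in an `extends#integrity` field, and digests are plain SHA-384 hex over a sorted JSON serialisation. The helper names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List, Optional


def digest(doc: dict) -> str:
    # Simplified digest: SHA-384 hex over a deterministic serialisation.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha384(canonical.encode("utf-8")).hexdigest()


def verify_chain(vct: str, vct_integrity: str, documents: Dict[str, dict]) -> List[str]:
    """Walk `extends`, checking each document against the digest that referenced it."""
    chain: List[str] = []
    current: Optional[str] = vct
    expected: Optional[str] = vct_integrity
    while current is not None:
        if current in chain:
            raise ValueError("cycle in type chain")
        doc = documents[current]
        if digest(doc) != expected:
            raise ValueError("integrity check failed for " + current)
        chain.append(current)
        current, expected = doc.get("extends"), doc.get("extends#integrity")
    return chain


EU_PID = "urn:eu.europa.ec.eudi:pid:1"
DE_BASE = "https://example.bmi.bund.de/credential/pid/"
DE_V1 = "https://example.bmi.bund.de/credential/pid/1.0"

eu_doc = {"vct": EU_PID}
de_base_doc = {"vct": DE_BASE, "extends": EU_PID, "extends#integrity": digest(eu_doc)}
de_v1_doc = {"vct": DE_V1, "extends": DE_BASE, "extends#integrity": digest(de_base_doc)}
documents = {EU_PID: eu_doc, DE_BASE: de_base_doc, DE_V1: de_v1_doc}

verified = verify_chain(DE_V1, digest(de_v1_doc), documents)
```

A document altered anywhere in the chain makes `verify_chain` raise, so the evaluating party only has to trust the vct#integrity value in the signed credential.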

joelposti commented 2 weeks ago

Either way custom mapping logic of vct values makes it difficult to determine whether the credential's type matches with eu.europa.ec.eudi.pid.1. That check is one of the basic checks the evaluating party does every time. It should be easy and quick.

We will likely have custom attributes in similar credentials - even in the PID if I understand correctly. Imho, it is important to convey that information clearly - what type of PID the presented credential exactly is. Given that mental model, some kind of type hierarchy/derivation seems to be necessary. I do believe it would be dangerous to ignore that aspect and it might also limit the usability of the technology outside of the initial use-cases/scope. That being said, I also agree with you that we need to make sure this is still reasonably easy to implement.

I understand your points and I do agree that there needs to be separate issuer and member state specific metadata documents that describe the custom nuances of each issuer's PID. I am not arguing against the vct claim or the metadata documents. I am trying to say that there should be a separate issuer-agnostic credential_type claim in addition to a vct claim. As an evaluating party I want to quickly understand what is the (coarse) type of the presented credential before delving into deeper metadata document-based analysis.

An alternative vct value mapping solution, which I think the SD-JWT VC specification is suggesting, would be that the evaluating party downloads a chain of extending metadata documents until they finally arrive to the one whose data can be used to determine that https://example.bmi.bund.de/credential/pid/1.0 == urn:eu.europa.ec.eudi:pid:1. The performance of synchronously downloading multiple metadata documents from different places would likely be pretty bad. Caching the documents partially solves the performance problem but greatly increases the complexity even further.²

If the vct chain is critical to the functionality of the credential (as it would be in the case of the PID), it could be included in the presentation that is transmitted. The sd-jwt-vc draft proposes to use the unprotected header to optionally transport that kind of information (https://drafts.oauth.net/oauth-sd-jwt-vc/draft-ietf-oauth-sd-jwt-vc.html#name-from-type-metadata-glue-doc). That way for credentials like the PID, the wallet could directly convey that information.

Including the metadata documents in the presentation sounds like a reasonable solution. It would solve the performance and caching issues I mentioned.

c2bo commented 2 weeks ago

I understand your points and I do agree that there needs to be separate issuer and member state specific metadata documents that describe the custom nuances of each issuer's PID. I am not arguing against the vct claim or the metadata documents. I am trying to say that there should be a separate issuer-agnostic credential_type claim in addition to a vct claim. As an evaluating party I want to quickly understand what is the (coarse) type of the presented credential before delving into deeper metadata document-based analysis.

Would it be fine to add this also into the unsigned part? Basically something like a credential type "root" that can easily be processed and later on checked for correctness. That way we could initially easily check if the answer is of the expected type (e.g., EU PID) and then properly do the integrity checks + validate if that PID type is really part of the vct chain.

joelposti commented 2 weeks ago

I understand your points and I do agree that there needs to be separate issuer and member state specific metadata documents that describe the custom nuances of each issuer's PID. I am not arguing against the vct claim or the metadata documents. I am trying to say that there should be a separate issuer-agnostic credential_type claim in addition to a vct claim. As an evaluating party I want to quickly understand what is the (coarse) type of the presented credential before delving into deeper metadata document-based analysis.

Would it be fine to add this also into the unsigned part? Basically something like a credential type "root" that can easily be processed and later on checked for correctness. That way we could initially easily check if the answer is of the expected type (e.g., EU PID) and then properly do the integrity checks + validate if that PID type is really part of the vct chain.

Why in the unsigned part specifically?

I was thinking credential_type would be a claim just like any other in the signed JWS Payload. I think that would make it the simplest to use. It would also be compatible with presentation definition and filter evaluation.

awoie commented 2 weeks ago

To @danielfett's point, a similar approach to ISO mdocs could be defined for the vct value in SD-JWT VC.

ISO TS 18013-7 (latest draft) defines a profile of Presentation Exchange that describes how to use presentation definition objects with ISO mdocs. Because ISO mdocs are CBOR-based, JSONPath makes no sense as is, so a profile definition was needed. For example, the ISO mdoc doctype has to match the id of the input descriptor, and only constraints and fields/path are used to match data element values for a data element identifier in a namespace, among other rules. These extra rules for using ISO mdocs with Presentation Exchange are defined by the profile in ISO TS 18013-7 (CD).

Example:

{
  "input_descriptors": [
    {
      "id": "org.iso.18013.5.1.mDL",
      "format": {
        "mso_mdoc": {}
      },
      "constraints": {
        "fields": [
          {
            "path": [
              "$['org.iso.18013.5.1']['birth_date']"
            ],
            "intent_to_retain": false
          }
        ]
      }
    }
  ]
}
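As a rough illustration of the two rules mentioned above (doctype equals the input descriptor id, and each path names a namespace and a data element identifier), consider the sketch below. It is not taken from ISO TS 18013-7: the mdoc is modelled as nested dictionaries rather than CBOR, the data element values are invented, and `parse_mdoc_path` and `select_elements` are hypothetical helpers.

```python
import re
from typing import Dict, Optional, Tuple

# Matches paths of the form $['namespace']['element'].
_PATH_RE = re.compile(r"^\$\['([^']+)'\]\['([^']+)'\]$")


def parse_mdoc_path(path: str) -> Optional[Tuple[str, str]]:
    """Split a path into (namespace, data element identifier)."""
    match = _PATH_RE.match(path)
    return (match.group(1), match.group(2)) if match else None


def select_elements(descriptor: dict, doctype: str, namespaces: Dict[str, dict]) -> Optional[dict]:
    """Return the requested elements, or None if the mdoc does not satisfy the descriptor."""
    if descriptor["id"] != doctype:
        return None  # the doctype must equal the input descriptor id
    selected = {}
    for field in descriptor["constraints"]["fields"]:
        for path in field["path"]:
            parsed = parse_mdoc_path(path)
            if parsed and parsed[1] in namespaces.get(parsed[0], {}):
                selected[parsed] = namespaces[parsed[0]][parsed[1]]
                break
        else:
            return None  # none of the field's paths resolved
    return selected


descriptor = {
    "id": "org.iso.18013.5.1.mDL",
    "format": {"mso_mdoc": {}},
    "constraints": {
        "fields": [
            {"path": ["$['org.iso.18013.5.1']['birth_date']"], "intent_to_retain": False}
        ]
    },
}

# Invented mdoc content, modelled as plain dictionaries.
mdl_namespaces = {"org.iso.18013.5.1": {"birth_date": "1990-01-01", "family_name": "Mustermann"}}

selected = select_elements(descriptor, "org.iso.18013.5.1.mDL", mdl_namespaces)
```

Only birth_date is selected; a document with a different doctype yields `None` regardless of its contents.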

IMO, another thing I wanted to add is that it is reasonable to assume the wallet itself will always have the full type metadata chain locally cached. I assume the same for relying parties. I don't see a caching issue there.

joelposti commented 2 weeks ago

IMO, another thing I wanted to add is that it is reasonable to assume the wallet itself will always have the full type metadata chain locally cached. I assume the same for relying parties. I don't see a caching issue there.

It is reasonable to assume that the wallet has the full metadata chain locally cached.

However, it's different for the RP. There are at least 27 PID providers. This could mean that an RP has to cache 27 metadata chains + however many active versions there are of each chain. It's not huge but it's not nothing either.

c2bo commented 2 weeks ago

I understand your points and I do agree that there needs to be separate issuer and member state specific metadata documents that describe the custom nuances of each issuer's PID. I am not arguing against the vct claim or the metadata documents. I am trying to say that there should be a separate issuer-agnostic credential_type claim in addition to a vct claim. As an evaluating party I want to quickly understand what is the (coarse) type of the presented credential before delving into deeper metadata document-based analysis.

Would it be fine to add this also into the unsigned part? Basically something like a credential type "root" that can easily be processed and later on checked for correctness. That way we could initially easily check if the answer is of the expected type (e.g., EU PID) and then properly do the integrity checks + validate if that PID type is really part of the vct chain.

Why in the unsigned part specifically?

I was thinking credential_type would be a claim just like any other in the signed JWS Payload. I think that would make it the simplest to use. It would also be compatible with presentation definition and filter evaluation.

That would only work for 1 level of type dependency - if you have 3 or more levels, you wouldn't know what to put in there. Let me try to construct a concrete example - let's assume that we have something like this for credential types: Worldwide PID -> EU PID -> German PID

A Relying Party could request the EU PID or the worldwide PID and depending on the request, you would expect different values in credential_type to allow for an easier matching. The way I see it, there should be something that makes this a bit easier on the RP, but it should basically be linking the presentation to the request. In that way I would say this is more something that should be dealt with on Request/Response side than directly in the signed parts of the credential.

IMO, another thing I wanted to add is that it is reasonable to assume the wallet itself will always have the full type metadata chain locally cached. I assume the same for relying parties. I don't see a caching issue there.

It is reasonable to assume that the wallet has the full metadata chain locally cached.

However, it's different for the RP. There are at least 27 PID providers. This could mean that an RP has to cache 27 metadata chains + however many active versions there are of each chain. It's not huge but it's not nothing either.

Agreed, especially for cases where we expect a somewhat large amount of different Issuers with different credential types but a common schema, the wallet should IMHO be able to optionally provide this information.

joelposti commented 1 week ago

I understand your points and I do agree that there needs to be separate issuer and member state specific metadata documents that describe the custom nuances of each issuer's PID. I am not arguing against the vct claim or the metadata documents. I am trying to say that there should be a separate issuer-agnostic credential_type claim in addition to a vct claim. As an evaluating party I want to quickly understand what is the (coarse) type of the presented credential before delving into deeper metadata document-based analysis.

Would it be fine to add this also into the unsigned part? Basically something like a credential type "root" that can easily be processed and later on checked for correctness. That way we could initially easily check if the answer is of the expected type (e.g., EU PID) and then properly do the integrity checks + validate if that PID type is really part of the vct chain.

Why in the unsigned part specifically? I was thinking credential_type would be a claim just like any other in the signed JWS Payload. I think that would make it the simplest to use. It would also be compatible with presentation definition and filter evaluation.

That would only work for 1 level of type dependency - if you have 3 or more levels, you wouldn't know what to put in there. Let me try to construct a concrete example - let's assume that we have something like this for credential types: Worldwide PID -> EU PID -> German PID

A Relying Party could request the EU PID or the worldwide PID and depending on the request, you would expect different values in credential_type to allow for an easier matching. The way I see it, there should be something that makes this a bit easier on the RP, but it should basically be linking the presentation to the request. In that way I would say this is more something that should be dealt with on Request/Response side than directly in the signed parts of the credential.

Hmm. I did not know we needed to consider worldwide PIDs. I thought EU PID was the root namespace. A worldwide PID namespace certainly makes the credential_type case trickier.

Just as an idea that I have not yet thought through: the credential_type claim could be an array containing all the hierarchical identifiers:

{
  "credential_type": [
    "<worldwide PID identifier>",
    "eu.europa.ec.eudi.pid.1", // EU PID identifier
    "<German PID identifier>"
  ]
}

An array of hierarchical identifiers would make the PID selectable by any of the identifiers in the array. For that we can use JSON Schema's contains keyword. Below is an example presentation definition for requesting the given_name attribute from an EU PID of any issuer:

{
  ...
  "input_descriptors": [
    {
      "id": "given_name",
      "constraints": {
        "fields": [
          {
            "path": [
              "$.credential_type"
            ],
            "filter": {
              "type": "array",
              "contains": {
                "const": "eu.europa.ec.eudi.pid.1"
              }
            }
          },
          {
            "path": [
              "$.given_name"
            ],
            "filter": {
              "type": "string"
            }
          }
        ]
      }
    }
  ]
}
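The `contains` filter above boils down to a membership test. The following sketch shows that check in isolation; `matches_array_contains` is a hypothetical helper covering only the `type: array` plus `contains.const` combination used here, and the placeholder identifiers are kept as literal strings.

```python
from typing import Any


def matches_array_contains(value: Any, flt: dict) -> bool:
    """Check a claim value against {"type": "array", "contains": {"const": ...}}."""
    if not isinstance(value, list):
        return False  # "type": "array" requires a list
    # "contains" with "const": at least one item must equal the constant.
    return flt["contains"]["const"] in value


credential_claims = {
    "credential_type": [
        "<worldwide PID identifier>",
        "eu.europa.ec.eudi.pid.1",
        "<German PID identifier>",
    ],
    "given_name": "Erika",  # invented value
}

eu_filter = {"type": "array", "contains": {"const": "eu.europa.ec.eudi.pid.1"}}
mdl_filter = {"type": "array", "contains": {"const": "org.iso.18013.5.1.mDL"}}

eu_match = matches_array_contains(credential_claims["credential_type"], eu_filter)
mdl_match = matches_array_contains(credential_claims["credential_type"], mdl_filter)
```

The EU PID filter matches and the unrelated one does not, without any lookup of metadata documents.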

c2bo commented 1 week ago

Hmm. I did not know we needed to consider worldwide PIDs. I thought EU PID was the root namespace. Worldwide PID namespace certainly makes credential_type case trickier.

I don't think we need to consider worldwide PIDs specifically for the time being, but we should try to design a system (if possible without too much added complexity) that can deal with those kinds of problems. A big part of the adoption of this technology will IMHO be the applicability to other (especially unregulated) use-cases and this kind of type hierarchy will be present in a lot of them.

Providing something like the array of types could definitely work, but I am still not sure that adding this as redundant signed data is the way to go. I would personally prefer to solve this in the query language and credential-format-specific processing.

ssanchocanela commented 2 hours ago

Dear all. Following the initial commitment to continue working towards SD-JWT VC DM, we approve this document. Thanks for the effort. We will continue providing feedback on minor editorial changes if needed.