WICG / digital-credentials

Digital Credentials, like driver's licenses
https://wicg.github.io/digital-credentials/

registry of credential types #85

Open npdoty opened 4 months ago

npdoty commented 4 months ago

raised by @Sakurann among others: should we define a registry of types of credentials, which may have different sorts of privacy implications and necessary protections?

It seems like a government-issued high assurance identity document is different from a grocery store loyalty card, or a discount coupon from a commercial establishment that is only redeemed in that same place.

Sakurann commented 4 months ago

thank you for opening the issue, Nick!

I think there might be two separate steps: first, understanding what kinds of credential types implementers are interested in (related to the discussion in PR #82) to drive certain design choices, and second, discussing whether we need a registry for those.

OR13 commented 4 months ago

it would also be nice to have a more precise definition of "credential type" given there are multiple media types, and each media type has a distinct concept of "type".

OAUTH SD-JWT-VC has "vct". W3C JWT / SD-JWT / COSE has "type" (as in RDF type). mDoc has ???

aniltj commented 4 months ago

I would HIGHLY recommend thinking through the reality that there are jurisdictions where "who is authoritative for what" is not well defined, and others where it is (or will be).

It is also worthwhile to differentiate between an informative listing of types of credentials and associated "... privacy implications and necessary protections" which could be helpful to the ecosystem and ..

.. a registry of the types of credentials requiring ongoing care and feeding, and of more concern, gatekeepers who show up with the desire to be the map makers who define the boundaries of the known (credential ecosystem) world.

Sakurann commented 4 months ago

+1 to starting with a precise definition of "credential type" that is credential-format agnostic (if such a thing is possible). Hopefully, something along the lines of "A type is associated with rules defining which claims may or must appear in the Credential" (taken from a great document started by @danielfett: https://vcstuff.github.io/sd-jwt-vc-types/draft-fett-oauth-sd-jwt-vc-types.html).
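A minimal sketch of what "rules defining which claims may or must appear" could look like in practice. The type identifier, the `mustClaims`/`mayClaims` field names, and the `checkClaims` helper are all hypothetical and not taken from the linked draft or any credential format.

```javascript
// Hypothetical, format-agnostic description of a credential type.
// The identifier and claim names are illustrative assumptions only.
const exampleType = {
  id: 'https://credentials.example/identity',
  mustClaims: ['given_name', 'family_name'],
  mayClaims: ['birth_date'],
};

// Reports which required claims are missing and which claims the type
// does not allow at all.
function checkClaims(type, claims) {
  const present = Object.keys(claims);
  const allowed = new Set([...type.mustClaims, ...type.mayClaims]);
  const missing = type.mustClaims.filter((name) => !present.includes(name));
  const unexpected = present.filter((name) => !allowed.has(name));
  return {
    ok: missing.length === 0 && unexpected.length === 0,
    missing,
    unexpected,
  };
}

const result = checkClaims(exampleType, { given_name: 'Ava', ssn: '000-00-0000' });
console.log(result);
```

Nothing here depends on how a given format expresses the type (vct, type, or something else); it only illustrates the may/must rule itself.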

and, at least in my mind, credential type != media type, so I would prefer not to bring media types into this discussion.

msporny commented 4 months ago

Just to head this off and make sure no one is proposing this:

Having a centralized registry for "types of credentials" is a Really Bad Idea (tm).

The W3C doesn't manage registries for types of websites, or types of movies, or types of music. W3C Members have known that it's not any of W3C's business to constrain innovation at that layer (and a centralized registry has a high likelihood of creating that harm).

The concept of using URLs to identify a type of credential seems like a good idea. At least that gives systems something unique to key off of into privacy, security, and business requirements. Documenting the types of credentials then becomes more of a distributed search/discovery endeavor instead of a mandatory registration endeavor.

These centralized W3C registries probably make sense:

... but credential types? There could be thousands of them, with tens of thousands of claim names, and the solution for that probably looks more like schema.org than any W3C or IETF registry.

@npdoty, we might get more mileage out of identifying specific claim attributes (again, using some sort of universal identifier) and noting how sensitive each property is in various dimensions by understanding the harm it causes for attackers to know the information. For example, an ssn field is highly correlatable and can be used for identity theft, a birthday is less harmful (but still correlatable), a zipcode less so (but again, highly correlatable when you put it together w/ a birthday or a last name). It would probably be easier to mark properties that are not correlatable and presume that every other one is dangerous and correlatable... but then, for what purpose? Warn the individual who's about to share the information? Suggest that there might be a better way to share the claim?
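A rough sketch of the "mark what is not correlatable, presume the rest is" approach. The claim identifiers below are placeholder URLs, and the choice of which claim counts as non-correlatable is an assumption made only for illustration.

```javascript
// Hypothetical set of claim identifiers that have been marked as
// non-correlatable. Every identifier not listed here is presumed
// correlatable. The URLs are placeholders, not a real vocabulary.
const nonCorrelatable = new Set([
  'https://vocab.example/claims#age_over_18',
]);

// A claim is presumed correlatable unless it was explicitly marked otherwise.
function isPresumedCorrelatable(claimId) {
  return !nonCorrelatable.has(claimId);
}

// Returns the requested claims that a user agent might warn about.
function claimsToWarnAbout(requestedClaimIds) {
  return requestedClaimIds.filter(isPresumedCorrelatable);
}

console.log(
  claimsToWarnAbout([
    'https://vocab.example/claims#age_over_18',
    'https://vocab.example/claims#ssn',
  ]),
);
```

What to do with the resulting list (warn, suggest an alternative, nothing) is exactly the open question raised above; the sketch does not answer it.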

IOW, what is the purpose of this registry?

msporny commented 4 months ago

I'm getting ready to rip out the list of credential types in PR https://github.com/WICG/digital-identities/pull/82, archiving them here for posterity:

OR13 commented 4 months ago

Agree that credential type != media type.

An issuer creates a "credential type", by picking a media type, and a set of mandatory and optional attributes, and gives the combination a consumer friendly name, like "vaccination card".
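As a sketch of that combination, assuming a hypothetical `defineCredentialType` helper and illustrative attribute names (the media type strings are the ones mentioned elsewhere in this thread):

```javascript
// Hypothetical helper: records the three choices an issuer makes.
function defineCredentialType({ name, mediaType, mandatory, optional = [] }) {
  return Object.freeze({
    name, // consumer-friendly name
    mediaType,
    mandatory: [...mandatory].sort(),
    optional: [...optional].sort(),
  });
}

// Compares only the technical parts; the consumer-friendly name is ignored.
function sameTechnicalType(a, b) {
  return (
    a.mediaType === b.mediaType &&
    JSON.stringify(a.mandatory) === JSON.stringify(b.mandatory) &&
    JSON.stringify(a.optional) === JSON.stringify(b.optional)
  );
}

const sdJwtLicense = defineCredentialType({
  name: "driver's license",
  mediaType: 'application/sd-jwt',
  mandatory: ['family_name', 'given_name'],
  optional: ['birth_date'],
});

const mdocLicense = defineCredentialType({
  name: "driver's license",
  mediaType: 'application/cose',
  mandatory: ['family_name', 'given_name'],
  optional: ['birth_date'],
});

// Same friendly name, different technical type.
console.log(sameTechnicalType(sdJwtLicense, mdocLicense)); // false
```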

Regulations then target these "consumer friendly" names, and leave the technical details to experts with enough patience to figure out what that actually means.

Channeling what I imagine one of these future individuals might say:

I don't think it's a good idea to call a digital driver's license in sd-jwt and mDoc both "the same credential type", because the media types, privacy and security requirements, and claims structures are all different.

But I also think this will quickly become pointless to debate, since consumers don't care about these technology details... They care about what their credentials let them do, who accepts them, and how much they cost to renew when you drop your phone in a lake.

Given that registries exist for media types and attributes, it seems that "credential types" are naturally dependent on those two existing registry types.

ISO mDoc has some OID-like structure I don't understand and is application/cose by relying on COSE Sign1, unless there is a more specific media type registered. mDoc also relies on the IANA CBOR and COSE registries.

OAuth SD-JWT has JWT claims registry and is naturally application/sd-jwt. SD-JWT also relies on the IANA JOSE registry.

Data Integrity Proof W3C VCs rely on JSON-LD for their registries, and are application/vc+ld+json. W3C VCs rely on the VC Data Model Vocabulary.

CWT-based credential types rely on the CWT registry and are application/cose. CWTs rely on the CBOR and COSE registries.

If we choose to define the concept of "presentation types", multiply the above by 2.

I can't speak for mDoc, but the other 3 have distinct attributes and media types for presentations. Including media types for expressing encrypted content.

The most significant challenge surfaced by presentation types is handling key binding for multiple credentials, possibly of differing "credential types"... which quickly leads to very low interoperability, very high frustration, and high security analysis costs.

Centralizing "presentation protocols" seems to be in line with the "credential presentation types / protocols" that W3C has already standardized, such as requesting and responding with signatures from WebAuthn authenticators.

Sakurann commented 4 months ago

@OR13 I don't think we are on the same page (if I understood you correctly)... media types can be used to differentiate different credential formats, while credential type represents what attributes are in the credential (and how credential type is expressed currently depends on the credential format). in this issue we should be talking about credential types as in how sensitive the attributes in the credential are, what entity attests that data, etc., and not "presentation media types".

OR13 commented 4 months ago

@Sakurann

The first part of what you wrote seems to be aligned with what I wrote.

If the APIs this group designs do not transport credentials to wallets, but do transport presentations of credentials from wallets to verifiers, a verifier only ever sees presentations (of credentials (of types))...

If a verifier is assigning the "type" then they are assigning that type to "presentations" not credentials.

If the issuer is assigning the "type", then we are contemplating delivering credentials to wallets, and perhaps the wallet is asked if it can protect EdDSA keys in hardware, and if it can, then the issuer might allow that wallet to prove possession of a key before issuing a mobile driver's license and delivering it to the wallet.

It sounds like maybe this issue is about: which wallets have the security properties to secure vaccination credentials, not: how are vaccination credentials and their various types represented?

I'd not recommend using "credential type" to get at that property.

Instead, I would define wallet assurance levels, and then map credential types to assurance levels.
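A sketch of what such a mapping might look like. The level names, the type identifiers in the map, and the default for unknown types are all assumptions for illustration, not something this group has defined.

```javascript
// Hypothetical ordered assurance levels, lowest to highest.
const LEVELS = ['low', 'substantial', 'high'];

// Hypothetical mapping from credential type identifier to the minimum
// wallet assurance level needed to hold that type.
const requiredLevel = new Map([
  ['org.iso.18013.5.1.mDL', 'high'],
  ['https://store.example/loyalty-card', 'low'],
]);

function walletMayHold(walletLevel, credentialType) {
  // Unknown credential types default to the highest level
  // (a conservative assumption, not a requirement from any spec).
  const needed = requiredLevel.get(credentialType) ?? 'high';
  // An unrecognized wallet level yields index -1 and is always rejected.
  return LEVELS.indexOf(walletLevel) >= LEVELS.indexOf(needed);
}

console.log(walletMayHold('low', 'org.iso.18013.5.1.mDL')); // false
console.log(walletMayHold('high', 'org.iso.18013.5.1.mDL')); // true
```

With this shape, the decision keys off the wallet's level, and the credential type is only an input to the lookup.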

If we have to design credential types in order to do that, are we going to limit that discussion to just vaccination cards and driver's licenses?

Perhaps browsers need to understand wallet assurance levels in order to decide if a credential can be stored or presented... Not what type of credential (or presentation) they are transporting.

Obviously a single predicate / attribute might be enough to infer the credential type, and the browser would then need to be trusted to not disclose that information.

I'd prefer a system where a verifier could encrypt the query to the wallet and the wallet could encrypt the presentation to the verifier, and the worst thing browsers or network intermediaries could do would be to drop the traffic... Such a system wouldn't need to have any understanding of credential type... But a verifier would.

npdoty commented 4 months ago

Apologies if my brief reference to "registry" in the issue title has thrown us off; I'm not committed (and I don't think others who were discussing it on calls are either) to a W3C Registry with consensus approval for a new type of credential/use case.

It does seem useful to some of the privacy/security discussions to understand the properties of different types of credentials. Those could be about the particular properties (as Manu noted, ssn has a different type of sensitivity), or the issuer type, or the assurance level, or whether it's issued in one place and verified in another.

It might be possible that a formal Registry would be useful just to list high-assurance-permanent-offline-identifiable vs. same-origin-temporary token, or some other list of capabilities, just so it can be consistently referred to elsewhere. And it might be helpful to have a non-exhaustive list of examples documented (as in #82 discussion) to help in classifying the privacy/security implications. But it sounds like we don't need a registry for driver's-license, passport, email account, etc., because there may be many hard-to-predict types and we don't want to pre-constrain them.

OR13 commented 1 month ago

Consider this hypothetical device response:

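// Assumes Document comes from an mDoc library, and that publicKeyJWK,
// issuerPrivateKey, and issuerCertificate are defined elsewhere.
const document = await new Document('org.iso.18013.5.1.mDL') // document type
  // The namespace is what gives the claim names below their meaning.
  .addIssuerNameSpace('org.iso.18013.5.1', {
    family_name: 'Jones',
    given_name: 'Ava',
    birth_date: '2007-03-25',
  })
  .useDigestAlgorithm('SHA-256')
  .addValidityInfo({
    signed: new Date(),
  })
  .addDeviceKeyInfo({ deviceKey: publicKeyJWK })
  .sign({
    issuerPrivateKey,
    issuerCertificate,
    alg: 'ES256',
  });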

Notice the identifiers org.iso.18013.5.1.mDL, org.iso.18013.5.1 ....

These are what allow for things like birth_date to have common understanding.

In JSON-LD Credentials, the predicates come from the open-world RDF triples that are created... for example:

subject predicate object
...

https://example.gov/credentials/123 https://www.w3.org/2018/credentials#issuer https://example.gov
https://example.gov/credentials/123 https://www.w3.org/2018/credentials#credentialSubject did:example:123

By allow-listing predicates, you constrain the effort it takes for a verifier to extract knowledge from a holder.

You can make it "more or less expensive" (where friction determines cost) to learn sensitive information.

As @samuelgoto mentioned on the call, if the predicate is not in the allow list, the friction should align to the worst case, which I will summarize as:

The verifier is requesting credentials this browser does not understand... are you sure you want to present them?
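A minimal sketch of that allow-list behavior, assuming hypothetical predicate URLs and three made-up friction levels. A request takes the friction of its most sensitive predicate, and any predicate missing from the allow list is treated as the worst case.

```javascript
// Hypothetical friction levels, lowest to highest.
const FRICTION = ['low', 'medium', 'high'];

// Hypothetical allow list: predicate URL -> friction the user agent applies.
// The URLs are placeholders, not a real vocabulary.
const allowList = new Map([
  ['https://vocab.example/claims#age_over_18', 'low'],
  ['https://vocab.example/claims#birth_date', 'medium'],
]);

// Returns the friction level for a request that asks for these predicates.
function frictionFor(predicates) {
  let worst = 0;
  for (const predicate of predicates) {
    // Predicates the browser does not understand align to the worst case.
    const level = allowList.get(predicate) ?? 'high';
    worst = Math.max(worst, FRICTION.indexOf(level));
  }
  return FRICTION[worst];
}

console.log(frictionFor(['https://vocab.example/claims#age_over_18'])); // low
console.log(frictionFor(['https://unknown.example/vocab#something'])); // high
```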

Another thing to consider is that RDF predicates are URLs, which can point to origins that comply with local internet laws, so the expressiveness of RDF credentials is constrained by the URLs and the vocabularies that are shared between issuers and verifiers. Taking a few examples from the VCDM v2 context:

https://w3c.github.io/vc-data-model/#example-usage-of-the-refreshservice-property-by-an-issuer

"@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://www.w3.org/ns/credentials/examples/v2",
    "https://w3id.org/vc-refresh-service/v1"
  ],

^ These URLs are hosted on more than one origin (www.w3.org and w3id.org), and are potentially subject to different regional laws.
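To make the origin point concrete, here is a small sketch that lists the distinct origins an "@context" array depends on, using the standard URL API. The `contextOrigins` helper name is hypothetical.

```javascript
// The "@context" entries from the example above.
const context = [
  'https://www.w3.org/ns/credentials/v2',
  'https://www.w3.org/ns/credentials/examples/v2',
  'https://w3id.org/vc-refresh-service/v1',
];

// Returns the distinct origins referenced by a JSON-LD @context array.
function contextOrigins(contextEntries) {
  const origins = new Set();
  for (const entry of contextEntries) {
    // Inline context objects carry no URL of their own, so skip them.
    if (typeof entry !== 'string') continue;
    origins.add(new URL(entry).origin);
  }
  return [...origins];
}

console.log(contextOrigins(context));
// [ 'https://www.w3.org', 'https://w3id.org' ]
```

A verifier or user agent could use a list like this to see which hosts a credential's vocabulary depends on; what policy to apply per origin is outside the scope of this sketch.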