w3ctag / design-reviews

W3C specs and API reviews
Creative Commons Zero v1.0 Universal

Verifiable Credential Status List 2021 #874

Closed mkhraisha closed 4 months ago

mkhraisha commented 11 months ago

I'm requesting a TAG review of Verifiable Credential Status List 2021.

This specification describes a privacy-preserving, space-efficient, and high-performance mechanism for publishing status information such as suspension or revocation of Verifiable Credentials.

Further details:

You should also know that this work intersects heavily with the Verifiable Credentials v2.0 work, which the TAG will also be actively reviewing around the same time.

We'd prefer the TAG provide feedback as (please delete all but the desired option):

☂️ open a single issue in our GitHub repo for the entire review

rhiaro commented 10 months ago

Hi @mkhraisha thanks for your review request.

Is it possible for an issuer to use their own value for the statusPurpose field? It's clear that the strings revocation and suspension must be used correctly as defined in the spec, but given the extensible nature of JSON-LD it looks like it would be possible for additional terms to be introduced here. Is there a risk of this being overloaded and potentially leaking other information about the credential? Should the spec be explicit about constraining the values only to these strings, or has it been deliberately left open to permit other strings to be used without additions to the spec? If it's the latter, what other (legitimate or malicious) values do you think we might see here?

Do the values of statusMessages carry similar risks related to overloading/data leakage, as these are defined by the issuer?

I have more general concerns about malicious issuers tracking credential holders, which I've no doubt have been thought about at length in the WG and wider community. It would be great to see pointers to more work on this, and mitigations in particular, given the types of organisations which are likely to issue credentials, the limited options people may have for credentials that are accepted, and the power dynamics involved here.

Thanks for your suggestion in the Security & Privacy questionnaire about asking about maturity of dependencies. I've raised an issue to add this to the questionnaire. Do you have an answer in mind for this question for the VC Status List spec?

OR13 commented 9 months ago

@rhiaro There are several kinds of statuses I've seen in the wild... Ones that go into a registry / list, which is expert curated... and ones which are open ended, and where you need to understand what the issuer wanted the status to mean, and each issuer can assign that value a different meaning... for example 0xdeadbeef might mean "Organic Certification Passed" or "Formal Objection Present", depending on the issuer.
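As a rough illustration of the open-ended case, the Python sketch below shows how the same raw value can only be interpreted once the issuer is known. The issuer URLs, the lookup table, and the function name are hypothetical; only the value 0xdeadbeef and the two labels come from the example above.

```python
# Hypothetical per-issuer tables. Only 0xDEADBEEF and the two labels are taken
# from the comment above; the issuer URLs are made up for illustration.
ISSUER_STATUS_MEANINGS = {
    "https://issuer-a.example": {0xDEADBEEF: "Organic Certification Passed"},
    "https://issuer-b.example": {0xDEADBEEF: "Formal Objection Present"},
}


def interpret_status(issuer: str, value: int) -> str:
    """Return the issuer-specific meaning of a status value, if one is known."""
    meanings = ISSUER_STATUS_MEANINGS.get(issuer, {})
    return meanings.get(value, "unknown status")


print(interpret_status("https://issuer-a.example", 0xDEADBEEF))
print(interpret_status("https://issuer-b.example", 0xDEADBEEF))
```

A verifier that has no table for a given issuer ends up with "unknown status", which is the implementation burden this category raises.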

The other case is where a registry defines each status, and the RDF class ensured consistent meaning via term URLs:

Or a status list might not support multiple statuses, in which case each bit is bound to a single meaning, regardless of where that meaning is defined.... this last category has been most contentious, since people fear that we will see issuers creating a lot of implementation burden on verifiers by using multiple lists with "custom status" values.

AFAIK, the WG has not defined a way to support the "open world model for status'es", for either the single list case, or the multiple lists cases, but there is ongoing work to provide clarity on this topic.

cynthia commented 9 months ago

Dumb question... What's a bitstring?

mkhraisha commented 8 months ago

> Dumb question... What's a bitstring?

It is just a string of bits, https://csrc.nist.gov/glossary/term/bit_string
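To make that concrete, here is a minimal Python sketch of a bitstring used as a status list, where each credential is assigned one index and the bit at that index carries its status. The helper names and the left-to-right bit ordering are assumptions made for illustration, not something stated in this thread.

```python
def make_bitstring(num_entries: int) -> bytearray:
    """Allocate a zeroed bitstring with room for num_entries one-bit statuses."""
    return bytearray((num_entries + 7) // 8)


def set_bit(bits: bytearray, index: int, value: bool) -> None:
    """Set the bit at index, counting from the left-most bit (assumed ordering)."""
    byte_index, offset = divmod(index, 8)
    mask = 0x80 >> offset
    if value:
        bits[byte_index] |= mask
    else:
        bits[byte_index] &= ~mask & 0xFF


def get_bit(bits: bytearray, index: int) -> bool:
    """Read the bit at index using the same left-to-right ordering."""
    byte_index, offset = divmod(index, 8)
    return bool(bits[byte_index] & (0x80 >> offset))


status_list = make_bitstring(16)
set_bit(status_list, 9, True)   # e.g. flag the credential assigned index 9
print(get_bit(status_list, 9))  # True
print(get_bit(status_list, 8))  # False
```

Sixteen entries fit in two bytes here, which is where the space efficiency of the approach comes from.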

hadleybeeman commented 8 months ago

Hi all. We are looking at this again in our W3C TAG breakout.

@OR13, it sounds like you are replying to @rhiaro's question on your own behalf, but not on behalf of the working group. Is it possible to get a working group opinion on this? @brentzundel or @sakurann, as working group chairs, perhaps you could help with this?

rhiaro commented 7 months ago

Sorry for the delay in following up. Have you had a chance to find pointers to any work on mitigations to the risks of malicious issuers and an answer to the suggested security question about dependencies?

msporny commented 7 months ago

> Sorry for the delay in following up. Have you had a chance to find pointers to any work on mitigations to the risks of malicious issuers and an answer to the suggested security question about dependencies?

Hi @rhiaro, apologies for the delayed response. The VCWG has been heads down getting Data Integrity and VC JSON Schema into CR (this was done as of last week). We are now heads down trying to get the VCDM v2.0 into CR (target date is the last week of December). After we get through that hurdle, we'll shift our focus to "Verifiable Credential Status List 2021", which has been renamed to "Bitstring Status List".

As such, we will probably not have a WG response on a timeline before the upcoming holiday break.

I'll respond as an Editor of the specification, but this probably warrants TAG raising an issue on the spec and us marking it as "resolve before Candidate Recommendation". Note that my responses represent what the specification does today, not what we might change it to do in the future (based on TAG, implementation, and other Horizontal Review feedback):

> Is it possible for an issuer to use their own value for the statusPurpose field?

At present, yes, it is possible to do that.

> It's clear that the strings revocation and suspension must be used correctly as defined in the spec, but given the extensible nature of JSON-LD it looks like it would be possible for additional terms to be introduced here.

The statusPurpose property value is a free form string, not a JSON-LD typed value... which means, "Any string can be used here". When implementations use strings that are not defined by the specification, the behavior is undefined.
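Because behavior is undefined for strings the specification does not define, a cautious verifier might simply refuse them. The Python sketch below shows one such policy. It is an assumption about how a verifier could behave, not something the specification requires, and the function name and the shape of the status entry dictionary are hypothetical.

```python
# The two values named in this thread as defined by the specification.
DEFINED_PURPOSES = {"revocation", "suspension"}


def check_status_purpose(status_entry: dict) -> str:
    """Return the statusPurpose if it is a defined value, otherwise raise."""
    purpose = status_entry.get("statusPurpose")
    if not isinstance(purpose, str):
        raise ValueError("statusPurpose must be a string")
    if purpose not in DEFINED_PURPOSES:
        # Hypothetical policy: reject rather than guess at an issuer-specific meaning.
        raise ValueError(f"unrecognised statusPurpose: {purpose!r}")
    return purpose


print(check_status_purpose({"statusPurpose": "revocation"}))
```

A more permissive verifier could instead log unknown values and continue, at the cost of accepting statuses it cannot interpret.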

> Is there a risk of this being overloaded and potentially leaking other information about the credential?

Yes. The more information you encode in the status list information, the more possibility there is for correlation (based on the extra bits of information you're getting). For example, we will most likely need to put Privacy Considerations entries in the specification noting the use of the (concerningly named) "status" value for the "statusPurpose" field.
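As a back-of-the-envelope way to think about those "extra bits", an entry that can take k distinct values can carry at most log2(k) bits. The Python sketch below computes that upper bound. It is a simplification that treats every value as equally likely, and the function name is hypothetical.

```python
import math


def max_bits_per_entry(distinct_values: int) -> float:
    """Upper bound on the information one status entry can carry, in bits."""
    if distinct_values < 1:
        raise ValueError("an entry needs at least one possible value")
    return math.log2(distinct_values)


print(max_bits_per_entry(2))  # a single revoked / not-revoked flag
print(max_bits_per_entry(8))  # an entry with eight possible status messages
```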

> Should the spec be explicit about constraining the values only to these strings, or has it been deliberately left open to permit other strings to be used without additions to the spec? If it's the latter, what other (legitimate or malicious) values do you think we might see here?

It's the latter, we wanted multiple statuses to be able to be expressed... not all of them contemplated by the VCWG.

As for legitimate values, one could conceive of: "sanctioned", "suspended-pending-review", "probation", "inactive", "under-review", "revoked-with-cause", "revoked-without-cause" ... though we don't know of anyone actively pursuing those statusPurpose values. In short, we don't know what we don't know and we didn't want to close off this particular extensibility mechanism at this point in time.

As for, "Why a string instead of a JSON-LD type"? We've noted that there is a point of diminishing returns to use certain types. For example, with Data Integrity, we found that there was an explosion of JSON-LD Contexts for proof types and that we could reduce that explosion by just providing a single DataIntegrityProof type and provide a string-based cryptosuite property that could be used by individuals and organizations that wanted to extend the DataIntegrityProof base type. This meant that you could use the same base VCDM v2.0 JSON-LD context to support a very large set of proof extension types w/o having to create yet another JSON-LD Context to specify the type values in the "cryptosuite" property. This reduces the implementation burden for cryptosuite authors as well as implementers. We are applying the same design approach here to see if we can get the same benefits.
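The design choice described above, one base type plus a string that selects the extension, can be sketched in Python as a simple registry lookup. The registry, the function names, the suite name "example-suite-2024", and the always-true placeholder verifier are all hypothetical. Only the DataIntegrityProof type and the cryptosuite property come from the text above, and no real cryptography is performed.

```python
from typing import Callable, Dict

Verifier = Callable[[dict, dict], bool]

# Hypothetical registry mapping a cryptosuite string to a verification function.
_SUITES: Dict[str, Verifier] = {}


def register_suite(name: str, verifier: Verifier) -> None:
    """Make a cryptosuite available without defining a new proof type."""
    _SUITES[name] = verifier


def verify_proof(document: dict, proof: dict) -> bool:
    """Dispatch on the cryptosuite string carried by a DataIntegrityProof."""
    if proof.get("type") != "DataIntegrityProof":
        raise ValueError("unsupported proof type")
    verifier = _SUITES.get(proof.get("cryptosuite"))
    if verifier is None:
        raise ValueError("unknown cryptosuite")
    return verifier(document, proof)


# Placeholder verifier for illustration only; it does not check any signature.
register_suite("example-suite-2024", lambda document, proof: True)

proof = {"type": "DataIntegrityProof", "cryptosuite": "example-suite-2024"}
print(verify_proof({"id": "urn:example:credential"}, proof))
```

Adding a new suite is one register_suite call, with no new type or context to define, which is the benefit being sought for statusPurpose as well.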

When it comes to malicious values, that implies a malicious issuer, and if you have a malicious issuer all bets are off. A malicious issuer can issue VCs that say horrible and untrue things about you, they can construct status lists that are designed to track you, and do a number of other harmful things. The key there to prevent abuse is to understand whether the behavior can be detected such that the issuer can be punished for that behavior (regulatory action, public shaming, boycott, loss of market share to more privacy-preserving issuer, etc.). We do speak to this in the current spec:

https://w3c.github.io/vc-bitstring-status-list/#malicious-issuers-and-verifiers

and the VCDM spec:

https://w3c.github.io/vc-data-model/#issuer-cooperation-impacts-on-privacy

... but that might not be enough. Can you think of something more we should say in those sections (or others)?

> Do the values of statusMessages carry similar risks related to overloading/data leakage, as these are defined by the issuer?

Yes, they do.

I imagine that we will probably want to say more about this, and a fresh set of eyes (the TAG's) will help focus our attention on what's missing and what more we should write.

Again, this is commentary as an Editor of that specification, not a VCWG position.

rhiaro commented 4 months ago

Thanks for your comprehensive reply, and for your patience while we loaded this back into our collective heads. We think it's important that you continue to explore and elaborate on mitigations for the privacy concerns raised in this thread, and in the horizontal review with PING. We're closing this as 'satisfied with concerns' because it's clear that you intend to do so, though we think it's important that these issues are resolved before the spec goes to REC.