w3ctag / design-reviews


BBS Cryptosuite v2023 Securing Verifiable Credentials with Selective Disclosure using BBS Signatures #922

Closed Wind4Greg closed 4 months ago

Wind4Greg commented 6 months ago

Hello TAG!

I'm requesting a TAG review of BBS Cryptosuite v2023 Securing Verifiable Credentials with Selective Disclosure using BBS Signatures.

The BBS Cryptosuite v2023 specification describes a mechanism for ensuring the authenticity and integrity of Verifiable Credentials and similar types of constrained digital documents using cryptography, especially through the use of digital signatures and related mathematical proofs. It is one of several cryptosuites within the VC Data Integrity framework. This specification offers constant size signatures over multiple messages, selective disclosure and unlinkable proofs.

You should also know that...

This is a new cryptosuite for the VC Data Integrity framework which provides for the additional privacy features of selective disclosure and unlinkable proofs. It is based on the soon to be standardized BBS signature scheme at the IETF.

We'd prefer the TAG provide feedback as (please delete all but the desired option):

🐛 open issues in our GitHub repo for each point of feedback

martinthomson commented 6 months ago

@Wind4Greg, I'm getting a 404 on https://github.com/w3c/vc-di-bbs/blob/main/EXPLAINER.md Perhaps you should merge your PR.

msporny commented 6 months ago

> Perhaps you should merge your PR.

https://github.com/w3c/vc-di-bbs/pull/105 merged, the EXPLAINER.md URL should resolve now, @martinthomson.

hadleybeeman commented 5 months ago

Hi all, Thanks for sending this our way. We are explicitly not reviewing the BBS Digital Signature Algorithm itself (for avoidance of doubt), but your cryptosuite using it seems fine to us.

We note that verifiable credentials signed with BBS aren't likely to be interoperable with those signed with other algorithms. (We recognise this is probably unavoidable, given the architecture you're using.)

Let us know if you have any specific questions or issues for us. Otherwise, we are happy to close it.

msporny commented 5 months ago

> Hi all, Thanks for sending this our way.

Thank you for the review, we really appreciate it!

> We are explicitly not reviewing the BBS Digital Signature Algorithm itself (for avoidance of doubt)

Understood and expected, the BBS digital signature algorithm itself (the low level primitives we're using in the W3C vc-di-bbs specification) is currently undergoing cryptographic review at the IETF.

> but your cryptosuite using it seems fine to us.

Ok, good to know.

> We note that verifiable credentials signed with BBS aren't likely to be interoperable with those signed with other algorithms. (We recognise this is probably unavoidable, given the architecture you're using.)

Hmm, there is nuance here that's important. The W3C Data Integrity specification, which the TAG has already reviewed, enables a single data payload to be secured using multiple cryptography suites in parallel. I believe that TAG is already aware of this feature, but the statement above makes me wonder if we should write more about parallel signatures in the specification, perhaps with a few diagrams demonstrating this particular feature of W3C Data Integrity?

The parallel signatures feature does allow a BBS signature to "sit beside" an ECDSA signature and an EdDSA signature (you can have all three, in parallel, on a single verifiable credential). Clearly, this requires the issuer to issue the credential with each signature type for this to work. It also requires the verifier to support multiple different types of cryptosuites. This enables the holder of that verifiable credential to choose which mechanism to use when interacting with a verifier, which we believe enhances privacy and unlinkability for the individual. Why send all the information in a government-issued ID over when all you're trying to do is provide a registered mailing address?
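As a rough illustration of the parallel signatures idea described above, here is a minimal Python sketch. It assumes a credential whose `proof` property holds several Data Integrity proofs side by side; the credential contents, the `PLACEHOLDER` proof values, and the `select_proof` helper are hypothetical and exist only for this example. No cryptography is performed, and in a real BBS flow the holder would derive a new proof from the issuer's base proof rather than forward it unchanged.

```python
# Hypothetical credential carrying three proofs in parallel.
# All values are placeholders; nothing here is cryptographically verified.
credential = {
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiableCredential"],
    "credentialSubject": {"mailingAddress": "123 Example Street"},
    "proof": [
        {"type": "DataIntegrityProof", "cryptosuite": "ecdsa-rdfc-2019",
         "proofValue": "PLACEHOLDER"},
        {"type": "DataIntegrityProof", "cryptosuite": "eddsa-rdfc-2022",
         "proofValue": "PLACEHOLDER"},
        {"type": "DataIntegrityProof", "cryptosuite": "bbs-2023",
         "proofValue": "PLACEHOLDER"},
    ],
}


def select_proof(credential, verifier_supported, holder_preference):
    """Return the first proof, in the holder's preference order,
    whose cryptosuite the verifier also supports (or None)."""
    proofs = credential["proof"]
    if isinstance(proofs, dict):  # tolerate a single proof object
        proofs = [proofs]
    by_suite = {p["cryptosuite"]: p for p in proofs}
    for suite in holder_preference:
        if suite in verifier_supported and suite in by_suite:
            return by_suite[suite]
    return None


# A privacy-minded holder prefers BBS and falls back to ECDSA.
preference = ["bbs-2023", "ecdsa-rdfc-2019"]
chosen = select_proof(credential, {"bbs-2023", "ecdsa-rdfc-2019"}, preference)
fallback = select_proof(credential, {"ecdsa-rdfc-2019"}, preference)
unsupported = select_proof(credential, {"eddsa-rdfc-2022"}, preference)
```

The point of the sketch is only that the choice sits with the holder, and that it works only when the issuer attached each proof type and the verifier supports at least one of them.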

So, the nuance around "aren't likely to be interoperable with those signed with other algorithms" is a bit off and I wanted to clarify the above.

More specifically, it would be useful for the TAG to comment on the desirability of such a feature set. At present, there are national governments that are suggesting that they settle on digital signature mechanisms that create tracking dangers. Most digital signature schemes today do not generate unlinkable signatures (like BBS does) and act like a super tracking cookie (the digital signature itself is globally unique so the individual can be correlated across all interactions that they have in person and on the web if parties collude). This is, clearly, not an ideal state of affairs and the WG is endeavoring to provide a feature that can combat this sort of pervasive tracking. That does not mean there aren't legitimate uses for more traditional signature schemes... if the data being sent over has personally identifiable information in it, then no amount of unlinkability when it comes to the digital signature will help. We do talk about this in the Verifiable Credentials specification.
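The "super tracking cookie" concern can be sketched in a few lines of Python. This is a toy model under stated assumptions: a SHA-256 digest stands in for a conventional signature (the same bytes are shown to every verifier), and fresh random bytes stand in for an unlinkable proof. Neither is a real signature scheme, and the function names are hypothetical.

```python
import hashlib
import secrets


def present_fixed_signature(signature: bytes) -> bytes:
    # A conventional signature is presented unchanged to every verifier.
    return signature


def present_fresh_proof() -> bytes:
    # Stand-in for an unlinkable proof: different bytes on every presentation.
    # This is NOT BBS; it only models "no repeated value across presentations".
    return secrets.token_bytes(32)


def colluding_matches(log_a, log_b):
    # Two verifiers compare logs and look for identical values.
    return set(log_a) & set(log_b)


# Toy "signature": a fixed digest over the credential (illustrative only).
fixed_sig = hashlib.sha256(b"toy credential bytes").digest()

verifier_a_log = [present_fixed_signature(fixed_sig)]
verifier_b_log = [present_fixed_signature(fixed_sig)]
linked = colluding_matches(verifier_a_log, verifier_b_log)

verifier_c_log = [present_fresh_proof()]
verifier_d_log = [present_fresh_proof()]
unlinked = colluding_matches(verifier_c_log, verifier_d_log)
```

In the first case the colluding verifiers find a shared value and can correlate the two visits; in the second they find none from the proof bytes alone. As noted above, this says nothing about linkage through the disclosed data itself.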

To put it another way, for digital credentials, does the TAG find it acceptable for a software ecosystem to NOT provide an unlinkable digital signature solution at all? I admit that this is a bit of a loaded question as the EU seems to be on track to NOT provide such a solution in their digital wallet reference framework and some agencies in the US seem to be on track to require such a feature in digital wallet solutions. This affects the Identity Credential work that is being incubated in the W3C WICG as well. Would the TAG be willing to weigh in on this, and if so, would a meeting about it with the VCWG be appropriate?

> Let us know if you have any specific questions or issues for us. Otherwise, we are happy to close it.

The only specific issue that would be good for TAG to have a position on is the above. Are we at a point where providing unlinkable digital signatures SHOULD/MUST be supported in future-facing digital credential solutions, or are we not there yet?

jyasskin commented 5 months ago

Note that the PING is also looking at privacy requirements for credentials, at https://github.com/w3cping/credential-considerations, but they're not very far along with the unlinkability requirements. (cc/@npdoty)

Personally:

  1. I agree with Manu that it's important for the technology to at least make it possible to create unlinkable credentials.
  2. This review probably isn't the right place to take that position: it belongs on vc-data-model or possibly as an independent finding.
  3. The TAG is more credible than the PING in saying something like this, because the TAG has the responsibility to make tradeoffs and to sometimes sacrifice an aspect of privacy if that's the right tradeoff in a particular case.
  4. I'm worried that the BBS spec doesn't do enough to help the ecosystem evolve toward actually-unlinkable credentials, since there are lots of linking mechanisms outside of the cryptography: https://github.com/w3c/vc-di-bbs/issues/110. I'm not certain the BBS spec is the right place for this guidance, but I didn't find a better place in the existing specs that are close to CR.

Wind4Greg commented 5 months ago

Hi all, (@hadleybeeman, @msporny, @jyasskin), in the privacy considerations section on unlinkability I took a somewhat layered approach to analyzing the "unlinkability" that is fairly generic:

  1. Artifacts from cryptographic primitives. (BBS particulars here)
  2. Artifacts from mapping a VC into a set of statements suitable for selective disclosure. (For our case using RDF canonicalization to produce "messages" suitable for use in BBS).
  3. Artifacts from Proof Options and Mandatory reveal Information in the VC. (things like "created" dateTime, and other items that the issuer puts in the proof options or requires the holder to reveal).
  4. Selectively revealed information in the VC. (The higher level info disclosed, very much outside our control, but must be taken into account in "linkage attack" analysis)
  5. External VC System Based Linkage -- Stuff outside our control, IP addresses, other networking artifacts, etc...

The basic analysis uses the concept of an anonymity set and reduction in the size of this set via "linkage attacks". The W3C has some specific guidance with respect to Mitigating Browser Fingerprinting in Web Specifications which uses the anonymity set concept. We can offer some fairly specific (somewhat quantifiable) advice on items 1-3.
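To make the anonymity set idea concrete, here is a small Python sketch. The population, the attribute names (`created`, `postcode`, `age_over_18`), and both helper functions are made up for illustration; a real analysis would use the actual proof options and mandatory-reveal fields of a deployment.

```python
import itertools
import math

# Hypothetical population of 8 holders: every combination of three
# two-valued attributes that a presentation might reveal.
population = [
    {"created": created, "postcode": postcode, "age_over_18": adult}
    for created, postcode, adult in itertools.product(
        ["2024-01-01", "2024-01-02"], ["A", "B"], [True, False]
    )
]


def anonymity_set(population, revealed):
    """Holders who remain indistinguishable given the revealed values."""
    return [
        holder for holder in population
        if all(holder.get(key) == value for key, value in revealed.items())
    ]


def bits_revealed(size_before, size_after):
    """Reduction of the anonymity set, expressed in bits."""
    return math.log2(size_before / size_after)


after_created = anonymity_set(population, {"created": "2024-01-01"})
after_both = anonymity_set(
    population, {"created": "2024-01-01", "postcode": "A"}
)

bits_created = bits_revealed(len(population), len(after_created))
bits_both = bits_revealed(len(population), len(after_both))
```

Each revealed value that splits the population in half costs one bit, so a precise `created` timestamp or a rare attribute value can shrink the set far faster than this evenly split toy population suggests.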

For item 4, I cited the very recent work that deals with higher level information, e.g., such as the contents of the VC: SoK: Managing risks of linkage attacks on data privacy. J. Powar; A. R. Beresford. Proceedings on Privacy Enhancing Technologies. 2023. URL: https://petsymposium.org/popets/2023/popets-2023-0043.php.

I didn't say much about item 5. It's important, but the section was getting long enough. I've got a networking background, and when teaching cybersecurity I would always make my students visit a website/service that provides IP address geo-location services to show them how easy it is to localize them by IP.

I'd be happy to help/contribute more text to a TAG or PING document that wants to address this topic more generally. I'm currently an "invited expert" to the VCWG. I'm also working with BBS at the IETF/DIF.

Cheers Greg

rhiaro commented 4 months ago

We (@torgo @ylafon @maxpassion and I) discussed this again today and we don't have anything further to add at this time regarding the issue raised by Manu. We agree that it would be useful to have unlinkable credentials. We don't have an opinion at the moment about whether this should be a requirement, but it also shouldn't be excluded as a possibility. We think this issue should be discussed further in PING and we look forward to reviewing the outcome of their discussion.

martinthomson commented 3 months ago

It has been brought to my attention that the proposal makes claims about unlinkability that are not backed with independent analysis.

The process for constructing and validating a BBS-backed credential involves multiple transformations. A JSON-LD data model is canonicalized, identifiers are then passed through HMAC-based transforms that this specification defines, and the result is ultimately passed to BBS in order to produce a proof.

So, while BBS can in theory provide the desired privacy properties and the specification does address the potential for leakage, I was not able to find any formal analysis that supports the claims that are made. In the IETF, where I spend a lot of time, we are increasingly asking for security and privacy claims like these to be backed by stronger arguments, either through proofs of security or the use of formal/symbolic analysis software packages.

The claims that most interest me are those that relate to linkability. The document appears to use a novel method of protecting privacy, based on the use of a PRF. A more thorough analysis of that process would help.

msporny commented 3 months ago

> A more thorough analysis of that process would help.

Ping @simoneonofri, W3C's new Security Lead, who is heading up that effort.

> The document appears to use a novel method of protecting privacy, based on the use of a PRF.

Hmm, can you explain your concern here a bit more @martinthomson? It uses a standard HMAC to scramble text strings, which is a standard cryptographic technique. Is there a particular portion of this mechanism that you are concerned about, such as how it might mix in a way that undoes the linkability at the BBS layer? Or are you more concerned that the use of an HMAC on a text string wouldn't scramble the original identifier to a significant degree, thus leading to linkability? We are happy to guide the independent reviewers towards questions that the TAG might have, however, we want to make sure we provide them with a specific question to look into and perform an independent review on.

Wind4Greg commented 3 months ago

Looking forward to the more thorough and independent analysis.

There were two main issues that I tried to address in the writeup: (a) data leakage via blank node identifiers. That was the long example with windsurfing that showed an information leakage of sail sizes, and the mitigation of using a random shuffle via a PRF. (b) linkage (unlinkability) attacks, i.e., reduction in the anonymity set. This is a younger area, so I quoted a recent SoK on this and tried to give a methodical approach to understanding and potentially computing this in a particular application. At the IETF we added some of this information into the BBS draft.
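A minimal sketch of the shuffle idea in (a), assuming HMAC-SHA-256 as the PRF: each canonical blank node label is run through the HMAC, and new labels are assigned in the order of the resulting digests. The `shuffled_label_map` helper, the `c14n`-style input labels, and the `b0`, `b1`, ... output labels are illustrative choices for this example, not a restatement of the specification's algorithm.

```python
import hashlib
import hmac
import secrets


def shuffled_label_map(canonical_labels, hmac_key):
    """Map each canonical label to a new label whose numbering follows
    the order of keyed HMAC digests rather than the canonical order."""
    digests = {
        label: hmac.new(hmac_key, label.encode("utf-8"), hashlib.sha256).digest()
        for label in canonical_labels
    }
    ordered = sorted(canonical_labels, key=lambda label: digests[label])
    return {label: f"b{index}" for index, label in enumerate(ordered)}


# Canonical labels whose order might depend on the underlying data
# (for example, on values such as sail sizes in the windsurfing case).
canonical = ["c14n0", "c14n1", "c14n2", "c14n3"]

# The key is random per credential in this sketch (an assumption).
key = secrets.token_bytes(32)

mapping = shuffled_label_map(canonical, key)
same_key_again = shuffled_label_map(canonical, key)
```

With the same key the mapping is reproducible, which lets a party holding the key recompute it; without the key, the new numbering should carry no information about the original ordering, assuming the HMAC behaves as a PRF. Whether that assumption holds up across the whole pipeline is the question an independent analysis would need to answer.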

Cheers Greg

simoneonofri commented 3 months ago

Thanks for the message, @martinthomson. That's a very interesting point.

@msporny, I'm also adding @jaromil and @andrea-dintino so we can put it on the agenda for the call.