mattrglobal / jsonld-signatures-bbs

A linked data proof suite for BBS+ signatures
Apache License 2.0
138 stars 42 forks source link

proof blank node and subject blank node collide #158

Open brianorwhatever opened 3 years ago

brianorwhatever commented 3 years ago

So I have been debugging why I can't get this document to derive a credential that verifies.

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://w3id.org/security/suites/bls12381-2020/v1",
    "https://w3id.org/vaccination/v1"
  ],
  "type": [
    "VerifiableCredential",
    "VaccinationCertificate"
  ],
  "id": "urn:uvci:af5vshde843jf831j128fj",
  "name": "COVID-19 Vaccination Certificate",
  "description": "COVID-19 Vaccination Certificate",
  "issuanceDate": "2019-12-03T12:19:52Z",
  "expirationDate": "2029-12-03T12:19:52Z",
  "issuer": {
    "id": "did:key:z5TcDJgBp1A6iGQkpyDP4LZmsgpWVBqBVs3KoqNDMLKH7nK8UETzM6V1Js28Awuzzd4pJQSsLofZvNNwLmpFDfbzZmMXJNEFT4EriZ1NXjGK7q96781MsG1S3J9f2AiPez6Ne7eCFVtgfR4GPAarS4RWuzPmbi1egGwWrykaft98UcKQucirFv9s2x5D32K54sjwSC2yM"
  },
  "credentialSubject": {
    "type": "VaccinationEvent",
    "batchNumber": "1183738569",
    "administeringCentre": "MoH",
    "healthProfessional": "MoH",
    "countryOfVaccination": "NZ",
    "recipient": {
      "type": "VaccineRecipient",
      "givenName": "JOHN",
      "familyName": "SMITH",
      "gender": "Male",
      "birthDate": "1958-07-17"
    },
    "vaccine": {
      "type": "Vaccine",
      "disease": "COVID-19",
      "atcCode": "J07BX03",
      "medicinalProductName": "COVID-19 Vaccine Moderna",
      "marketingAuthorizationHolder": "Moderna Biotech"
    }
  },
  "proof": {
    "type": "BbsBlsSignature2020",
    "created": "2021-09-23T22:19:48Z",
    "proofPurpose": "assertionMethod",
    "proofValue": "p3qddAKNqC71FvxBPsAX3gHFYomkmqlzFvgPf8XvadxtLdLrPdlFmKsY/sFHcFnTTl+KSkouXACij1fiq/iHt+QOOmjJ+Bvm6XrMm7A7k7IJiOOhuvC+QP9OLjKg0tq4w8MIZnvoFRvGl40sUXwo9Q==",
    "verificationMethod": "did:key:z5TcDJgBp1A6iGQkpyDP4LZmsgpWVBqBVs3KoqNDMLKH7nK8UETzM6V1Js28Awuzzd4pJQSsLofZvNNwLmpFDfbzZmMXJNEFT4EriZ1NXjGK7q96781MsG1S3J9f2AiPez6Ne7eCFVtgfR4GPAarS4RWuzPmbi1egGwWrykaft98UcKQucirFv9s2x5D32K54sjwSC2yM#zUC7KbUZckosdKwDffzDeYDQSb8sbYyfhZ96gA3tCSYqgv4G1pq6A8pvY1kJGV6dr8w8ScuqKGXL16KxsRDL7FGgoEJ2Z4udTMVMcfGrSgUH3chPAmJNk6TBQ5hgimjBGY68icw"
  }
},

and the following frame

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://w3id.org/security/suites/bls12381-2020/v1",
    "https://w3id.org/vaccination/v1"
  ],
  "type": ["VerifiableCredential", "VaccinationCertificate"],
  "credentialSubject": {
    "countryOfVaccination": {},
    "type": ["VaccinationEvent"],
    "@explicit": true
  }
}

I have finally found what I think is the problem. When I step through the code at this line

https://github.com/mattrglobal/jsonld-signatures-bbs/blob/a1271750ed5211f1210323534402575967e4fae2/src/BbsBlsSignatureProof2020.ts#L192

I notice that the concatenation of the proof statements and the document statements results in this list Screen Shot 2021-09-23 at 3 46 49 PM

I believe my issue is that the _:c14n0 blank node identifier has a collision between the proof and the vaccine..

Is there something I am missing?

OR13 commented 3 years ago

I've seen similar issues before.... I happen to think this is a better normalization approach:

https://github.com/transmute-industries/merkle-disclosure-proof-2021/blob/main/src/merkle/normalization/json-pointer.ts

brianorwhatever commented 3 years ago

me too... but since we have urdna to work with for now what's happening here?

this comment concerns me

https://github.com/mattrglobal/jsonld-signatures-bbs/blob/a1271750ed5211f1210323534402575967e4fae2/src/BbsBlsSignatureProof2020.ts#L160

proof statements are always first?

brianorwhatever commented 3 years ago

doesn't look like that's the case here

OR13 commented 3 years ago

https://github.com/decentralized-identity/waci-presentation-exchange/issues/115

If test vectors can't be generated for the example here, we may need to consider not using the suite.

dlongley commented 3 years ago

For LD integrity proofs, the proof field is removed from the document prior to canonicalization and its contents are canonicalized separately. For suites other than BBS+, the canonized proof and the canonized document are separately hashed and those hashes are concatenated together to form the value that is passed to a crypto algorithm to generate the signature. This can be seen in an implementation here:

https://github.com/digitalbazaar/jsonld-signatures/blob/main/lib/suites/LinkedDataSignature.js#L229-L256

And is described in the spec here:

https://w3c-ccg.github.io/ld-proofs/#create-verify-hash-algorithm

This is done to allow multiple proofs to be independently created and attached to a document without coordination amongst the various signers / proof creators (the proofs altogether forming a "proof set").

It looks like the BBS+ suite implementation that is referenced here combines all of the statements (from the proof and the document) together:

https://github.com/mattrglobal/jsonld-signatures-bbs/blob/a1271750ed5211f1210323534402575967e4fae2/src/BbsBlsSignatureProof2020.ts#L244-L265

Which seems, to me, to be a mistake that will produce conflicts (likely the root cause of the above). I would think that keeping these two sections separate and using two different index transformation maps would resolve the issue.

cc: @tplooker, @kdenhartog

dlongley commented 3 years ago

@brianorwhatever,

doesn't look like that's the case here

Simply pasting a signed document into the JSON-LD playground's canonize tool will canonize the entire document vs. separating the proof value out and canonizing it separately to allow for "proof sets" (as described in my comment above). So what's missing here is that it's important to separate out the document's statements from the proof's statements. Both sets of statements are indeed signed, but they need to be canonized separately to allow for multiple signers to sign the document independently, if desired.

OR13 commented 3 years ago

+1 @dlongley ... IMO it was a mistake to "seperate the proof" before normalization... the complexity of this that the container semantics for proof are brutal....

"proof": {
"@id": "sec:proof",
"@type": "@id",
"@container": "@graph"
},
dlongley commented 3 years ago

One more note:

me too... but since we have urdna to work with for now what's happening here?

You'd run into the same issue of needing to separate document statements from proof statements with any other canonize function (e.g., json pointer, base64url, etc.) if multiple independent signatures are to be supported, so the problem is really orthogonal.

OR13 commented 3 years ago

Here is the code we used for the merkle proof suite... https://github.com/transmute-industries/merkle-disclosure-proof-2021/blob/main/src/merkle/normalization/urdna2015.ts

IMO, proof should be handled along with the document... not separately from it... only the attributes of the proof that come from the signature should be omitted by createVerifyData....

This allows for us to replace URNDA with JSON Pointer, or any other objectToMessages compatible operation with an inverse messagesToObject.

dlongley commented 3 years ago

@OR13,

Those @graph semantics wouldn't change even if you didn't separate out the proof before canonizing. Separating the proof before canonizing just allows independent signatures to occur. If you skipped that step, all signers would have to coordinate, but the proof semantics you quoted would still need to remain. They ensure that someone reading the document as a graph will interpret the data the same way that someone reading it like a simple JSON tree will. It's like separating meta data from data -- there are number of ways to get burned when you don't do that.

dlongley commented 3 years ago

It's fine if you want to canonize everything together -- just know that the trade off is that you can't have multiple independent signers and / or proofs.

OR13 commented 3 years ago

@dlongley yes, I have been weighing the trade offs wrt that... IMO, this not the appropriate place to address multi-signature formats... that work should be handled in a work item targeted at extending JOSE such as https://github.com/json-web-proofs/json-web-proofs/pull/10

We're shooting ourselves in the foot by making suite implementers care about all these concerns at once... instead of layering standards approaches on top of each other.

I believe there was originally language related to ProofSets or ProofChains that addressed this, but it is not implemented or tested or portable to VC-JWT, so IMO it should be avoided, until it is standardized better.

brianorwhatever commented 3 years ago

I have created a PR #159 demonstrating a failing test. I suspect what @dlongley pointed out here https://github.com/mattrglobal/jsonld-signatures-bbs/issues/158#issuecomment-929375910 is the root cause of the issue here

tplooker commented 3 years ago

Thanks @brianorwhatever apologies it has taken me a few days to get to this issue, I see the cause now will start working on a fix tonight.

swcurran commented 2 years ago

Any updates on this issue? It would also be good to know the scope. Can a credential schema or a presentation request be designed to avoid the issue?

OR13 commented 2 years ago

@swcurran https://github.com/decentralized-identity/bbs-signature/issues/10#issuecomment-939546272

Can a credential schema or a presentation request be designed to avoid the issue?

yes, but such a credential cannot have any "blank nodes"...

tplooker commented 2 years ago

To update this issue, upon investigation of the root cause, it is not due to subject blank nodes colliding it is because in some JSON-LD framing operations (that feature more than 1 blank node) the order of the statements in the derived proof can be different to the order in which they were signed. The syntax for the deriveProof API accommodates this (via the revealedMessages array), however the expression that gets appended to the proof value (the revealed messages bit array) does not account for this re-ordering, hence the signature layer is returning false when validating a derived proof because they are failing the integrity check. As this issue captures when it comes to what we support at the cryptographic layer we have a choice to make. We are currently doing 2 however some are arguing to do 1, in either of these cases the LD-Proof suite would need at a minimum a mapping array to accomodate the potential for the statements to be re-ordered.

In the interest of progressing this here is one potential proposal define a new term like revealedMessagesIndicies whose value will be an integer array that maps the revealed messages in the derived proof back into the order in which they were originally signed. So for example the proof block may look something like this

{
      "@context": [
        "https://www.w3.org/2018/credentials/v1",
        "https://w3id.org/security/suites/bls12381-2020/v1",
        "https://w3id.org/vaccination/v1"
      ],
      "id": "urn:uvci:af5vshde843jf831j128fj",
      "type": [
        "VaccinationCertificate",
        "VerifiableCredential"
      ],
      "description": "COVID-19 Vaccination Certificate",
      "name": "COVID-19 Vaccination Certificate",
      "credentialSubject": {
        "id": "urn:bnid:_:c14n2",
        "type": "VaccinationEvent",
        "batchNumber": "1183738569"
      },
      "expirationDate": "2029-12-03T12:19:52Z",
      "issuanceDate": "2019-12-03T12:19:52Z",
      "issuer": "did:key:z5TcDVRemCWTd6qwxxhFeYDpQhK4pUYtuutodKP2MSDjzcqokf6cNCARsaF8JNZQ8FzWAYfFZbUqUCUDMWeWp8xVkRSr9z5Tf5k2tJgpjsqNM23E4VjHsCzN6WcSvLGKSA9VEMTc1d2F81mCCauerPY1VC8vPTkvtmEQZfmaZ54x15PJwbhkBxaEbydWjd7D2CWHbkFg9",
      "proof": {
        "type": "BbsBlsSignatureProof2020",
        "created": "2021-10-13T03:33:54Z",
        "nonce": "KhUkvFvwx0O/s2pDSruIDg1AWrobAOWKqKLAy2n5UPFdY0dbCFpbnELCQnfFQBk+GXU=",
        "proofPurpose": "assertionMethod",
        "proofValue": "<proof-val>",
        "revealedMessageIndicies": [ <integer array> ],
        "verificationMethod": "did:example:489398593#test"
      }
    }
yamdan commented 2 years ago

@tplooker FYI, I have extended this suite to add several features, including multi-byte array representation of revealed indicies, which solves this issue. It looks like corresponding to no.3 in your comment. In other words, revealedMessageIndicies is embedded in proofValue as an array of unsigned integer using CBOR.

The extended suite is here, which depends on extended bbs-signatures and extended bls12381-key-pair. We can see the case discussed here is passed in this test. Note that blank node IDs are anonymized into UUIDv4-like pseudorandom values like urn:anon:xxxxxx....

I am afraid that there are very few documents about my extension at this moment... I will prepare more documents to explain it later, but at this moment you can check the brief presentation in IIW33 and try it out using playground.