decentralized-identity / ethr-did-resolver

DID resolver for Ethereum Addresses with support for key management
Apache License 2.0
203 stars 71 forks source link

Verification method using EIP-712 #107

Closed clehner closed 3 years ago

clehner commented 3 years ago

EIP-712 is a way for an Ethereum identity (secp256k1 keypair) to sign structured data. This is supposed to be good for security and user autonomy since rather than asking the user to sign arbitrary bytes, instead the user is prompted with some structured data where it can be more obvious what the meaning of it is. To use EIP-712 for linked data proofs on verifiable credentials and presentations, existing proof types such EcdsaSecp256k1Signature2019 cannot be used, because they rely on the user being able to sign arbitrary bytes; instead, a new verification method / proof type would be needed. Would this be a potentially useful addition to did:ethr DID Documents?

Here is a screenshot of MetaMask showing a EIP-712 signing request of a hypothetical verifiable credential (not actually working yet, this is just to show what it could look like):

metamask-eip712-vc-example.png Source: demo.html.txt

More info about EIP-712

wyc commented 3 years ago

cc @awoie

mirceanis commented 3 years ago

Yes! I think it would be a great addition. The interesting part of EIP712 is the possibility of on-chain verification. Ideally, this new verification method would maintain this property.

awoie commented 3 years ago

@clehner when you go down this route, please ensure you register the LD-Proof in the LD-Proof Registry at https://w3c-ccg.github.io/ld-cryptosuite-registry/ and write a LD-Proof spec for that. You would also need to update the @context value to have support for your LD-Proof.

In general, my comment would be how do you ensure that did:ethr didn't rotate away from the ETH-key in MM before the VC was issued? I guess, you don't need to but the VC would not be valid.

For the same reason, onchain verification is an issue, because for onchain verification you would need to do onchain DID resolution to ensure your DID Doc still contains the ETH key.

Then, the question is what would be the benefit of having this type of proof if onchain verification cannot be done? If all you want to do is using MM or ETH-wallets for VC issuance and VP presentation, then you could use custom RPC messages for this.

Further, why is it options and not proof? Using options would result in a non W3C-compliant VC.

mirceanis commented 3 years ago

yes, on-chain verification would not work for all situations, but it is still fully possible when using delegates. Those are visible on-chain, and they appear in the off-chain did-document as ethereumAddress keys.

Note that I'm not advocating for general on-chain verification. This has some nasty privacy implications, but using a read-only on-chain method would give you the benefits without the privacy leak.

clehner commented 3 years ago

Thanks for the feedback, @awoie, @mirceanis. I'm glad the on-chain idea is useful, since that helps constrain the design.

how do you ensure that did:ethr didn't rotate away from the ETH-key in MM before the VC was issued?

The verifier should check that the signing key was valid for the DID at the time of issuance. This could be done by resolving the DID document with DID Parameter versionTime (ethr-did-resolver could be updated to support this) or versionId (did:ethr could specify how that would be used).

what would be the benefit of having this type of proof if onchain verification cannot be done? If all you want to do is using MM or ETH-wallets for VC issuance and VP presentation, then you could use custom RPC messages for this.

I assume you mean using existing proof types such as EcdsaSecp256k1Signature2019, asking the user to sign raw bytes using web3.eth.sign or eth_sign pre-https://github.com/ethereum/go-ethereum/pull/2940 to create the linked data proof. MetaMask Docs discourages using this these signature requests in production because of the risk of phishing. The data to be signed for a linked data proof is the concatenation of the hashes of the canonicalized linked data document (credential or presentation) and proof options. This is some random-looking data, so basically the user has to trust the app that the signing input means what it is supposed to (e.g. is not a transaction stealing their funds).

With a linked data proof type / verification method based on EIP-712, we should be able to ask the user to sign something where they can see what it is and what it means, ideally.

Further, why is it options and not proof?

A linked data proof separately hashes and canonicalizes the document (VC or VP) and proof options. We will have to override this behavior to instead pass the document and proof options to the EIP-712 signTypedData call. But I would preserve the behavior of passing the document and proof options separately up to that point. In VC HTTP API, IssueCredentialRequest is defined with properties credential for the credential and options for the linked data proof options. I followed that usage for the types in this example. But it doesn't seem to make a difference for signing, since the property name refers to a type definition in the EIP-712 request, and doesn't actually make it into the bytes to be signed. So maybe human-readable terms should be preferred for the property names, since it is primarily for showing the user; e.g. "Proof Options" for this one instead of "options".

jaredweinfurtner commented 3 years ago

Isn't that what https://eips.ethereum.org/EIPS/eip-1812 aims to achieve?

wyc commented 3 years ago

@jaredweinfurtner I believe the proposed method in EIP-1812 does not leverage Linked Data Proofs nor JWT, as per:

While built on industry standards such as JSON-LD and JWT neither of them are easy to integrate with the Ethereum ecosystem.

EIP-712 introduces a new method of signing off chain Identity data. This provides both a data format based on Solidity ABI encoding that can easily be parsed on-chain an a new JSON-RPC call that is easily supported by existing Ethereum wallets and Web3 clients.

In contrast, I don't believe that this verification method is intended for on-chain use cases as a primary goal, but instead uses LD-Proofs.

cc @pelle in case there is anything more to add here!

clehner commented 3 years ago

Hi folks,

I've implemented a working version of this verification method, in Rust as part of ssi/DIDKit: https://github.com/spruceid/ssi/pull/99. It is integrated into a demo application: https://github.com/spruceid/degen-issuer/pull/7.

Changes

Since the original proposal, it is now more linked-data-based rather than JSON-based. There are two main reasons for this.

  1. Arbitrary JSON objects cannot be used directly as EIP-712 structured data. A transformation step would be needed, such as encoding JSON as strings (which could be expensive on-chain), or some mapping from JSON to EIP-712 structs which would probably look awkward in the signing UI (e.g. a JSON object as a struct where the fields are arrays of properties for each JSON primitive type). While some parts of the W3C VC Data Model could be encoded with specific EIP-712 struct definitions, to fully support the data model, arbitrary JSON or RDF is needed.
  2. Linked data proofs typically use RDF Dataset canonicalization. Proof types Ed25519Signature2018 and EcdsaSecp256k1Signature2019, which are included in the VC base context, involve converting the linked data document (e.g. Credential) and proof from JSON-LD to RDF, applying RDF Dataset Canonicalization (URDNA2015), serializing as N-Quads and then hashing with SHA-256. Exceptions are linked data proof type JcsEd25519Signature2020, which applies JSON Canonicalization Scheme (JCS), hashes the JSON and then signs it, with no RDF involved; and JWT VCs which sign base64url-encoded JSON data without a canonicalization step. Having the signing input based on RDF is supposed to enable signed documents to be convertible between different representations while remaining verifiable. Compared to plain JSON, using JSON-LD is supposed to ensure that different parties handling the document share an understanding of the meaning of its terms, as e.g. each property name is expanded into an IRI, while the document can remain compact and human-readable due to moving term definitions into context.

Encoding

A linked data proof signing request is encoded for EIP-712 as a struct, LDPSigningRequest, with two fields, document and proof, representing the linked data document (e.g. credential or presentation) and proof, respectively. document and proof are both 2D arrays of strings, where each inner array encodes an RDF statement, containing 3 or 4 strings which are RDF terms as they would be encoded in NQuads. This is like NQuads but an array of lines instead of a single multi-line string, and an array of strings for each line.

This encoding aims to be similar to the usual way signing input for a linked data proof is created, while enabling the user to view and inspect the data they would sign. Encoding with 2D string arrays is supposed to help with readability. More descriptive encoding could be used, such as having each RDF statement be {"subject": …, "predicate": …, "object": …} rather than an array of strings; but in the MetaMask signing UI, struct field names take up horizontal space, reducing space for the content. Encoding terms as EIP-712 structs rather than N-Quads-encoded strings may be possible, but a solution would be needed for encoding the polymorphism of RDF terms without creating a lot of dangling empty properties, and there would still be the issue of horizontal space.

Until recently (https://github.com/MetaMask/metamask-extension/pull/10485), strings in MetaMask's signing request window would overflow with an ellipsis rather than wrapping, as you can see in the screenshot at the top of this thread compared to the ones below. Now that that has been fixed, it might make sense to changing the encoding of the RDF statements to use a single string per statement (N-Quads line), rather than an array of strings per line, so that it would be closer to standard N-Quads and require less custom encoding. But it probably should still be to be one string per line rather than a single multi-line string, since I don't think the UI would respect embedded newlines.

Lastly, note that there appears to be an inconsistency between eth-sig-util as used from MetaMask, and the EIP-712 specification, in hashing arrays: https://github.com/MetaMask/eth-sig-util/issues/106. The encoding I am proposing is affected by this, since it relies on arrays in the EIP-712 data. The implementation in DIDKit now follows eth-sig-util's signTypedData_v4 implementation. We will need to keep an eye on this if eth-sig-util/MetaMask changes its behavior, or if there are other implementations that implement it differently, or if there are updates to EIP-712.

EIP-712 typed data for a linked data proof signing request

{
  "types": {
    "EIP712Domain": [
      {
        "type": "string",
        "name": "name"
      }
    ],
    "LDPSigningRequest": [
      {
        "type": "string[][]",
        "name": "document"
      },
      {
        "type": "string[][]",
        "name": "proof"
      }
    ]
  },
  "primaryType": "LDPSigningRequest",
  "domain": {
    "name": "Eip712Method2021"
  },
  "message": {
    "document": [
      ...
    ],
    "proof": [
      ...
    ]
  }
}

Example

Here is the proposed EIP-712 encoding for a signing request for an example verifiable credential: eip712vm-typeddata.json.txt

The signing request in MetaMask looks like this: eip712vm-metamask-signature-request.png eip712vm-metamask-signature-request2.png

The resulting verifiable credential: eip712vm-vc.jsonld.txt

On-chain considerations

As @wyc noted, our use cases for this verification method are more for Ethereum wallet users to be able to use verifiable credentials and presentations, and I didn't have on-chain use cases in mind when opening this issue. But I think it would be great to support on-chain use cases to the extent possible. I think it would be difficult to implement linked data proofs on-chain in general, because of the computation needed to canonicalize, encode and hash the document and proof data, but I think it would be possible to do within some constraints. I also don't personally have experience in developing smart contracts, so my view here is limited.

Minimize blank nodes

One way to reduce the cost of linked data proofs could be to limit use of blank nodes. Canonicalizing a RDF dataset using URDNA2015 involves assigning canonical names to blank node identifiers in the dataset (e.g. _:c14n0 as seen in the screenshot). This may be expensive because it relies on SHA-256 hashing and intermediate data structures. Blank nodes generally result from JSON-LD objects that don't contain an @id property. If you can ensure that your input data does not have blank nodes, by giving every object an id, or contains at most one blank node so it is trivially canonicalized, you may be able to skip the full URDNA2015 and canonicalize the dataset simply by sorting the lines (as N-Quads).

Optimize sorting

URDNA2015 requires sorting RDF statements lexicographically. Deserializing JSON-LD to RDF also involves encoding data in a canonical lexical form which for JSON literals involves sorting keys lexicographically. There could be optimizations regarding this sorting that could help make on-chain issuance and/or verification more feasible. Issuance could work by constructing signing input with a template, rather than in a fully general way, while taking care that the input is in the canonical form. It could help with verification to require input to be pre-sorted and pre-canonicalized. Consider disallowing data types that are expensive to check if they are in canonical form, such as perhaps rdf:JSON.

Hashing

This thread suggests the cost of SHA-256 may be about 700-1300 gas: https://ethereum.stackexchange.com/questions/76110/gas-cost-of-a-sha256-hash. I think this hashing would only be needed for blank node canonicalization in URDNA2015.

String processing

Encoding as N-Quads or N-Quads-like strings requires some string processing, which might not be otherwise needed if terms were instead encoded as native EIP-712 structs. But fully using EIP-712 structs might be more inconvenient in the UI, and require more custom processing compared to N-Quads. Is this an acceptable trade-off?

EIP-1812

I didn't know about this, so thanks @jaredweinfurtner for pointing it out. It looks great for Ethereum-based uses. But I think it would be valuable to integrate with the W3C VC Data Model, to potentially interoperate with many issuers, holders and verifiers. A linked data proof type based on EIP-712 could enable existing wallet users to make use of these standards while remaining authoritative over what they are signing with their keys.

awoie commented 3 years ago

@clehner The problem with this approach is that the EIP712 input that gets displayed in MM (output from RDF canonicalization) is essentially a blob of data to the user and basically not better than personal.sign. What about the following idea?

// Require VCs to have a `credentialSchema` property pointing to a JSON schema (has to be a hashlink for integrity protection)
credential = createCredential(payload, schema)

// Transform JSON-LD to EIP712 (using JSON schema for type information)
data712 = convertToEIP712(credential)

// Run JCS canonicalization
data712 = canonicalizeJCS(data712)

// Sign the data using web3
signature = web3.signTypedData(data712)

// Add the 712 signature proof
verifiableCredential = addProof(credential, signature, 'EthereumEip712Signature2021')

For verification you would need to build the data712 again from the proof property and the credential itself. Verification would work with the Ethereum key.

You could store the data712 onchain, and use the verifiableCredential offchain.

You could use either an EcdsaSecp256k1VerificationKey2019, EthereumAddress or BlockchainAccountId in the DID Doc if you'd need that.

Update:

Some improvement might be that you would even able to encode the EIP712 types in the JSON schema, or use "A" EIP712 schema (if this exists) directly in the credentialSchema property but I would need to take a closer look at that. For the EIP712 schema, one would need to register a credentialSchema type.

awoie commented 3 years ago

Also tagging @wyc

awoie commented 3 years ago

I would also suggest naming the proof type EthereumEipSignature2021

clehner commented 3 years ago

The problem with this approach is that the EIP712 input that gets displayed in MM (output from RDF canonicalization) is essentially a blob of data to the user and basically not better than personal.sign.

The data as RDF/N-Quads should be able to be copy-pasted into other applications for further rendering or analysis before signing.

Better rendering of signing input will probably require support from the wallet. Or an additional extension to make the wallet render better, e.g. meta-metamask ;)

I attempt to describe the possible approaches for further discussion here: https://github.com/uport-project/ethereum-eip712-signature-2021-spec/issues/8

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

clehner commented 3 years ago

I'm closing this issue, as the idea has been implemented, and a specification is being drafted as a work item of the W3C Credentials Community Group: https://github.com/w3c-ccg/community/issues/194.

EthereumEip712Signature2021 maps linked data documents and proof objects to EIP-712 TypedData using a JSON-based approach, with type definitions. There is an issue to make type generation automatic: https://github.com/uport-project/ethereum-eip712-signature-2021-spec/issues/9 and an issue about preserving the RDF data model: https://github.com/uport-project/ethereum-eip712-signature-2021-spec/issues/10. @spruceid's implementation can be seen at https://github.com/spruceid/ssi/pull/213.

We also still have the suite using arrays of RDF statements (Eip712Signature2021) described earlier in this thread, but are preferring the one that is in CCG, for interoperability.

We've also been exploring using personal_sign in linked data signatures, for situations where EIP-712 signing is not available: https://github.com/spruceid/ssi/pull/212 (EthereumPersonalSignature2021).

While working out the usability implications of signing linked data, we are exploring use of "signing input explainer" tools, to help users make sense of what they are being asked to sign, e.g. https://github.com/spruceid/tzvm2021-explainer

EthereumEip712Signature2021 is specified as a proof type that can be used with existing verification method types (EcdsaSecp256k1VerificationKey2019 or EcdsaSecp256k1RecoveryMethod2020), rather than defining a new corresponding verification method type. Therefore, DID documents, methods and resolvers don't have to be updated to support the signature suite, only signers (VC issuers/presenters), and verifiers need to support the suite. CCG's VC HTTP API has also been updated to accommodate this use case, enabling issuers/presenters to select which proof type they want to created: https://github.com/w3c-ccg/vc-http-api/pull/197 (although admittedly, signing in the browser might not need VC HTTP API).