w3c / vc-di-ecdsa

Data Integrity specification for ECDSA using NIST-compliant curves
https://w3c.github.io/vc-di-ecdsa/
Other
9 stars 9 forks source link

Tell implementers to use major type 2 encoding (not tag 64 -- Uint8Array) for CBOR #52

Closed dlongley closed 2 months ago

dlongley commented 5 months ago

The fact that JavaScript (on some platforms) uses Uint8Array to express byte strings / byte arrays should not escape into the proof serialization format. The proof data serializes bytes, it doesn't matter that JavaScript in a browser interacts with bytes using a Uint8Array interface. In other words, CBOR has a special tag that can be inserted to very specifically indicate that a Uint8Array is used -- when the simpler major type 2 expression (byte string) will do instead.

We should tell implementers to use major type 2 for simplicity (and not force implementers to work with Uint8Arrays, especially when working in other languages that express bytes in another way that wouldn't result in a special CBOR tag) but perhaps note that it does not affect the security / verifiability of the proofs as it's just a detail of the serialization format of the proof value.

filip26 commented 4 months ago

In the Example 60 - Signed Base Document Tag(64) is added to byte arrays.

Wind4Greg commented 4 months ago

I'll dig into this @dlongley and come up with a PR. I'll also check on the test vector generation @filip26 to conform to the PR.

Cheers Greg

Wind4Greg commented 4 months ago

Hi folks, had to dig into CBOR a bit to start to sort this out. First, in CBOR types are different from tags. See CBOR Major Types and for our byte arrays for signatures, keys, etc... major type 2 encoding is being used in all cases including the current test vectors.

Where we are seeing a difference between CBOR libraries is in the Tagging of items. "In CBOR, a data item can be enclosed by a tag to give it some additional semantics, as uniquely identified by a tag number."

Decoders do not need to understand tags of every tag number, and tags may be of little value in applications where the implementation creating a particular CBOR data item and the implementation decoding that stream know the semantic meaning of each item in the data flow. The primary purpose of tags in this specification is to define common data types such as dates. A secondary purpose is to provide conversion hints when it is foreseen that the CBOR data item needs to be translated into a different format, requiring hints about the content of items. Understanding the semantics of tags is optional for a decoder; it can simply present both the tag number and the tag content to the application, without interpreting the additional semantics of the tag.

I did a quick survey of the most popular JavaScript CBOR libraries (ordered from most popular): cbor -- this library automatically tags Uint8Arrays, raised an issue and working with the authors to see about making this optional; cborg -- this library doesn't seem to tag Uint8Arrays by default. cbor-x -- this library has an option (though a bit tricky to figure out how to use it) to turn off tagging of Uint8Arrays (it is on by default).

Controlling tagging requires more work by the implementer and may not be supported by all libraries in all languages. Are we trying for a deterministic encoding as discussed in RFC8949: Deterministically Encoded CBOR or do we just want to advise them that alternative valid encodings can be produced?

filip26 commented 4 months ago

I can confirm tags are not an issue to verify a signature. I've found that the example uses tag(64) when I was implementing it in Java, and was curious why my signature does not match the example with the same input vectors.

FYI: it's not hard to add a tag in Java, but I would rather recommend not using tags at all.

SDProofValue.java - tags added only to verify the implementation generates the same result as in the example.

dlongley commented 4 months ago

+1 to not using tags at all, it's extra complexity we don't need. It's probably just JavaScript implementations that have some issue with this. Every other language is likely not to bump into this at all and using tags will probably create an interop problem for all of them (a larger set). The JS implementations (1-2 libs?) that have trouble should be fixed (or not used in implementations, e.g., just use cborg which does not have the problem).

Wind4Greg commented 4 months ago

Hi all (@filip26 and @dlongley), had a conversation with the author of the most popular JavaScript CBOR library : cbor, he has a new library cbor2 which doesn't tag Uint8Arrays by default. Our discussion can be found here: node-cbor issue 191.

He thought it would be reasonable to turn off tagging as part of our specification, he also had some other items he might recommend but they don't seem to apply to our use-case (we're either byte strings or arrays of text).

So do we want a MUST turn off tagging, or a SHOULD turn off tagging? Or something less rigid? Let me know what folks think and I can come up with some text and regenerate the test vectors in the same PR.

Wind4Greg commented 4 months ago

Hmm, @filip26 I didn't see your name come up when adding reviewers to the above PR. Can you take a look. Cheers Greg

filip26 commented 4 months ago

@Wind4Greg thank you, but I'm not a member of the group, my review is not "substantive" ;)

msporny commented 2 months ago

PR #59 has been merged to address this issue; closing.