hildjj / node-cbor

Encode and decode CBOR documents, with both easy mode, streaming mode, and SAX-style evented mode.
MIT License

How to turn off tagging on encode for Uint8Arrays? #191

Closed: Wind4Greg closed this issue 6 months ago

Wind4Greg commented 6 months ago

Is there any way to turn off tagging when encoding Uint8Arrays? Cheers, Greg

hildjj commented 6 months ago

Can you give me a sample input and desired output, please?

Wind4Greg commented 6 months ago

Hi, I'm working on some standards involving verifiable credentials, but a very simple case would be:

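import cbor from 'cbor' // node-cbor; ES module syntax so top-level await works
const byteData = [new Uint8Array([0, 1, 2, 3]), new Uint8Array([4, 5, 6, 7])]
const textData = ["hello", "world"]
const combination = [byteData, textData]
// encodeAsync resolves to a Buffer holding the encoded CBOR
const cborThing2 = await cbor.encodeAsync(combination)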

The hex-encoded output is: 8282d8404400010203d8404404050607826568656c6c6f65776f726c64. If you paste this into the CBOR Playground, you can see tag(64) on the Uint8Arrays. With other JavaScript CBOR libraries, which either don't tag Uint8Arrays or let you turn that tagging off, I get the hex value 828244000102034404050607826568656c6c6f65776f726c64, which has no tag(64) entries.
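To make the difference between the two hex strings concrete, here is a minimal hand-rolled sketch that reproduces both. It is not how node-cbor or any other library is implemented: the `encode` function and its `tagTypedArrays` option are hypothetical names made up for this illustration, and the sketch only handles arrays, text strings, and Uint8Arrays with lengths below 24.

```javascript
// Build a one-byte CBOR head: major type in the top 3 bits, length in the low 5.
function head(majorType, length) {
  if (length >= 24) throw new RangeError('sketch only supports lengths < 24')
  return Buffer.from([(majorType << 5) | length])
}

// Hypothetical encoder for illustration only; `tagTypedArrays` is not a real library option.
function encode(value, { tagTypedArrays = false } = {}) {
  if (value instanceof Uint8Array) {
    const bytes = Buffer.concat([head(2, value.length), Buffer.from(value)])
    // 0xd8 0x40 is the head for tag 64, which marks a uint8 typed array
    return tagTypedArrays ? Buffer.concat([Buffer.from([0xd8, 0x40]), bytes]) : bytes
  }
  if (typeof value === 'string') {
    const utf8 = Buffer.from(value, 'utf8')
    return Buffer.concat([head(3, utf8.length), utf8])
  }
  if (Array.isArray(value)) {
    const items = value.map((item) => encode(item, { tagTypedArrays }))
    return Buffer.concat([head(4, value.length), ...items])
  }
  throw new TypeError('unsupported type in this sketch')
}

const combination = [
  [new Uint8Array([0, 1, 2, 3]), new Uint8Array([4, 5, 6, 7])],
  ['hello', 'world'],
]
const tagged = encode(combination, { tagTypedArrays: true }).toString('hex')
const untagged = encode(combination).toString('hex')
console.log(tagged)   // 8282d8404400010203d8404404050607826568656c6c6f65776f726c64
console.log(untagged) // 828244000102034404050607826568656c6c6f65776f726c64
```

The only difference is the two bytes d8 40 in front of each byte string, so the tagged form is four bytes longer for this input.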

The overall application is selectively disclosed verifiable credentials, where CBOR is used to encode cryptographic information consisting of both binary and text arrays, and people want deterministic output (our inputs are pretty simple). As discussed in RFC 8949, Section 4.2.2 (Additional Deterministic Encoding Considerations), we'd like to turn tagging off.

Cheers Greg

hildjj commented 6 months ago

Nod. cbor2 does this already. Any chance you could switch to that?

hildjj commented 6 months ago

If not, this is a feature request for adding preferWeb to the encoder.

hildjj commented 6 months ago

Sorry, one more comment: if you wrap a Buffer.from() around your Uint8Array, you probably get the output you want.
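A rough sketch of that workaround for nested input like the sample above. `wrapBytes` is a hypothetical helper name, not part of node-cbor, and it assumes the data contains only arrays, strings, and Uint8Arrays.

```javascript
// Recursively replace plain Uint8Arrays with Node.js Buffers.
// Buffer.from(uint8Array) copies the bytes; existing Buffers are returned unchanged.
function wrapBytes(value) {
  if (Buffer.isBuffer(value)) return value
  if (value instanceof Uint8Array) return Buffer.from(value)
  if (Array.isArray(value)) return value.map((item) => wrapBytes(item))
  return value
}

const byteData = [new Uint8Array([0, 1, 2, 3]), new Uint8Array([4, 5, 6, 7])]
const textData = ['hello', 'world']
const wrapped = wrapBytes([byteData, textData])
```

Passing `wrapped` to `cbor.encodeAsync` in place of `combination` should then be checked against the untagged hex value quoted earlier, since the comment above only says this "probably" gives the desired output.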

Wind4Greg commented 6 months ago

Hi @hildjj, I didn't know about your updated cbor2. I will move to that for test vector generation for the spec. One last question, since you're a CBOR expert: would it be reasonable for a spec to tell folks to turn off tagging? That seems harder for implementers, since they have to dig into these options. I was thinking about just advising implementers that different encoders can produce different outputs. Cheers, Greg

hildjj commented 6 months ago

It's completely reasonable for a spec to use a subset of CBOR, and to call that out. Other things you might want to explicitly forbid:

The goal here is that you could theoretically write a CBOR decoder for your protocol with less complexity, smaller size, or faster performance if desired. Your hope would be that a general-purpose CBOR decoder would still decode the CBOR subset you choose.

A more drastic path would be to require a general-purpose CBOR decoder to throw errors for all of your selections above. I would strongly recommend against that, because many generic CBOR decoders do not have knobs for enabling those errors.
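As an illustration of the smaller protocol-specific decoder idea, here is a sketch for one hypothetical subset: definite-length arrays, byte strings, and text strings with lengths below 24, with everything else (including tags) rejected. The subset and the name `decodeSubset` are assumptions chosen to match the sample data in this thread, not a recommendation for any particular spec.

```javascript
// Decoder sketch for a hypothetical CBOR subset (major types 2, 3, 4 only).
function decodeSubset(bytes) {
  let offset = 0

  function readItem() {
    if (offset >= bytes.length) throw new Error('unexpected end of input')
    const initial = bytes[offset++]
    const majorType = initial >> 5
    const length = initial & 0x1f
    // Reject anything outside the subset first, so tags (major type 6) fail clearly
    if (majorType !== 2 && majorType !== 3 && majorType !== 4) {
      throw new Error(`major type ${majorType} is not allowed in this subset`)
    }
    if (length >= 24) throw new Error('only lengths below 24 are supported')

    if (majorType === 4) {
      const items = []
      for (let i = 0; i < length; i++) items.push(readItem())
      return items
    }
    if (offset + length > bytes.length) throw new Error('unexpected end of input')
    const slice = bytes.subarray(offset, offset + length)
    offset += length
    return majorType === 2
      ? new Uint8Array(slice) // byte string: copy out the bytes
      : new TextDecoder('utf-8', { fatal: true }).decode(slice) // text string
  }

  const result = readItem()
  if (offset !== bytes.length) throw new Error('trailing bytes after item')
  return result
}

const decoded = decodeSubset(
  Buffer.from('828244000102034404050607826568656c6c6f65776f726c64', 'hex')
)
console.log(decoded[1]) // [ 'hello', 'world' ]
```

A general-purpose decoder would still accept the untagged input, while this sketch throws on the tagged form because it refuses major type 6.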

Wind4Greg commented 6 months ago

Thanks for the info and advice, @hildjj!