Faced the same issue.
@letmaik I've read the spec, given this a ton of thought, and I feel that overwriting is appropriate handling per the spec. It seems to leave room for implementation-specific parsing, and I believe that overwriting on a duplicate key is appropriate fault tolerance.
While there are more complex handling scenarios in other implementations (see for example https://github.com/fxamacker/cbor/issues/122), I question the complexity of such a feature at this time.
The fault tolerance in the lib is also in line with what developers should expect from `JSON.parse`. Example:
```js
const json_dup = `
{
  "key1": "val1",
  "key2": "val2a",
  "key2": "val2b",
  "key3": "val3"
}
`
console.log(JSON.parse(json_dup))
```
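With duplicate keys, `JSON.parse` keeps the last occurrence, so this logs `{ key1: 'val1', key2: 'val2b', key3: 'val3' }`.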
For these reasons, I would like to close this issue as won't fix; however, I feel that adding a feature for optional strictness like fxamacker/cbor, with a default of fault tolerance, would be acceptable. Anyone submitting a PR implementing this feature would be welcome.
This feature would be required to implement COSE validators in a secure manner: https://datatracker.ietf.org/doc/html/rfc8152
> Applications MUST NOT generate messages with the same label used twice as a key in a single map. Applications MUST NOT parse and process messages with the same label used twice as a key in a single map. Applications can enforce the parse and process requirement by using parsers that will fail the parse step or by using parsers that will pass all keys to the application, and the application can perform the check for duplicate keys.
I believe, from a language perspective, that the first MUST NOT is satisfied by the encode process.
The second MUST NOT I am interpreting as "don't present more than one key/value pair with the same label to the application". Can you help me understand how this is flawed? I must be missing something (and there are a few other CBOR libs that would have also missed this; not an excuse, just part of my reasoning).
The decode process would need to add some intermediate code for either the hard fail or the passing of keys (maybe via some "reviver"-style function that can pump out the key/value pairs as they are decoded) to the application.
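For illustration only, here is a minimal sketch of the application-side check the spec allows, assuming a parser that passes every raw key/value pair through instead of deduplicating; the helper name and entries shape are hypothetical, not part of this library:

```ts
// Hypothetical application-side check, assuming the decoder can surface raw
// map entries as [key, value] pairs rather than an already-deduplicated object.
function assertNoDuplicateLabels(entries: Array<[unknown, unknown]>): void {
  const seen = new Set<string>();
  for (const [key] of entries) {
    // Prefix with the type so the number 1 and the string "1" don't collide.
    const label = `${typeof key}:${String(key)}`;
    if (seen.has(label)) {
      throw new Error(`Duplicate label in CBOR map: ${String(key)}`);
    }
    seen.add(label);
  }
}
```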
Thanks for the cycle of feedback, @letmaik.
I'm in favor of the following, with proper documentation: `mode: 'loose' | 'strict'`, where `loose` is the default. Something like:
```ts
interface CBOROptions {
  /** Set the dictionary type for supported environments; defaults to `object`. */
  dictionary?: 'object' | 'map';
  /** Set the mode to `strict` to hard fail on duplicate keys in CBOR dictionaries; defaults to `loose`. */
  mode?: 'loose' | 'strict';
  /** For environments which support Web Streams, get a new ReadableStream for every decode. */
  observe?: (stream: ReadableStream<{ key: any, value: any }>) => void;
}

/** Sets or gets the internal options of CBOR decode/encode. */
CBOR.options(options?: CBOROptions): readonly CBOROptions;
```
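For illustration, usage of that module-wide setter might look like the sketch below (hypothetical, and it gets dropped later in this thread; `buffer` is an assumed `ArrayBuffer`):

```ts
// Hypothetical usage of the proposed module-wide options (not the final API).
CBOR.options({ dictionary: 'map', mode: 'strict' });
// Every subsequent decode would share this global configuration, e.g.
// failing hard when a CBOR map contains duplicate keys.
const decoded = CBOR.decode(buffer);
```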
What do you think of this idea @letmaik?
Setting options as global state is quite bad I think. It should be an argument to the relevant encode/decode functions. Otherwise two unrelated parts in an application can't use the library with different options at the same time.
Otherwise, the `dictionary` and `mode` options look good. I haven't really used streams enough to make a recommendation on the `observe` API, but intuitively I would suggest leaving such an API out for now, until someone asks for it. It's easy to mess up the design of such things.
Yuck, you're right, module-wide options wouldn't work well in this context at all.
What do you think of putting all the options in an options bag to be passed to `decode`? This would be a breaking change to the decode API, but I'm planning on making this a 1.0 release, so I think that's acceptable.
Current API:
```ts
CBOR.decode<T = any>(
  data: ArrayBuffer | SharedArrayBuffer,
  tagger?: TaggedValueFunction,
  simpleValue?: SimpleValueFunction,
) => T
```
Options bag API:
```ts
interface CBOROptions {
  /** Set the dictionary type for supported environments; defaults to `object`. */
  dictionary?: 'object' | 'map';
  /** Set the mode to `strict` to hard fail on duplicate keys in CBOR dictionaries; defaults to `loose`. */
  mode?: 'loose' | 'strict';
  /** For environments which support Web Streams, get a new ReadableStream for each encountered object. */
  observe?: (stream: ReadableStream<{ key: any, value: any }>, key?: any) => void;
  /** A function that extracts tagged values. This function is called for each member of an object. */
  tagger?: TaggedValueFunction;
  /** A function that extracts simple values. This function is called for each member of an object. */
  simpleValue?: SimpleValueFunction;
}

CBOR.decode<T = any>(
  data: ArrayBuffer | SharedArrayBuffer,
  options?: CBOROptions,
) => T
```
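For illustration, a call under this proposed shape might look like the sketch below (hypothetical; `buffer` is an assumed `ArrayBuffer` and the option values are just examples):

```ts
// Hypothetical call using the proposed options bag; names here are assumed.
const result = CBOR.decode(buffer, {
  dictionary: 'map', // decode CBOR maps into Map instances
  mode: 'strict',    // throw on duplicate keys instead of overwriting
});
```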
Side note: the more I think about the observe API, the more I dislike it and the less likely I am to include it in its current proposed form.
Yeah, I'm going to drop `observe` altogether and add `reviver` to the decode API and `replacer` to the encode API. I have a use case for that functionality, and it will be more ergonomic since it replicates something I like about the JSON APIs.
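For reference, this is the `JSON` behavior being replicated (the exact CBOR `reviver`/`replacer` signatures aren't specified here): `JSON.parse` accepts a reviver and `JSON.stringify` a replacer, each called once per key/value pair.

```ts
// JSON.parse reviver: transforms values as they are parsed.
const revived = JSON.parse('{"a":1,"b":2}', (key, value) =>
  typeof value === 'number' ? value * 10 : value,
); // { a: 10, b: 20 }

// JSON.stringify replacer: filters or transforms values as they are serialized.
const replaced = JSON.stringify({ a: 1, secret: 'x' }, (key, value) =>
  key === 'secret' ? undefined : value,
); // '{"a":1}'
```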
The options bag looks fine, I'm not worried about breaking changes for 1.0.
@letmaik Hey, may I pick your brain a little bit more? What do you think of the docs here? https://doc.deno.land/https://raw.githubusercontent.com/aaronhuggins/cbor-redux/0c8b3f4fa619fee986dd91e254d999fc9cfd396e/mod.ts
That document should be largely applicable to Node/Deno/Web. Would be very happy if you don't mind providing more feedback, but I understand if that's too much to ask.
I've got everything implemented per #17 for this issue and many others. I think I'm close to ready for a 1.0.0 release candidate. I really want to get CI/CD set up before cutting a prerelease.
Either way, I'll update this issue once a prerelease is cut so you can take it for a spin.
The docs look ok to me, I think some code examples wouldn't hurt.
Maps with duplicate keys are not valid according to the CBOR spec. While the spec leaves it to the specific protocol where exactly this is dealt with, I think it would be trivial to add this check to cbor-redux in this code path: https://github.com/aaronhuggins/cbor-redux/blob/22ba58fb7d8d30e2fc85c2f9dbad6f7c53785037/src/CBOR.ts#L230-L236
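A minimal sketch of the kind of guard that could live in that code path (hypothetical; the actual cbor-redux decode loop at the linked lines may be structured differently):

```ts
// Hypothetical helper, not the actual cbor-redux code: when inserting a decoded
// key/value pair into the result, strict mode fails on a repeated key, while
// loose mode lets the last value win (matching JSON.parse behavior).
function insertMapEntry(
  target: Record<string, unknown>,
  key: string,
  value: unknown,
  mode: 'loose' | 'strict',
): void {
  if (mode === 'strict' && Object.prototype.hasOwnProperty.call(target, key)) {
    throw new Error(`Duplicate key while decoding CBOR map: ${key}`);
  }
  target[key] = value;
}
```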