greglook / clj-cbor

Native Clojure CBOR codec implementation.
The Unlicense

Canonical mode #2

Closed: greglook closed 7 years ago

greglook commented 7 years ago

The CBORCodec type should support canonical mode per the RFC. In this mode, map keys and set entries are written to the output in sorted order, so data structures with equal content always serialize to the same bytes. This property is useful in many situations, including caching and content-addressable storage.

Note that the specification calls for sorting entries by their serialized bytes, which requires serializing every entry into memory before any of them can be written. This can be expensive and may exhaust memory, so canonical mode must not be the default. In Puget, the compromise was to sort the output only if the data structure was below a configurable size, but that doesn't work for CBOR, because a set with only two or three huge entries could cause issues just as much as a set with thousands of small entries.
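
To make the buffering cost concrete, here is a minimal Python sketch of canonical-mode encoding for a small subset of CBOR (unsigned integers, text strings, maps, and sets). It is not the clj-cbor implementation; the names `encode` and `encode_head` are hypothetical. It assumes the RFC 7049 canonical ordering (shorter encoded keys first, then bytewise comparison) and assumes sets are written as tag 258 wrapping an array.

```python
import struct


def encode_head(major, n):
    """Encode the initial byte plus the shortest-form argument for a major type."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 0x100000000:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def canonical_key(encoded):
    """Sort key: encoded length first, then bytewise order (assumed RFC 7049 rule)."""
    return (len(encoded), encoded)


def encode(value):
    """Encode a value from a small CBOR subset in canonical form."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int) and value >= 0:
        return encode_head(0, value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return encode_head(3, len(data)) + data
    if isinstance(value, dict):
        # Every entry is serialized into memory before any of it is written;
        # this buffering is the cost discussed above.
        entries = [(encode(k), encode(v)) for k, v in value.items()]
        entries.sort(key=lambda kv: canonical_key(kv[0]))
        return encode_head(5, len(entries)) + b"".join(k + v for k, v in entries)
    if isinstance(value, (set, frozenset)):
        # Assumption: sets are represented as tag 258 wrapping an array.
        items = sorted((encode(x) for x in value), key=canonical_key)
        return encode_head(6, 258) + encode_head(4, len(items)) + b"".join(items)
    raise TypeError("unsupported type: " + type(value).__name__)
```

With this sketch, `encode({"b": 1, "a": 2})` and `encode({"a": 2, "b": 1})` return the same bytes, so a hash of the output can serve as a stable cache key or content address. The peak memory used by the `dict` and `set` branches grows with the total encoded size of the entries, not with their count, which is why a count-based threshold does not bound it.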

greglook commented 7 years ago

Done in e8e2639a560e48542b20964c756dab5ca927c957.