Ledest / ecbor

Erlang CBOR library
GNU General Public License v3.0

[Feature Request] Ordered Representation of CBOR decoding maps (rather than map representation). #2

Open daidoji opened 7 months ago

daidoji commented 7 months ago

Love the library. We have an application that uses CBOR and need to deserialize CBOR maps to an ordered representation (similar to what you're doing at intermediate steps in the code). Something similar to Jason.OrderedObject (Elixir JSON library) or erlang-msgpack deserializing to jsx or jiffy ordered property lists.

Glad to push a PR if you can give us direction. We've been programming in Elixir but have some Erlang experience.

Ledest commented 7 months ago

A map is an unordered type (by definition). You can serialize a list of {Key, Value} tuples to CBOR, or use maps:iterator/2 (Erlang/OTP >= 26) to get the desired order after deserialization from CBOR.

daidoji commented 7 months ago

Agreed completely. Maps are unordered by definition. Unfortunately, we're operating in a context where ordered maps are required for the cryptography to work out, so ordered maps have become a requirement in both serialization and deserialization.

This feature would also be for our own use, as we have to consume payloads that write maps in a fixed order (insertion order, to be exact) within those same security contexts.

So we get a CBOR payload on the wire and we need to read that map in the order it was serialized. This feature request of a proplist (similar to these other libraries) would help us to do that. We're glad to help with a PR if you give us direction on how to use maps:iterator to achieve that. It seems that the ordered option of that function always yields map key order, whereas what we want is the order in which the pairs come off the stream.
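To make the request concrete, here is a minimal sketch of wire-order decoding, written in Python as a language-neutral illustration. It is not ecbor's API or implementation; the names `decode_ordered` and `_read_head` are hypothetical, and the sketch assumes definite-length items only and handles just integers, byte strings, text strings, arrays, and maps.

```python
def _read_head(data, pos):
    """Return (major_type, argument, next_pos) for the item starting at pos."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info in (24, 25, 26, 27):
        size = 1 << (info - 24)  # argument occupies 1, 2, 4 or 8 bytes
        return major, int.from_bytes(data[pos:pos + size], "big"), pos + size
    raise ValueError("indefinite lengths and reserved values are not handled here")


def decode_ordered(data, pos=0):
    """Decode one CBOR item and return (value, next_pos).

    Maps are returned as lists of (key, value) tuples in the order the
    pairs appear on the wire, instead of as an unordered dictionary.
    """
    major, arg, pos = _read_head(data, pos)
    if major == 0:  # unsigned integer
        return arg, pos
    if major == 1:  # negative integer
        return -1 - arg, pos
    if major == 2:  # byte string
        return bytes(data[pos:pos + arg]), pos + arg
    if major == 3:  # text string
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array
        items = []
        for _ in range(arg):
            item, pos = decode_ordered(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map: keep pairs in wire order
        pairs = []
        for _ in range(arg):
            key, pos = decode_ordered(data, pos)
            value, pos = decode_ordered(data, pos)
            pairs.append((key, value))
        return pairs, pos
    raise ValueError("tags, floats and simple values are not handled here")


# A two-entry map whose keys arrive as "b" first, then "a".
payload = bytes.fromhex("a2616201616102")
pairs, _ = decode_ordered(payload)
print(pairs)  # [('b', 1), ('a', 2)]
```

In this sketch arrays and maps both come back as Python lists (maps as lists of tuples); a real implementation would likely want a distinct wrapper type so the two cannot be confused.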

Ledest commented 7 months ago

According to RFC 8949:

The CBOR data model for maps does not allow ascribing semantics to the order of the key/value pairs in
the map representation. Thus, a CBOR-based protocol MUST NOT specify that changing the key/value pair
order in a map changes the semantics...

So... I don't know :(

Ledest commented 7 months ago

And I forgot that ecbor encodes maps (I hope) to CBOR according to RFC 8949's Section 4.2.1 :)

daidoji commented 7 months ago

> And I forgot that ecbor encodes maps (I hope) to CBOR according to RFC 8949's Section 4.2.1 :)

That might be difficult with indefinite-length maps. The deterministic encoding precludes such maps. https://datatracker.ietf.org/doc/html/rfc8949#section-4.2.1-2.2

That being said, I completely agree with you that the CBOR data model ascribes no particular ordering to maps in the general scheme, and a sorted key ordering in the deterministic scheme. See https://datatracker.ietf.org/doc/html/rfc8949#section-4.2.1-2.3.1

However, our protocol suite unfortunately needs arbitrary ordering that isn't in lexicographical order. Other libraries preserve order (and don't use the deterministic encoding described in the RFC) as an artifact of those languages (like Python), but unfortunately these are the reference implementations we're trying to build against.

Would you take a PR allowing it under consideration? We really love the work otherwise. It seems like all the other Elixir/Erlang CBOR libraries have been a bit abandoned/stale for a while now.
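For reference, the deterministic scheme in RFC 8949 Section 4.2.1 sorts map keys by the bytewise lexicographic order of their encoded form. The following is a minimal Python sketch of that rule, not ecbor's code; `encode_head`, `encode_key`, and `deterministic_key_order` are hypothetical helper names, and only integer and text-string keys are handled.

```python
def encode_head(major, arg):
    """Encode a CBOR head with the shortest possible argument."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large for CBOR")


def encode_key(key):
    """Encode an int or str key; other key types are out of scope here."""
    if isinstance(key, int) and not isinstance(key, bool):
        if key >= 0:
            return encode_head(0, key)
        return encode_head(1, -1 - key)
    if isinstance(key, str):
        raw = key.encode("utf-8")
        return encode_head(3, len(raw)) + raw
    raise TypeError("unsupported key type in this sketch")


def deterministic_key_order(keys):
    """Sort keys by the bytes of their encoding, not by their values."""
    return sorted(keys, key=encode_key)


ordered = deterministic_key_order(["aa", 100, "z", -1, 10])
print(ordered)  # [10, 100, -1, 'z', 'aa']
```

The result follows the encoded bytes (0x0a, 0x1864, 0x20, 0x617a, 0x626161), which is why -1 sorts after 100 and "z" sorts before "aa".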

Ledest commented 7 months ago

> However, our protocol suite unfortunately needs arbitrary ordering that isn't in lexicographical order.

So your protocol suite unfortunately does not match CBOR standards :( Maybe you want some "my-cool-CBOR" format and a my_cool_ecbor library (which you can create by forking ecbor) :) Or you can use some kind of flat array (indefinite-length, like a list (`[Key1, Value1, Key2, Value2, ...]`), or definite-length, like a tuple) - I don't know... Or you can try to enhance RFC 8949 :)

Two main goals of ecbor: 1) maximum compliance with standards; 2) AnyErlangTerm = ecbor:decode(ecbor:encode(AnyErlangTerm))
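The flat-array idea keeps the data standards-compliant, because CBOR arrays do preserve element order. A minimal Python sketch of the conversion in both directions follows; `pairs_to_flat` and `flat_to_pairs` are hypothetical names, and the CBOR encoding step itself is left to whichever library is in use.

```python
def pairs_to_flat(pairs):
    """Turn [(k1, v1), (k2, v2), ...] into [k1, v1, k2, v2, ...]."""
    flat = []
    for key, value in pairs:
        flat.extend((key, value))
    return flat


def flat_to_pairs(flat):
    """Turn [k1, v1, k2, v2, ...] back into [(k1, v1), (k2, v2), ...]."""
    if len(flat) % 2:
        raise ValueError("flat key/value array must have an even length")
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


flat = pairs_to_flat([("b", 1), ("a", 2)])
print(flat)                 # ['b', 1, 'a', 2]
print(flat_to_pairs(flat))  # [('b', 1), ('a', 2)]
```

This only helps when both sides of the protocol agree to use arrays; it does not address payloads that already arrive as CBOR maps.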

Support of "custom non-Erlang types with custom tags and custom encode fun" may be interesting, but it is non-trivial and not currently planned.

> Would you take a PR allowing it under consideration?

If something can be implemented without breaking or expanding the standard - welcome.

Ledest commented 7 months ago

> That might be difficult with indefinite-length maps.

ecbor doesn't encode "indefinite-length maps".

daidoji commented 7 months ago

Sorry, I meant indefinite length lists.

It's not our encoding, unfortunately. All sane people would choose ASN.1 and DER, outside of people who write encodings for hard real-time IoT networks, and even then that's probably still the right choice. These constraints have been imposed on us by the spec author :-(

We will probably fork, though, if you don't think it's appropriate. In cryptographic protocols, canonical representations end up being super important. This is kind of the same reason tags are out: since the reference implementation doesn't use tags, our signatures would be all out of whack. The spec author just happened to choose a very annoying canonical form (the order of things as they exist on the wire) for every language that isn't JavaScript or Python, imo. We've had the same difficulties with MessagePack and JSON.

However, thank you for writing the library and making it GPL3. It's a very nice implementation imo. If we can find PRs to help you along we'll try to remember to push them back here.
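To illustrate the point about canonical representations, the Python sketch below hashes two CBOR encodings of the same two-entry map that differ only in pair order. SHA-256 stands in for whatever signature scheme the protocol actually uses, which this thread does not specify.

```python
import hashlib

# Two encodings of the same logical map, differing only in pair order.
first = bytes.fromhex("a2616101616202")   # {"a": 1, "b": 2}
second = bytes.fromhex("a2616202616101")  # {"b": 2, "a": 1}

digest_first = hashlib.sha256(first).hexdigest()
digest_second = hashlib.sha256(second).hexdigest()

# The maps are equal as unordered data, but the bytes and digests are not.
print(first == second)                # False
print(digest_first == digest_second)  # False
```

A decoder that returns an unordered map cannot tell these two payloads apart, so re-encoding after decoding may not reproduce the bytes that were originally signed.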