Open theeldermillenial opened 8 months ago
Are there any updates on this issue? I need a way to encode indefinite length arrays using cbor2.dumps()
and can't tell for sure if there's currently any way to do this.
@T-recks I do have an open PR on this, but it's currently held up because it makes the Python implementation diverge from the C extension.
https://github.com/agronholm/cbor2/pull/225
I have not had a chance to come back and finish it off. Basically, the request was that I also modify the C extension so that it stays at parity with the Python implementation.
@agronholm Now that I'm thinking about it, there might be a hacky way to make sure the C-extension and the Python implementation are identical. It would involve overloading the C-extension class to contain the dictionary mapper. Would that be acceptable? Or do you only want a modification to the C-extension?
Feature description
The decoder has no way of differentiating between an indefinite length array and a fixed length array, so a round trip decode and encode cannot be performed without loss of information.
Ideally, fixed length arrays are given a different class type than variable length arrays. For example, a fixed length array may be decoded as a tuple or deque, while an indefinite array could be assigned the type list.
Use case
In pycardano, cbor is hashed. If an array of length two is encoded as an indefinite array (\x9f) rather than as a fixed array of length 2, decoding and then encoding yields a different cbor result, which gives a different hash. This is problematic when verifying cbor contents.
Since cbor2 does not distinguish between fixed and indefinite arrays, pycardano creates a custom encoder that is used to create an indefinite array when requested (even if the array is smaller than 30 values). However, there is no analogous functionality for decoding, and subclassing CBORDecoder to add it will not be straightforward.
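To make the hash mismatch concrete, here is a minimal sketch that uses only the standard library. The byte strings are the two CBOR encodings of the array [1, 2] written out by hand; blake2b-256 is an assumption chosen for illustration and not necessarily the hash used in any particular pycardano code path.

```python
import hashlib

# Two CBOR encodings of the same logical array [1, 2].
definite = bytes([0x82, 0x01, 0x02])          # 0x82: array header carrying length 2
indefinite = bytes([0x9F, 0x01, 0x02, 0xFF])  # 0x9f: indefinite array, 0xff: break


def digest(data: bytes) -> str:
    """Hash the raw CBOR bytes (blake2b-256 is an illustrative choice)."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


# Same logical value, different bytes, therefore different hashes.
same_hash = digest(definite) == digest(indefinite)
print(definite.hex(), indefinite.hex(), same_hash)
```

Both byte strings decode to the same Python list, so once the original framing is lost there is no way to reproduce the bytes that were hashed.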
Thus, the ideal implementation would be to differentiate between these two encodings by using different classes. One mechanism could be as described, or alternatively dummy list classes could be created to distinguish between the two (e.g. FixedArray and IndefiniteArray, both of which are just lists).
I am happy to implement this in any way @agronholm or any other maintainer would like. Just point me in the right direction. The goal is round trip reproduction of cbor regardless of how the array is encoded.
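As a rough sketch of the dummy list class idea, the toy codec below shows how marker subclasses could carry the framing through a round trip. It is not cbor2's API or implementation: FixedArray and IndefiniteArray are the hypothetical names from the proposal, decode and encode are illustrative helpers, and only unsigned integers 0..23 and arrays with fewer than 24 items are supported.

```python
class FixedArray(list):
    """Hypothetical marker: array decoded from a definite-length header."""


class IndefiniteArray(list):
    """Hypothetical marker: array decoded from 0x9f ... 0xff."""


BREAK = 0xFF  # CBOR "break" stop code that closes an indefinite-length item


def decode(data: bytes, pos: int = 0):
    """Decode one item from a toy CBOR subset; return (value, next_pos)."""
    head = data[pos]
    major, info = head >> 5, head & 0x1F
    if major == 0 and info < 24:        # small unsigned integer
        return info, pos + 1
    if major == 4 and info < 24:        # definite-length array
        items = FixedArray()
        pos += 1
        for _ in range(info):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if head == 0x9F:                    # indefinite-length array
        items = IndefiniteArray()
        pos += 1
        while data[pos] != BREAK:
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1           # skip the break byte
    raise ValueError(f"unsupported initial byte 0x{head:02x}")


def encode(value) -> bytes:
    """Encode the same toy subset, choosing the framing from the list class."""
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value < 24:
        return bytes([value])
    if isinstance(value, IndefiniteArray):
        return b"\x9f" + b"".join(encode(v) for v in value) + b"\xff"
    if isinstance(value, list) and len(value) < 24:
        # Plain lists and FixedArray both get a definite-length header.
        return bytes([0x80 | len(value)]) + b"".join(encode(v) for v in value)
    raise ValueError(f"unsupported value {value!r}")


# Round trip: a fixed array that contains an indefinite array.
raw = bytes.fromhex("82019f0203ff")
value, _ = decode(raw)
print(type(value).__name__, type(value[1]).__name__, encode(value) == raw)
```

Because both markers subclass list, code that already expects plain lists keeps working, and comparisons such as value == [1, [2, 3]] still hold.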