[X] I have searched the existing issues and didn't find my bug already reported there
[X] I have checked that my bug is still present in the latest release
cbor2 version
5.6.2
Python version
3.12.2
What happened?
The documentation on "Customizing encoding and decoding" currently states the following in the leading paragraph (emphasis mine):
> On the encoder side, this is accomplished by passing a callback as the `default` constructor argument. This callback will receive an object that the encoder could not serialize on its own. The callback should then return a value that the encoder can serialize on its own, ...
This description seems to roughly match the API provided by the `default=` argument of the `json` module.
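For reference, here is a minimal sketch of the `json` convention being compared against: the callback returns a replacement value and the encoder serialises that value itself. The `Point` class and the `to_jsonable` name are made up for illustration.

```python
import json


class Point:
    """Illustrative type that json cannot serialise on its own."""

    def __init__(self, x, y):
        self.x, self.y = x, y


def to_jsonable(obj):
    # json calls this for objects it cannot handle and serialises
    # whatever the callback returns.
    if isinstance(obj, Point):
        return {"x": obj.x, "y": obj.y}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps({"origin": Point(0, 0)}, default=to_jsonable)
print(encoded)  # {"origin": {"x": 0, "y": 0}}
```

The key point is that the callback never touches the output stream; it only maps an unknown object to a known one.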
Based on this description, I would have expected the following code to work:
However, in reality, the encoder expects the `default` callback to call `encoder.encode(...)` itself and serialise exactly one value. If you forget to do that, you silently end up with an invalid CBOR stream.
Thus, I believe this manifests two related but separate issues:
1. The documentation for the `default=` argument is misleading.
2. The current implementation of the serialiser does not detect wrong use of its API.
The fix for the first one is obvious (in fact, further down the documentation, there is an example that shows the right use of the API - otherwise I wouldn't have figured this out at all).
I argue the second issue should be fixed as well; it seems too easy to accidentally serialise either no value or multiple values to the encoder, and the consequences are too dire and too difficult to debug. Because of the misuse potential of the current API, I further believe that the `json`-like API is in fact better, but it may be too late to change the API at this point for backwards compatibility reasons. In that case, I suggest that the encoder actively count the number of objects serialised within a call to `default` and raise an exception if that number is not one. Care must be taken when `default=` gets called recursively, though.
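To make the counting idea concrete, here is a rough sketch using a toy encoder. This is not cbor2's implementation: `ToyEncoder`, `Point`, `good_default` and `json_style_default` are hypothetical names, and the "stream" is just a Python list. The assumption is that keeping one counter per nesting level is enough to handle recursive `default` calls.

```python
class ToyEncoder:
    """Hypothetical stand-in for an encoder; not cbor2's implementation."""

    def __init__(self, default=None):
        self.items = []        # emitted items, in stream order
        self._default = default
        self._frames = [0]     # items written at each nesting level

    def encode(self, value):
        self._frames[-1] += 1   # this call contributes one item to its parent
        self._frames.append(0)  # anything nested is counted in its own frame
        try:
            if isinstance(value, (int, str)):
                self.items.append(value)
            elif isinstance(value, list):
                self.items.append(("list", len(value)))
                for element in value:
                    self.encode(element)
            elif self._default is not None:
                self._default(self, value)
                written = self._frames[-1]  # items the callback encoded directly
                if written != 1:
                    raise ValueError(
                        f"default callback encoded {written} items, expected exactly 1"
                    )
            else:
                raise TypeError(f"cannot encode {type(value).__name__}")
        finally:
            self._frames.pop()


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


def good_default(encoder, value):
    encoder.encode([value.x, value.y])  # exactly one item


def json_style_default(encoder, value):
    return [value.x, value.y]  # returns instead of encoding: zero items
```

With this sketch, `ToyEncoder(default=json_style_default).encode(Point(1, 2))` raises `ValueError` instead of silently producing a broken stream, while `good_default` passes even when `Point` objects are nested inside lists. Note that in this toy version the check only fires after the callback has run, so anything it wrote is already in the output by then.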
How can we reproduce the bug?
See MWE above.