Closed mxmauro closed 3 years ago
I'm not following.
Canonical should work for interface{} keys, and you should not have to do anything special.
Please attach a reproducer so I can reproduce your problem, or point out how to fix it (if not a bug).
Ping me on this using @ugorji and I will follow up (including re-opening the issue if it is a bug).
Hi @ugorji I created an example:
The code decodes and reencodes a msgpack-based block of the Algorand blockchain, and the result differs from what is expected.
Things to take into account:
- `map[string]interface{}` (which works almost all the time) cannot be used for decoding, because this particular block has an internal struct of type `map[uint64]...`
- The decoded data contains `block` and `cert`. When reencoding, `cert` is added first because the code is unable to sort items.

Regards, Mauro.
@mxmauro
Thanks for the reproducer.
That's a tough test.
For pragmatic reasons, we haven't been disciplined in maintaining the format across bug fixes; that is the nature of bug fixes. For example, canonical previously treated all numeric types by looking only at the kind, ignoring the fact that a user might have created extensions for them, or implemented binaryMarshaler or codecSelfer or otherwise. We fixed that bug. Consequently, on reencoding, the stream may be more correct, and therefore different from the original.
Canonical is not used by most users, so it hasn't had the bulk of the user base exercising it and finding issues, especially across the different paths of reflection and code generation. I think it's in a much better place now, and as bug fixes in it taper off, the encoded stream for a given set of EncodeOptions should stay consistent.
Consequently, a more complete test would be to validate that, on decoding, it's the same input object, and on re-encoding, it's the same byte stream.
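The round-trip check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `codecFuncs`, `checkRoundTrip`, and `jsonCodec` are hypothetical names, and `encoding/json` is used as a stand-in for the msgpack handle so the example runs with the standard library alone. With the codec package, the two functions would wrap an encoder and decoder configured with the same EncodeOptions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"reflect"
)

// codecFuncs bundles a decoder and an encoder (hypothetical helper type).
type codecFuncs struct {
	decode func([]byte) (interface{}, error)
	encode func(interface{}) ([]byte, error)
}

// checkRoundTrip decodes input, encodes the result, then decodes and encodes
// once more. It reports whether both decoded objects are equal and whether
// both encoded streams are byte-for-byte identical.
func checkRoundTrip(c codecFuncs, input []byte) (bool, bool, error) {
	obj1, err := c.decode(input)
	if err != nil {
		return false, false, err
	}
	enc1, err := c.encode(obj1)
	if err != nil {
		return false, false, err
	}
	obj2, err := c.decode(enc1)
	if err != nil {
		return false, false, err
	}
	enc2, err := c.encode(obj2)
	if err != nil {
		return false, false, err
	}
	return reflect.DeepEqual(obj1, obj2), bytes.Equal(enc1, enc2), nil
}

// jsonCodec is a stand-in codec; encoding/json writes map keys in sorted order.
func jsonCodec() codecFuncs {
	return codecFuncs{
		decode: func(b []byte) (interface{}, error) {
			var v interface{}
			err := json.Unmarshal(b, &v)
			return v, err
		},
		encode: func(v interface{}) ([]byte, error) { return json.Marshal(v) },
	}
}

func main() {
	input := []byte(`{"cert":{"rnd":1},"block":{"rnd":1}}`)
	sameObject, sameBytes, err := checkRoundTrip(jsonCodec(), input)
	fmt.Println(sameObject, sameBytes, err)
}
```

Note that the check compares the first re-encoding against the second, not against the original input: the first re-encoding may legitimately differ from the input (here the keys get reordered), but from then on the stream should be stable.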
Hope this helps.
BTW, I'm interested in your notes above
Hi @ugorji you are welcome.
The reason for canonical is that you generate a hash of the encoded data, and that hash becomes the ID of the block, transaction, or whatever, so the encoding cannot change.
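To illustrate why the byte stream has to stay stable, here is a minimal sketch. It assumes SHA-256 as a stand-in hash and uses JSON-like text instead of msgpack; `blockID` is a hypothetical helper, not a function from any of the projects mentioned here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blockID derives an identifier from the encoded bytes (hypothetical helper;
// the hash function is an assumption made for this example).
func blockID(encoded []byte) string {
	sum := sha256.Sum256(encoded)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Same logical content, different key order in the encoded stream.
	a := []byte(`{"block":1,"cert":2}`)
	b := []byte(`{"cert":2,"block":1}`)

	// The IDs differ even though the decoded objects would be equal.
	fmt.Println(blockID(a) == blockID(b))
}
```

Any change in key order, or in how a value is written, produces a different ID, which is why the encoder must always emit keys in the same order.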
About the map of the map, see these links:
https://github.com/algorand/go-algorand/blob/0111cb102c870eb5c107479e13c34cb6d62cb32c/data/transactions/teal.go#L36
https://github.com/algorand/go-algorand/blob/0111cb102c870eb5c107479e13c34cb6d62cb32c/data/basics/teal.go#L73
The comments above each marked line contain the description of the key.
Argh ... tough. Completely agree ... that CANNOT change.
Two things I updated in the code:
If you didn't implement extensions and you didn't implement those interfaces, then canonical shouldn't change due to the bug fixes. However, your omitemptyarray is automatically honored by omitempty.
Question for you: Is there a reason why you added a new flag (omitemptyarray) as opposed to just extending the support of omitempty?
Hi Ugorji, little story
I had to deal with a scenario where I have to decode a map inside a msgpack, remove one item, and reencode it. But the map had nested maps, some using `string` and others using `uint64` as the key types.

Because of this, I was forced to use `map[interface{}]interface{}`, but then I was unable to encode it again using the canonical sort because the code does not handle `interface{}` keys.

I ended up with this piece of code that might help anyone:
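As a rough illustration of the idea (not the snippet from this post, and not the ordering the codec library itself applies), one way to get a deterministic order for mixed `string` and `uint64` keys is to sort them by type first and by value second. The names `keyRank`, `lessKey`, and `sortedKeys`, and the rule that numeric keys come before string keys, are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// keyRank groups keys by type so mixed-type keys get a stable order.
// The choice "uint64 before string" is arbitrary for this example.
func keyRank(k interface{}) int {
	switch k.(type) {
	case uint64:
		return 0
	case string:
		return 1
	default:
		return 2
	}
}

// lessKey reports whether key a sorts before key b.
func lessKey(a, b interface{}) bool {
	ra, rb := keyRank(a), keyRank(b)
	if ra != rb {
		return ra < rb
	}
	switch av := a.(type) {
	case uint64:
		return av < b.(uint64)
	case string:
		return av < b.(string)
	default:
		// Fallback for other key types: compare their printed form.
		return fmt.Sprint(a) < fmt.Sprint(b)
	}
}

// sortedKeys returns the keys of m in a deterministic order.
func sortedKeys(m map[interface{}]interface{}) []interface{} {
	keys := make([]interface{}, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return lessKey(keys[i], keys[j]) })
	return keys
}

func main() {
	m := map[interface{}]interface{}{
		"cert":    1,
		uint64(7): "x",
		"block":   2,
		uint64(3): "y",
	}
	fmt.Println(sortedKeys(m)) // prints [3 7 block cert]
}
```

When re-encoding, the same ordering would have to be applied recursively to every nested map, writing each key and value in the order returned by `sortedKeys`.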