kisielk / og-rek

ogórek is a Go library for encoding and decoding pickles.
MIT License

decoder: long cannot be really used as dict key #55

Open navytux opened 6 years ago

navytux commented 6 years ago

Currently we decode longs to *big.Int, and it kind of works. Although == does not give equality for two *big.Int numbers, reflect.DeepEqual does work for them, since the numbers are always stored internally in the same form. However, == not working poses a problem: Go maps compare keys with ==, so two *big.Int keys holding the same value are treated as different keys.

As a result, some pickles with long keys are decoded incorrectly, e.g.:

# python
In [9]: q = "}(L1L\nS'aaa'\nL1L\nS'bbb'\nu."

In [10]: dis(q)
    0: }    EMPTY_DICT
    1: (    MARK
    2: L        LONG       1L
    6: S        STRING     'aaa'
   13: L        LONG       1L
   17: S        STRING     'bbb'
   24: u        SETITEMS   (MARK at 1)
   25: .    STOP
highest protocol among opcodes = 1

In [11]: pickle.loads(q)
Out[11]: {1L: 'bbb'}

but ogórek gives:

map[interface {}]interface {}{1:"aaa", 1:"bbb"}

Probably there is nothing we can do here until Go 2 makes the builtin int a true arbitrary-precision integer (https://github.com/golang/go/issues/19623). Then, similarly to Python 3, we could decode both INT and LONG into Go int, and on encoding a Go int check its range: if it fits in the int32 range, emit INT/BININT/..., and otherwise go via LONG*.

navytux commented 3 weeks ago

I think this issue is fixed, to the extent possible, by PyDict mode: https://github.com/kisielk/og-rek/commit/3bf6c92de63baa92d07a9e85edab0ae017fd9eb7 (https://github.com/kisielk/og-rek/pull/75).