irmen / pickle

Java and .NET implementation of Python's pickle serialization protocol
MIT License

Upper unicode points not supported #2

Closed: platnumkid closed this issue 3 years ago

platnumkid commented 3 years ago

net.razorvine.pickle.PickleException: invalid escape sequence char 'U' in string "seth rollins \U0001f455 [...]" (possibly truncated)

Python pickles any Unicode code point above \uffff as an 8-digit \UXXXXXXXX escape, e.g. \U0001f455.
decode_unicode_escaped() currently has no case for 'U' and throws. Could support for it be added?
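
For illustration, a minimal sketch of what handling the 'U' case could look like. This is not the library's actual decode_unicode_escaped(); the class name UnicodeEscapeSketch, the helper appendHex, and the exact set of accepted escapes (\\, \xHH, \uHHHH, \UHHHHHHHH) are assumptions made for this example.

```java
// Hypothetical sketch, not the code from the pickle library.
final class UnicodeEscapeSketch {

    /** Decodes \\, \xHH, \uHHHH and \UHHHHHHHH escapes; any other escape is rejected. */
    static String decode(String src) {
        StringBuilder out = new StringBuilder(src.length());
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i++);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i >= src.length()) {
                throw new IllegalArgumentException("dangling backslash at end of input");
            }
            char kind = src.charAt(i++);
            switch (kind) {
                case '\\':
                    out.append('\\');
                    break;
                case 'x':
                    i = appendHex(src, i, 2, out);
                    break;
                case 'u':
                    i = appendHex(src, i, 4, out);
                    break;
                case 'U':
                    // 8 hex digits; code points above U+FFFF become a surrogate pair
                    i = appendHex(src, i, 8, out);
                    break;
                default:
                    throw new IllegalArgumentException(
                            "invalid escape sequence char '" + kind + "'");
            }
        }
        return out.toString();
    }

    /** Parses 'digits' hex digits starting at 'start', appends the code point, returns the new index. */
    private static int appendHex(String src, int start, int digits, StringBuilder out) {
        int end = start + digits;
        if (end > src.length()) {
            throw new IllegalArgumentException("truncated escape sequence");
        }
        int codePoint = Integer.parseInt(src.substring(start, end), 16);
        // appendCodePoint throws IllegalArgumentException for values above U+10FFFF
        out.appendCodePoint(codePoint);
        return end;
    }
}
```

With this sketch, UnicodeEscapeSketch.decode("seth rollins \\U0001f455") yields the text followed by the single code point U+1F455, which Java stores as two UTF-16 chars.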

irmen commented 3 years ago

I'll see what I can do, sounds like a simple fix. Thanks for reporting.

irmen commented 3 years ago

@platnumkid in the meantime, avoid pickle protocol 0 and use one of the more recent protocols instead. Python 3.8, for instance, defaults to protocol 4. These encode differently and more efficiently.
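
To illustrate why the newer protocols sidestep the escape problem: in the binary protocols a string can be stored as a length-prefixed UTF-8 payload (the BINUNICODE opcode 'X' followed by a 4-byte little-endian length), so no backslash escapes need to be parsed at all. Protocol 4 also has variants with other length sizes, which this sketch does not handle. The class and method names below are hypothetical and are not part of the pickle library's API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: reads one BINUNICODE item, nothing else of the pickle stream.
final class BinUnicodeSketch {

    static String readBinUnicode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.remaining() < 5 || buf.get() != 'X') {
            throw new IllegalArgumentException("expected BINUNICODE opcode 'X' and a length");
        }
        int length = buf.getInt();           // 4-byte little-endian payload length
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("payload length out of range");
        }
        byte[] utf8 = new byte[length];
        buf.get(utf8);
        // The payload is plain UTF-8, so code points above U+FFFF need no special case
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

For example, the bytes 'X', 04 00 00 00, F0 9F 91 95 decode directly to the code point U+1F455.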

platnumkid commented 3 years ago

Thanks!

irmen commented 3 years ago

Just released pickle 1.1 (Java and .NET versions), with the \U fix.