nengo / nengo-extras

Extra utilities and add-ons for Nengo
https://www.nengo.ai/nengo-extras

Safe unpickling of utf8 and latin1 encodings in Python3 #49

Open hunse opened 7 years ago

hunse commented 7 years ago

This allows Python 2 pickle files to contain strings encoded as either utf8 or latin1 (a superset of ascii).

The only disadvantage is that it uses the pure-Python pickle implementation rather than the faster C-based _pickle. In practice, though, it doesn't seem to be much slower.
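For illustration, here is a minimal sketch of the general idea, not necessarily the code in this PR. It assumes that CPython's pure-Python unpickler (`pickle._Unpickler`) passes Python 2 `str` payloads through a private `_decode_string` hook; that hook is an implementation detail rather than a documented API. The names `Utf8OrLatin1Unpickler` and `load_py2_pickle` are hypothetical.

```python
import io
import pickle


class Utf8OrLatin1Unpickler(pickle._Unpickler):
    """Hypothetical sketch: decode Python 2 str values as utf8, else latin1."""

    def _decode_string(self, value):
        # Assumption: the pure-Python unpickler calls this private hook for
        # every Python 2 `str` payload. It is not part of the public API.
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # latin1 maps every byte to a code point, so this cannot fail.
            return value.decode("latin1")


def load_py2_pickle(data):
    """Unpickle bytes that were written by Python 2 (hypothetical helper)."""
    return Utf8OrLatin1Unpickler(io.BytesIO(data)).load()


# Hand-written protocol 2 pickles of a Python 2 `str` holding "é",
# once as utf8 bytes and once as a latin1 byte.
utf8_pickle = b"\x80\x02U\x02\xc3\xa9q\x00."
latin1_pickle = b"\x80\x02U\x01\xe9q\x00."

print(load_py2_pickle(utf8_pickle) == load_py2_pickle(latin1_pickle))
```

One caveat with this kind of fallback: raw binary data that happens to be valid utf8 would be decoded as utf8 rather than latin1, so the sketch only covers text strings.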

I thought I could avoid this by setting the errors argument of pickle.load. That argument is passed to codecs.decode and controls what happens when a string fails to decode. However, whatever value I set for errors, I always get this exception: ValueError: Failed to encode latin1 string when unpickling a Numpy array. pickle.load(a, encoding='latin1') is assumed.
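As a point of reference, this sketch shows what the errors argument does for a plain Python 2 string, using only the standard library. It does not reproduce the NumPy case; the exception message quoted above refers to a failure to encode, so it may come from a later step that errors does not reach. The helper name `try_load` is hypothetical.

```python
import pickle

# Hand-written protocol 2 pickle of a Python 2 `str` holding the latin1 byte 0xe9.
latin1_pickle = b"\x80\x02U\x01\xe9q\x00."


def try_load(data, **kwargs):
    """Return the unpickled value, or the exception name if decoding fails."""
    try:
        return pickle.loads(data, **kwargs)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


# The default errors="strict" raises, because 0xe9 alone is not valid utf8.
strict = try_load(latin1_pickle, encoding="utf-8")

# errors="replace" substitutes U+FFFD for the undecodable byte.
replaced = try_load(latin1_pickle, encoding="utf-8", errors="replace")

# Decoding as latin1 always succeeds, since every byte is a valid code point.
as_latin1 = try_load(latin1_pickle, encoding="latin1")

print(strict, repr(replaced), repr(as_latin1))
```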

I even tried defining my own error handler with codecs.register_error, but I can't seem to get around that exception.

Anyway, should we go ahead with this? It seems like a bit of a hack, but it does let things work nicely with pickle files that use either encoding.