seznam / fastrpc

FastRPC library
http://seznam.github.io/frpc
GNU Lesser General Public License v2.1

Python 2.7: binary data and strings are not handled the same way #35

Open KeNaCo opened 7 years ago

KeNaCo commented 7 years ago

Hi, yesterday we found a problem with binary data handling.

How to reproduce:

When the explicit fastrpc.Binary type is used, everything works well. The likely cause is here:
https://github.com/seznam/fastrpc/blob/master/python/pythonfeeder.cc#L212-L214
When a plain str is passed, fastrpc assumes it is UTF-8. That assumption does not hold in Python 2.x, where type(b'') == str is True, so a str can hold arbitrary bytes.
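To illustrate why the UTF-8 assumption breaks for this kind of payload, here is a minimal sketch that does not use fastrpc at all. The helper name `is_valid_utf8` and the variable `sample` are mine, and `sample` is just the first few bytes of the GIF from the session in this report. It only shows that such bytes cannot be decoded as UTF-8; it does not claim to reproduce what pythonfeeder.cc does internally.

```python
# Sketch only: checks whether a byte string is valid UTF-8.
# Runs on Python 2.7 (where bytes is str) and on Python 3.

def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# First bytes of the GIF used in this report (hypothetical variable name).
sample = b'GIF89aX\x02X\x02\xf4\x00\x00\xb6\xb0\x9e'

print(is_valid_utf8(b'GIF89a'))  # True: plain ASCII is valid UTF-8
print(is_valid_utf8(sample))     # False: \xf4 is not followed by a continuation byte
```

So any code path that treats a Python 2 str as UTF-8 text has to either fail or alter data like this.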

We also tried explicitly setting stringMode='string' in loads; dumps does not support this option.

In [44]: gif = 'GIF89aX\x02X\x02\xf4\x00\x00\xb6\xb0\x9e\x8c\x88y\xa7\xcf\xd0Z\xb5\xd4speY\xa1\xb8\xcb\xc5\xb1\xf7\xa5M\xa0\x9c\x8c\xa9l-\xa5\xa7\xa2\xe8\xe0\xc7\xbf\xd9\xd2\x81\xb4\xbe\x93\x8d~\xdc\xd6\xc0/v\x91\xf1\xc5\x8d\xa6\x82T9\x90\xb0\xee\xd5\xad\xda|\x1bt\xc3\xdb-..\xd4\xdf\xd0@\xa6\xcc\x80\xa1\xa1\x1d\x87\xb1\xfe~\x00'

In [45]: from fastrpc import Binary, loads, dumps

In [46]: bgif = Binary(gif)

In [47]: loads(dumps((gif, ), useBinary=True), useBinary=True, stringMode="string")[0] == gif
Out[47]: True

In [48]: loads(dumps((bgif, ), useBinary=True), useBinary=True)[0].data == bgif.data
Out[48]: True
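Since the explicit Binary type round-trips correctly, one possible caller-side workaround is to wrap byte strings that are not valid UTF-8 before passing them to dumps. The sketch below is an assumption on my side, not part of fastrpc: `wrap_if_binary` and `BinaryStandIn` are hypothetical names, and the stand-in class only mimics the `Binary(data)` constructor and `.data` attribute seen in the session above so that the example runs without fastrpc installed.

```python
# Sketch only: wrap non-UTF-8 byte strings in a binary wrapper type.
# Runs on Python 2.7 (where bytes is str) and on Python 3.

class BinaryStandIn(object):
    """Hypothetical stand-in for fastrpc.Binary (constructor + .data only)."""
    def __init__(self, data):
        self.data = data


def wrap_if_binary(value, binary_type=BinaryStandIn):
    """Wrap value in binary_type if it is a byte string that is not valid UTF-8.

    Anything else (text, numbers, valid UTF-8 byte strings) is returned as is.
    """
    if isinstance(value, bytes):
        try:
            value.decode('utf-8')
        except UnicodeDecodeError:
            return binary_type(value)
    return value


wrapped = wrap_if_binary(b'GIF89a\xf4\x00')
print(type(wrapped).__name__)         # BinaryStandIn
print(wrap_if_binary(b'plain text'))  # unchanged byte string
```

With fastrpc available, the idea would be to call something like `dumps((wrap_if_binary(gif, Binary), ), useBinary=True)`. Note the limitation: binary data that happens to be valid UTF-8 is not wrapped by this heuristic, so when the caller already knows a value is binary, wrapping it in Binary unconditionally is the safer choice.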