crossbario / autobahn-python

WebSocket and WAMP in Python for Twisted and asyncio
https://crossbar.io/autobahn
MIT License

ApplicationError fails when mixing python2 and python3 #607

Open touilleMan opened 8 years ago

touilleMan commented 8 years ago

I've encountered a crash when using a Python2 router with a Python3 client.

Indeed, when the router sends an autobahn.wamp.message.Error containing kwargs, the client crashes:

Traceback (most recent call last):
  File "/home/emmanuel/projects/autobahn_sync/.tox/py34/lib/python3.4/site-packages/autobahn/wamp/websocket.py", line 90, in onMessage
    self._session.onMessage(msg)
  File "/home/emmanuel/projects/autobahn_sync/.tox/py34/lib/python3.4/site-packages/autobahn_sync/session.py", line 32, in onMessage
    return super(_AsyncSession, self).onMessage(msg)
  File "/home/emmanuel/projects/autobahn_sync/.tox/py34/lib/python3.4/site-packages/autobahn/wamp/protocol.py", line 921, in onMessage
    txaio.reject(on_reply, self._exception_from_message(msg))
  File "/home/emmanuel/projects/autobahn_sync/.tox/py34/lib/python3.4/site-packages/autobahn/wamp/protocol.py", line 248, in _exception_from_message
    if msg.kwargs:
TypeError: __init__() keywords must be strings

The trouble is that the keys of this kwargs dict are bytes (i.e. Python 2's str), but Python 3 requires unicode (Python 3's str) keys in order to use them as keyword arguments:

(Pdb) msg.kwargs
{b'message': 'event history for the given subscription is not available or enabled'}
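
A minimal sketch of the failure outside of autobahn, assuming Python 3. `make_error` is a hypothetical stand-in for any callable that receives the kwargs via `**`; it is not part of the autobahn API.

```python
def make_error(**kwargs):
    # Hypothetical stand-in for an exception constructor that takes keyword arguments.
    return kwargs


# Same shape as the msg.kwargs shown in the Pdb session: the key is bytes.
payload = {b"message": "event history for the given subscription is not available or enabled"}

try:
    make_error(**payload)
except TypeError as exc:
    # Python 3 refuses non-str keys when unpacking a dict into keyword arguments.
    outcome = "TypeError: {}".format(exc)
else:
    outcome = "no error"
print(outcome)

# Decoding the keys as UTF-8 first makes the same call succeed.
fixed = {k.decode("utf-8"): v for k, v in payload.items()}
print(make_error(**fixed))
```

On Python 2 the first call would succeed, because `b"message"` is the native str there.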

The WAMP spec (https://tools.ietf.org/html/draft-oberstet-hybi-tavendo-wamp-02#section-5.2) only defines UTF-8 encoded strings, so I guess everything should be in UTF-8. Given there is no trouble when using Python 3 everywhere, this means the encoding of the data depends on the sender, which I believe is a bug both at serialization time (it should only send data in one encoding) and at deserialization time (it should only accept data in one encoding).

touilleMan commented 8 years ago

Any ideas about this?

oberstet commented 8 years ago

It's a bug.

Apparently, parameter names are bytes in Python 2, while they are strings (real unicode) in Python 3. Somehow I missed that (or forgot about it again).

Anyway, the WAMP ApplicationSession.call and ApplicationSession.publish need to go over the kwargs they get and, on Python 2, convert the keys to unicode strings, then reverse that before dereferencing when invoking the call handler or event handler.
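
A rough sketch of that key conversion, not the actual autobahn fix. The helper names `keys_to_text` and `keys_to_native` are hypothetical, and the sketch assumes keys are either text or UTF-8 encoded byte strings.

```python
import sys

PY2 = sys.version_info[0] == 2
text_type = type(u"")  # unicode on Python 2, str on Python 3


def keys_to_text(kwargs):
    """Return a copy of kwargs whose byte-string keys are decoded to text.

    Intended for the sending side (call/publish), before serialization.
    """
    return {
        (k.decode("utf-8") if isinstance(k, bytes) else k): v
        for k, v in kwargs.items()
    }


def keys_to_native(kwargs):
    """Return a copy of kwargs whose keys are the interpreter's native str.

    Intended for the receiving side, right before handler(**kwargs).
    """
    if PY2:
        # Native str is bytes on Python 2, so encode text keys back.
        return {
            (k.encode("utf-8") if isinstance(k, text_type) else k): v
            for k, v in kwargs.items()
        }
    # Native str is text on Python 3, so decoding bytes keys is enough.
    return keys_to_text(kwargs)


received = {b"message": "not available"}
print(keys_to_native(received))
```

With this in place, a handler would be invoked as `handler(**keys_to_native(received))`, regardless of which Python version the peer runs.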

Analysis:

(cpy2711_11) oberstet@thinkpad-t430s:~/foo$ python test3.py 
('a', <type 'str'>, <type 'unicode'>)
('b', <type 'str'>, <type 'str'>)
('c', <type 'str'>, <type 'str'>)

versus

(cpy351_4) oberstet@thinkpad-t430s:~/foo$ python test3.py 
a <class 'str'> <class 'str'>
b <class 'str'> <class 'bytes'>
c <class 'str'> <class 'str'>
(cpy351_4) oberstet@thinkpad-t430s:~/foo$ cat test.py 
from autobahn.wamp import serializer
import binascii

serializers = []

serializers.append(serializer.JsonSerializer())
serializers.append(serializer.MsgPackSerializer())
serializers.append(serializer.CBORSerializer())
serializers.append(serializer.UBJSONSerializer())

def foo(**kwargs):
    for k in kwargs:
        print(type(k), type(kwargs[k]))
    for ser in serializers:
        data = ser._serializer.serialize(kwargs)
        print(binascii.b2a_hex(data).decode())

foo(a=3, b=2)
(cpy351_4) oberstet@thinkpad-t430s:~/foo$ cat test2.py 
from autobahn.wamp import serializer
import binascii

data_py2 = [
'7b2261223a332c2262223a327d',
'82c4016103c4016202',
'a2416103416202',
'7b550161550355016255027d'
]

data_py3 = [
'7b2262223a322c2261223a337d',
'82a16202a16103',
'a2616202616103',
'7b550162550255016155037d'
]

serializers = []

serializers.append(serializer.JsonSerializer())
serializers.append(serializer.MsgPackSerializer())
serializers.append(serializer.CBORSerializer())
serializers.append(serializer.UBJSONSerializer())

for data in [data_py2, data_py3]:
    for i in range(len(serializers)):
        d = binascii.a2b_hex(data[i])
        ser = serializers[i]
        dd = ser._serializer.unserialize(d)
        print(dd)
(cpy351_4) oberstet@thinkpad-t430s:~/foo$ cat test3.py 
# coding: utf-8

def foo(**kwargs):
    for k in sorted(kwargs):
        print(k, type(k), type(kwargs[k]))

foo(a=u'foo', b=b'foo', c='foo')
(cpy351_4) oberstet@thinkpad-t430s:~/foo$
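
The hex dumps in test2.py can be inspected with the standard library alone. The sketch below looks only at the type tag of the first map key in the MsgPack and CBOR samples (the JSON and UBJSON samples differ only in key order). It assumes the layout of these particular dumps (a one-byte header for a 2-entry map, followed by the first key), and `first_key_kind` is a hypothetical helper, not an autobahn function.

```python
import binascii

# Hex dumps copied from data_py2 / data_py3 in test2.py (MsgPack and CBOR entries).
samples = {
    "msgpack/py2": "82c4016103c4016202",
    "msgpack/py3": "82a16202a16103",
    "cbor/py2": "a2416103416202",
    "cbor/py3": "a2616202616103",
}


def first_key_kind(name, hexdata):
    """Classify the first map key as 'binary' or 'text' from its type tag."""
    raw = bytearray(binascii.a2b_hex(hexdata))
    tag = raw[1]  # byte 0 is the map header; byte 1 starts the first key
    if name.startswith("msgpack"):
        if tag == 0xC4:            # MsgPack bin 8
            return "binary"
        if 0xA0 <= tag <= 0xBF:    # MsgPack fixstr
            return "text"
    else:
        major = tag >> 5           # CBOR major type lives in the top 3 bits
        if major == 2:             # byte string
            return "binary"
        if major == 3:             # text string
            return "text"
    return "unknown"


for name in sorted(samples):
    print(name, first_key_kind(name, samples[name]))
```

For these samples, the keys serialized on Python 2 carry a binary type tag and the keys serialized on Python 3 carry a text type tag, which matches the bytes keys seen on the Python 3 client.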