fritzy / SleekXMPP

Python 2.6+/3.1+ XMPP Library
http://groups.google.com/group/sleekxmpp-discussion

Unicode username leads to double entry in roster #88

Closed EliAndrewC closed 13 years ago

EliAndrewC commented 13 years ago

I've got a user andré@foo whose name contains the é character. This user is in the "Unicode" group of the eli@foo user, so when eli@foo looks at his roster while andré@foo is not logged in, the roster entry looks like this when I pprint it:

 u'andr\xe9@dd1': {u'groups': ['Unicode'],
                   u'in_roster': True,
                   u'name': '',
                   u'presence': {},
                   u'subscription': 'both'},

So far so good. However, when andré@foo logs in, I get the following entries in my roster:

 'andr\xc3\xa9@dd1': {u'groups': [],
                      u'in_roster': False,
                      u'name': u'',
                      u'presence': {'062d44f9dab686e2f53d1d20321152452cec2476': {u'priority': 1,
                                                                                 u'show': 'available',
                                                                                 u'status': ''}},
                      u'subscription': u'none'},
 u'andr\xe9@dd1': {u'groups': ['Unicode'],
                   u'in_roster': True,
                   u'name': '',
                   u'presence': {},
                   u'subscription': 'both'},

So instead of andré@foo showing up as logged in, this new guy appears with no group and a slightly different username.

I'm on Python 2.6 (64-bit Ubuntu 10.04), and I'm connecting to a jabberd2 server (v2.2.8). I've tried setting the default encoding to "utf-8" as you do in your example scripts (e.g. echo_client.py).

I could manually encode/decode these keys and then merge them in some kind of kludgy way, but I'm wondering if there's a correct, canonical way to do this.

Aside from this, SleekXMPP has been great, thanks so much for putting it out there.

legastero commented 13 years ago

This should be fixed in the develop and roster branches now.

The problem was that the JID class was using str() too liberally, which worked fine with Python 3+ but was accidentally encoding the Unicode values on Python 2.6+.
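
For anyone stuck on an older release, here is a rough sketch of the manual merge mentioned in the original report. It is not the fix that went into SleekXMPP; `to_text` and `merge_roster` are hypothetical helper names, and the roster is assumed to be a plain dict shaped like the pprint output above. The idea is that the two keys are the same JID in two representations (UTF-8 bytes versus a text string), so decoding every key to text before using it collapses them into one entry.

```python
# Hypothetical helpers for illustration only; not part of SleekXMPP.

def to_text(jid, encoding="utf-8"):
    """Return the JID as a text string, decoding UTF-8 bytes if needed."""
    if isinstance(jid, bytes):
        return jid.decode(encoding)
    return jid


def merge_roster(roster):
    """Merge entries whose keys differ only by bytes-vs-text representation.

    Presence data from both entries is combined; groups, name and
    subscription are taken from the entry that is actually in the roster.
    """
    merged = {}
    for jid, entry in roster.items():
        key = to_text(jid)
        if key not in merged:
            # First time we see this JID: copy it so the input is not mutated.
            merged[key] = dict(entry, presence=dict(entry.get("presence", {})))
            continue
        current = merged[key]
        presence = dict(current.get("presence", {}))
        presence.update(entry.get("presence", {}))
        base = entry if entry.get("in_roster") else current
        merged[key] = dict(base, presence=presence)
    return merged


# The same JID, once as text and once as UTF-8 encoded bytes.
text_jid = u"andr\xe9@dd1"
bytes_jid = text_jid.encode("utf-8")

roster = {
    bytes_jid: {
        "groups": [],
        "in_roster": False,
        "name": "",
        "presence": {"resource-1": {"priority": 1, "show": "available", "status": ""}},
        "subscription": "none",
    },
    text_jid: {
        "groups": ["Unicode"],
        "in_roster": True,
        "name": "",
        "presence": {},
        "subscription": "both",
    },
}

merged = merge_roster(roster)
```

After the merge there is a single entry keyed by the text JID, carrying the "Unicode" group and the live presence. This only papers over the symptom; upgrading to a branch with the JID fix is the better route.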