jrief / django-websocket-redis

Websockets for Django applications using Redis as message queue
http://django-websocket-redis.awesto.com/
MIT License

Unicode support #217

Open sylvainlb opened 7 years ago

sylvainlb commented 7 years ago

I'm using django-websocket-redis in a non-English environment. When I try to publish a message with RedisPublisher(facility='foobar', broadcast=True).publish_message(RedisMessage(message)), I get a UnicodeEncodeError.

Why is RedisMessage based on six.binary_type and not six.text_type? Is there a way to publish a unicode string?
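
For context, here is a minimal Python 3 sketch of the likely cause, assuming the failure comes from an implicit ASCII encode when a Python 2 unicode string is handed to a byte-string type. The sample text is made up, and the explicit ASCII encode only stands in for the implicit one; this is not the library's code.

```python
message = "caf\u00e9"  # hypothetical non-ASCII payload ("café")

# Stand-in for Python 2's implicit ASCII conversion of unicode to bytes.
try:
    message.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly as UTF-8 produces bytes without raising.
utf8_payload = message.encode("utf-8")
```

Under that assumption, passing bytes that were encoded explicitly avoids the implicit conversion.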

Thanks for the great work

sylvainlb commented 7 years ago

For those facing the same issue, here's how I dealt with it. I escaped the string before creating the message: RedisMessage(message.encode("unicode_escape"))

And on the client side, I had to rebuild it:

function unescapeUnicode(str) {
    // Match the \xNN escapes produced by Python's unicode_escape codec.
    var r = /\\x([0-9a-f]{2})/gi;
    var x = str.replace(r, function (match, grp) {
        return String.fromCharCode(parseInt(grp, 16));
    });
    return unescape(x);
}
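
To make the escaping step concrete, here is a small Python 3 sketch of what the unicode_escape codec produces and how it round-trips on the Python side. The sample text is a made-up example, not taken from the original report.

```python
text = "caf\u00e9 \u4e2d"  # hypothetical message: "café 中"

# unicode_escape yields ASCII-only bytes: \xNN for code points up to
# U+00FF and \uNNNN for code points above that.
escaped = text.encode("unicode_escape")

# Decoding with the same codec restores the original text.
restored = escaped.decode("unicode_escape")
```

Since code points above U+00FF come out as \uNNNN rather than \xNN, a client-side decoder would need to handle both forms if such characters can occur.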

I'm still not sure why RedisMessage doesn't support Unicode though.

nanuxbe commented 7 years ago

Hello Sylvain,

Having contributed to the py3 compatibility layer, I think I remember (that was a while back now) that binary is what is (or at least was) expected upstream by the Python redis library. I'm not sure whether it's a limitation of the Python library or of Redis itself, though.

If you want to experiment, feel free to replace six.binary_type with six.text_type and see if it breaks. "Native" unicode support would be really nice.

If you are using python2, I think your implementation might be the best workaround, but I would suggest moving to py3 if you're making heavy use of unicode.

If you're using py3, I'm curious what language that message was written in (and more specifically what its content was), because RedisMessage already encodes str: https://github.com/jrief/django-websocket-redis/blob/master/ws4redis/redis_store.py#L71. If there is something that can be done to make things smoother, it would be interesting to try to fit it directly into the library. Maybe something as simple as:

try:
    value = value.encode()
except UnicodeEncodeError:
    value = value.encode("unicode_escape")
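
As a self-contained sketch of that idea (not the library's code), the fallback could be wrapped in a helper. The name encode_with_fallback and its encoding parameter are hypothetical, added only so both branches can be exercised under Python 3, where a bare encode() defaults to UTF-8 and rarely fails.

```python
def encode_with_fallback(value, encoding="utf-8"):
    """Encode text to bytes, falling back to unicode_escape on failure."""
    if isinstance(value, bytes):
        # Already bytes: pass through unchanged.
        return value
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        # The chosen codec cannot represent the text; emit ASCII escapes.
        return value.encode("unicode_escape")


utf8_bytes = encode_with_fallback("caf\u00e9")              # UTF-8 branch
escaped_bytes = encode_with_fallback("caf\u00e9", "ascii")  # fallback branch
```

One trade-off of this design is that the receiver cannot tell from the bytes alone which branch was taken, so it would have to know or detect whether to unescape.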

sylvainlb commented 7 years ago

Thanks @nanuxbe! Indeed, I'm using python2 because of modules that still haven't been ported to py3. I suspected redis might be expecting binary, but I didn't see why.

I got it working for my use case by replacing binary_type with text_type and adding a decode in the list case: return six.text_type.__new__(cls, value[2].decode("utf8"))

I'm not sure why I needed to decode what was coming from parse_response, and I couldn't test it on py3, but maybe that helps.
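
A possible explanation, offered as an assumption: redis-py returns payloads as bytes unless the client is created with decode_responses=True, so a text-based message type has to decode them explicitly. The Python 3 sketch below illustrates the change described above with a hypothetical TextMessage class. It is not the library's RedisMessage, and the three-element list layout is assumed from the value[2] access.

```python
class TextMessage(str):
    """Hypothetical text-based message type (not ws4redis.RedisMessage)."""

    def __new__(cls, value):
        if isinstance(value, (list, tuple)):
            # Assumed pub/sub response layout: [kind, channel, payload].
            value = value[2]
        if isinstance(value, bytes):
            # Payloads read from Redis arrive as bytes; decode explicitly.
            value = value.decode("utf-8")
        return super().__new__(cls, value)


# Simulated pub/sub response carrying a UTF-8 encoded payload.
response = [b"message", b"foobar", "caf\u00e9".encode("utf-8")]
msg = TextMessage(response)
```

Decoding explicitly as UTF-8 avoids relying on the interpreter's default codec, which is ASCII on Python 2.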