majek / puka

Puka - the opinionated RabbitMQ client
https://github.com/majek/puka

Supplying Unicode string for basic_publish raises an exception. #4

Open majek opened 13 years ago

majek commented 13 years ago

It should just work.

grncdr commented 12 years ago

I was bitten by this today when my application code generated unicode routing keys. Even if calling str(routing_key) unconditionally is a bad idea (I admit I'm not very familiar with the intricacies of handling string encodings in Python), throwing a more informative error earlier than connection._send_frames would be a big improvement.
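The kind of early, descriptive error suggested here could look roughly like the check below. This is only a sketch written in Python 3 terms, where `bytes` plays the role of Python 2's `str` and `str` the role of `unicode`. `require_bytes` is a hypothetical helper, not something puka provides.

```python
def require_bytes(name, value):
    """Return value unchanged if it is a byte string, else fail loudly.

    Hypothetical helper: it only illustrates validating arguments before
    they reach the frame-sending code.
    """
    if not isinstance(value, bytes):
        raise TypeError(
            "%s must be a byte string, got %s; encode it first, "
            "for example with .encode('utf-8')" % (name, type(value).__name__)
        )
    return value
```

Calling `require_bytes("routing_key", routing_key)` at the top of a publish call would point at the offending argument instead of failing deep inside the connection code.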

majek commented 12 years ago

In commit fc6f87d29a3f860dbe15e11ecc99f1c0f07a9b40 I made an attempt to automatically convert all strings to and from UTF-8 (except properties and tables, where unicode support is not updated by this commit).

It's kind of okay, but:

  1. Puka will be slower.
  2. I'm not entirely sure it's correct. What happens if someone else actually defined a queue which is not a valid utf-8 string? With this patch puka will fail.
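The second concern can be reproduced without a broker. The sketch below (Python 3, with a made-up queue name) shows that a name containing a byte that is not valid UTF-8 cannot be decoded, which is the failure mode described above.

```python
# Hypothetical queue name declared by some other client; the byte 0xff
# can never appear in valid UTF-8.
raw_queue_name = b"orders-\xff"

try:
    raw_queue_name.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    # A client that always decodes incoming names would fail here.
    decoded_ok = False
```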

Actually, I'm less sure that Puka should deal with unicode at all.

pr3d4t0r commented 10 years ago

@majek - we are bitten by those issues too. We're getting around it by normalizing all strings going into puka using str() and using an abstract wrapper around it:

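    import puka

    def send(self, message):
        # Connect to the broker; the server name is normalized with str().
        self._client = puka.Client("amqp://" + str(self._messagingServer))
        promise = self._client.connect()
        self._client.wait(promise)

        # Make sure the queue exists before publishing to it.
        promise = self._client.queue_declare(queue=self._messageQueue)
        self._client.wait(promise)

        # Publish through the default exchange; the body is normalized with str().
        promise = self._client.basic_publish(exchange="", routing_key=self._messageQueue,
                                             body=str(message), mandatory=True)
        self._client.wait(promise)

        # Close the connection once the publish has been confirmed.
        promise = self._client.close()
        self._client.wait(promise)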

It would be nice to normalize everything to UTF-8/Unicode inside puka given the international nature of most applications nowadays. str()/ASCII should be the exception, not the rule, in our opinion (we've dealt with a lot of i18n implementations in the last 7 years).
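For comparison, normalizing to UTF-8 on the caller's side rather than through str() might look like the sketch below. It is written in Python 3 terms (`str` is the unicode type, `bytes` the raw one), and `to_bytes` is a hypothetical helper, not part of puka.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding text if necessary."""
    if isinstance(value, bytes):
        # Already raw bytes: pass through untouched.
        return value
    # Text: encode explicitly instead of relying on an implicit ASCII default.
    return value.encode(encoding)


body = to_bytes("caf\u00e9")  # b'caf\xc3\xa9'
```

The result could then be passed as `body` or `routing_key`, so non-ASCII text is encoded deliberately before it reaches the client.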

Thoughts?

majek commented 10 years ago

Here's the problem: I could run str() within puka, but this would slow it down significantly! Puka does need to work on raw strings (i.e., not unicode), as it needs to calculate byte lengths etc.

So on balance, unfortunately, it's "cheaper" for me to do nothing and get users to supply strings, as opposed to normalizing all the time and slowing down puka...
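To illustrate why byte lengths matter: AMQP 0-9-1 short strings are prefixed with their length in bytes as a single octet, and that length differs from the character count as soon as the text is not ASCII. The sketch below is in Python 3, and `pack_shortstr` is a hypothetical stand-in, not puka's actual encoder.

```python
import struct


def pack_shortstr(raw):
    """Prefix a byte string with its length as one unsigned octet."""
    if len(raw) > 255:
        raise ValueError("short strings are limited to 255 bytes")
    return struct.pack("B", len(raw)) + raw


text = "caf\u00e9"
encoded = text.encode("utf-8")

char_count = len(text)     # 4 characters
byte_count = len(encoded)  # 5 bytes, because the accented letter takes two
frame_part = pack_shortstr(encoded)  # b'\x05caf\xc3\xa9'
```

Using the character count as the length prefix would produce a malformed frame, which is why the encoder has to operate on the encoded bytes.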