eandersson / amqpstorm

Thread-safe Python RabbitMQ Client & Management library
https://www.amqpstorm.io/
MIT License
186 stars 36 forks source link

blank encoding in properties generates LookupError #125

Open gabysbrain opened 1 year ago

gabysbrain commented 1 year ago

Setting the content_encoding property to an empty string will generate an exception when a message is encoded.

When sending a message with the following properties (for example)

{content_encoding: '', content_type: '', correlation_id: '', headers: {compression: False, x-delay: 60000}, message_id: '4d042d03fff943cda0a60e9b10175b7d', timestamp: datetime.datetime(2023, 6, 5, 9, 36, 19, 158600)}

Will generate an exception:

LookupError: unknown encoding: 
  File "amqpstorm/message.py", line 181, in publish
    return self._channel.basic.publish(body=self._body,
  File "amqpstorm/basic.py", line 193, in publish
    body = self._handle_utf8_payload(body, properties)
  File "amqpstorm/basic.py", line 354, in _handle_utf8_payload
    body = bytes(body, encoding=encoding)

I think the problem is that an empty string is not a valid encoding and the check for a valid content_encoding is only if the key is missing from properties. https://github.com/eandersson/amqpstorm/blob/9c563cf4cb77e2bce3ae5569239a23fd5084711a/amqpstorm/basic.py#L348-L355

jhogg commented 1 year ago

Tom,

No encoding is UTF-8, a blank/empty string is just that, a blank encoding. Proper validation requires applying the encoding against the unencoded body and since there isn't a requirement that they both be set at the same time, validation (which is expensive in compute terms) is deferred until the message is sent so it only happens once.

A blank encoding is going to generate an exception, the only question is "Where?"

What is your expectation?

Jay

On Mon, Jun 5, 2023 at 8:37 AM Tom Torsney-Weir @.***> wrote:

Setting the content_encoding property to an empty string will generate an exception when a message is encoded (see https://github.com/eandersson/amqpstorm/blob/9c563cf4cb77e2bce3ae5569239a23fd5084711a/amqpstorm/basic.py#L348 ).

When sending a message with the following properties (for example)

{content_encoding: '', content_type: '', correlation_id: '', headers: {compression: False, x-delay: 60000}, message_id: '4d042d03fff943cda0a60e9b10175b7d', timestamp: datetime.datetime(2023, 6, 5, 9, 36, 19, 158600)}

Will generate an exception:

LookupError: unknown encoding: File "amqpstorm/message.py", line 181, in publish return self._channel.basic.publish(body=self._body, File "amqpstorm/basic.py", line 193, in publish body = self._handle_utf8_payload(body, properties) File "amqpstorm/basic.py", line 354, in _handle_utf8_payload body = bytes(body, encoding=encoding)

— Reply to this email directly, view it on GitHub https://github.com/eandersson/amqpstorm/issues/125, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABJGJWZYKTMWOMEHNUZLR3XJXOK3ANCNFSM6AAAAAAY27U4IY . You are receiving this because you are subscribed to this thread.Message ID: @.***>

gabysbrain commented 1 year ago

Ok, Yeah, I understand it would be complicated to check all encodings before. I just wasn't sure if a blank encoding was meant to be the same as None or actually identify something.

Anyway, I think it would help to have the extension wrapped a few levels up the stack with a more clear message. The reason is that right now you get an exception from the bytes method so you need to go into the amqpstorm source to understand what's happening. It would be easier to have an exception that just says that you passed an invalid content_encoding.

What do you think?