gmr / pamqp

Low level AMQP frame encoding and decoding library
https://pamqp.readthedocs.io
BSD 3-Clause "New" or "Revised" License
50 stars 20 forks source link

Fix UnicodeDecodeError of `long_string` headers #40

Closed dmaone closed 2 years ago

dmaone commented 2 years ago

First of all, I do understand that pamqp is a AMQP 0.9.1 parser. But due to rabbitmq accepting AMQP 1.0 clients via plugin, and python being the only language in the whole world making a stupid distinction between str and bytes, the problem is relevant to pamqp.

AMQP 1.0 clients send headers called x-amqp-properties and x-amqp-app-properties, which cannot be decoded as utf-8. Technically, any client can do that - AMQP 0.9.1 spec paragraph 4.2.5.3 says "Long strings can contain any data", after all - but ~sane people~ most won't.

These headers look like b'\\x00St\\xc1a\\x08\\xa1\\tcfx-topic\\xa1\\x03CFX\\xa1\\x0bcfx-message\\xa1\\rCFX.Heartbeat\\xa1\\ncfx-handle\\xa1\\x19[REDACTED]-----[REDACTED]\\xa1\\ncfx-target@'

pyamqp works around the issue by catching the dreaded UnicodeDecodeError. This PR does the same, and adds tests.

There's another interoperability issue - AMQP 1.0 timestamp is in _milli_seconds, and receiving AMQP-1.0 message over AMQP-0.9.1 connection will raise ValueError('year 53768 is out of range') - but I guess I'll need to monkeypatch for that. (Ironically, I don't even use that timestamp - but it still throws an exception at me and wastes CPU resources decoding that date..)