qweeze / rstream

A Python asyncio-based client for RabbitMQ Streams
MIT License
83 stars, 13 forks

AMQP 1.0 Parser and Cython code acceleration #29

Closed Gsantomaggio closed 6 months ago

Gsantomaggio commented 1 year ago

At the moment, we include all of the AMQP 1.0 QPID code along with Azure AMQP 1.0.

We should extract only the AMQP parser and leave out all the client code.

Gsantomaggio commented 1 year ago

cc @danielepalaia

DanielePalaia commented 1 year ago

This issue is also meant to track whether we can run some tests on improving the performance of the parser using Cython.

DanielePalaia commented 1 year ago

I ran a few tests and did some exploration of the https://github.com/Azure/azure-uamqp-python library.

This library already uses Cython acceleration.

There is a section written in Python (inside the uamqp folder) which is a wrapper around the Cython library inside src (there are .pyx files there). The library also wraps a pure C library, https://github.com/Azure/azure-uamqp-c.

Despite this, the library seems very slow during instantiation of Message objects and during encoding/decoding.

There are a few steps that are very inefficient, but I'm not sure how they can be improved.

The Message struct defined in Message.py seems to be a wrapper around c_uamqp.Message. During every encoding operation, this wrapped c_uamqp.Message object is cloned. This cloning takes a lot of time during encoding.
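To make the cost of a clone-per-encode pattern concrete, here is a minimal sketch of how such overhead could be measured. It uses stand-in classes only: `InnerMessage`, `CloningWrapper`, and `DirectWrapper` are hypothetical names, the encoding is a toy format, and none of this reproduces the uamqp API or its actual cloning logic.

```python
import copy
import timeit


class InnerMessage:
    """Hypothetical stand-in for a wrapped low-level message object."""

    def __init__(self, body, properties):
        self.body = body
        self.properties = properties

    def encode(self):
        # Toy encoding (not AMQP): 4-byte big-endian length, then the body.
        return len(self.body).to_bytes(4, "big") + self.body


class CloningWrapper:
    """Copies the inner object before every encode (the suspected pattern)."""

    def __init__(self, inner):
        self._inner = inner

    def encode(self):
        return copy.deepcopy(self._inner).encode()


class DirectWrapper:
    """Encodes the inner object directly, without copying it first."""

    def __init__(self, inner):
        self._inner = inner

    def encode(self):
        return self._inner.encode()


def compare(n=10_000):
    """Return (seconds with cloning, seconds without) for n encode calls."""
    inner = InnerMessage(b"x" * 100, {f"key{i}": i for i in range(20)})
    cloning = CloningWrapper(inner)
    direct = DirectWrapper(inner)
    return (
        timeit.timeit(cloning.encode, number=n),
        timeit.timeit(direct.encode, number=n),
    )


if __name__ == "__main__":
    with_clone, without_clone = compare()
    print(f"clone per encode: {with_clone:.4f}s, direct: {without_clone:.4f}s")
```

Both wrappers produce the same bytes, so any difference in timing comes from the copy alone. The same approach could be pointed at the real Message class to confirm where the time goes.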

Also, there is a section in util.py:

def data_factory(value, encoding='UTF-8'):
    """Wrap a Python type in the equivalent C AMQP type.

This wrapping conversion seems to take a lot of time, especially when complex structures such as dictionaries are passed.
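As an illustration of why nested structures could be expensive to wrap, the sketch below counts how many wrapper objects a recursive conversion allocates. `Wrapped`, `wrap`, and `count_wrappers` are hypothetical names, and this is an assumption about the general shape of such a conversion, not the actual data_factory implementation.

```python
class Wrapped:
    """Hypothetical stand-in for a C-level AMQP value wrapper."""

    def __init__(self, kind, value):
        self.kind = kind
        self.value = value


def wrap(value, encoding="UTF-8"):
    """Recursively wrap a Python value; every element gets its own wrapper."""
    if isinstance(value, dict):
        pairs = [(wrap(k, encoding), wrap(v, encoding)) for k, v in value.items()]
        return Wrapped("map", pairs)
    if isinstance(value, (list, tuple)):
        return Wrapped("list", [wrap(v, encoding) for v in value])
    if isinstance(value, str):
        return Wrapped("string", value.encode(encoding))
    return Wrapped("other", value)


def count_wrappers(wrapped):
    """Count the wrapper objects in a wrapped tree, including the root."""
    if wrapped.kind == "map":
        return 1 + sum(count_wrappers(k) + count_wrappers(v) for k, v in wrapped.value)
    if wrapped.kind == "list":
        return 1 + sum(count_wrappers(v) for v in wrapped.value)
    return 1


if __name__ == "__main__":
    payload = {"a": 1, "b": [1, 2, 3]}
    print(count_wrappers(wrap(payload)))
```

In this model a scalar costs one wrapper, while a dictionary costs one wrapper for the map plus one for every key and every value, recursively. If each of those allocations crosses the Python/C boundary, the per-message cost grows with the size of the structure.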

Also, get_encoded_message_size, which is written in Cython, seems to have a few redundant operations.

Given the three layers this library is composed of, we need to understand at this point whether it is better to create a new one or just refactor this one.

Simple Python code (without Cython acceleration), as in this branch: https://github.com/qweeze/rstream/tree/uamqp_test, which just encodes a body, already performs much better than the uamqp library does when encoding a body.
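For context, encoding only a message body in pure Python can be quite small. The following is a minimal sketch, not the code from the branch above: it encodes and decodes a single AMQP 1.0 data section (described type with descriptor 0x75, followed by a vbin8 or vbin32 binary value). The function names `encode_data_section` and `decode_data_section` are hypothetical.

```python
import struct

# AMQP 1.0 type codes used for a "data" body section.
DESCRIBED = b"\x00"        # described-type constructor
SMALL_ULONG = b"\x53"      # descriptor encoded as smallulong
DATA_DESCRIPTOR = b"\x75"  # amqp:data:binary
VBIN8 = b"\xa0"            # binary with a 1-byte length
VBIN32 = b"\xb0"           # binary with a 4-byte length

HEADER = DESCRIBED + SMALL_ULONG + DATA_DESCRIPTOR


def encode_data_section(body):
    """Encode raw bytes as one AMQP 1.0 data section."""
    if len(body) <= 255:
        return HEADER + VBIN8 + struct.pack(">B", len(body)) + body
    return HEADER + VBIN32 + struct.pack(">I", len(body)) + body


def decode_data_section(frame):
    """Decode one AMQP 1.0 data section back into raw bytes."""
    if frame[:3] != HEADER:
        raise ValueError("not an AMQP 1.0 data section")
    code = frame[3:4]
    if code == VBIN8:
        (length,) = struct.unpack_from(">B", frame, 4)
        start = 5
    elif code == VBIN32:
        (length,) = struct.unpack_from(">I", frame, 4)
        start = 8
    else:
        raise ValueError("unsupported binary encoding")
    return frame[start:start + length]


if __name__ == "__main__":
    encoded = encode_data_section(b"hello")
    print(encoded.hex())
    print(decode_data_section(encoded))
```

This covers only the body section; headers, properties, and application properties would each need their own encoders, so it is not a drop-in replacement for a full parser.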

DanielePalaia commented 6 months ago

I'll close this issue after https://github.com/qweeze/rstream/pull/194

If necessary, this can be continued in the Performance issue: https://github.com/qweeze/rstream/issues/30