fvicent / attrs2bin

Binary serializer for attrs-based classes
https://pypi.org/project/attrs2bin/
MIT License

Deserialization performance #1

Open fungs opened 1 year ago

fungs commented 1 year ago

Hi @fvicent,

thanks for this add-on for attrs. I tested it by serializing and deserializing attrs data objects of different sizes. Serialization is quite fast, but deserialization becomes considerably slower as the objects grow. I used a class with a bytes-type attribute and kept increasing the size of this binary blob. Logically, binary objects should not need much processing for serialization or deserialization.

What I observed is that most of the deserialization time in serializers.py is spent in popnleft, the helper built on top of the deque. I assume that the data is copied implicitly, creating computational and memory overhead. Maybe you want to take a look at the current implementation and see whether the copying can be avoided.

Cheers, Johannes

fvicent commented 1 year ago

Hi there! I was afraid this issue might arise, since popnleft() is just a wrapper that calls collections.deque.popleft() n times. That was a quick workaround that met my needs when writing this library, since I was working with small objects. So with the current implementation, the cost is unavoidably O(n) in the number of items popped. A real implementation of popnleft() that does not depend on collections.deque.popleft() would be required.

Honestly, I can't work on this anytime soon, but your feedback is much appreciated.
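
For readers following the thread, here is a minimal sketch of the two approaches, not the actual attrs2bin code. It assumes the input is available as a complete bytes object. `popnleft_naive` is a hypothetical reconstruction of the wrapper as described above (one popleft() call per item), and `ByteReader` is a hypothetical alternative that keeps an offset into a memoryview and returns each chunk with a single slice. Both copy n bytes, but the second does so in one bulk operation instead of n Python-level calls.

```python
from collections import deque


def popnleft_naive(buf: deque, n: int) -> bytes:
    """Hypothetical reconstruction: pop n items one at a time from a deque of ints."""
    return bytes(buf.popleft() for _ in range(n))


class ByteReader:
    """Hypothetical alternative: consume a bytes buffer by advancing an offset."""

    def __init__(self, data: bytes) -> None:
        self._view = memoryview(data)  # no copy of the underlying buffer
        self._pos = 0

    def popnleft(self, n: int) -> bytes:
        if n < 0 or self._pos + n > len(self._view):
            raise ValueError("not enough data left in buffer")
        chunk = self._view[self._pos:self._pos + n]  # slicing a view does not copy
        self._pos += n
        return bytes(chunk)  # one bulk copy of n bytes


payload = bytes(range(256)) * 4

naive = popnleft_naive(deque(payload), 100)

reader = ByteReader(payload)
fast = reader.popnleft(100)
```

Both calls return the same first 100 bytes of `payload`. The offset-based reader only fits if the whole message is already in memory; a streaming source would need a different buffering strategy.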

fungs commented 1 year ago

Great, I just wanted to document this here after testing. I'll probably go with a completely different serialization strategy for my streaming application.