zeromq / azmq

C++ language binding library integrating ZeroMQ with Boost Asio
Boost Software License 1.0

Fix leak of remaining pending operations during io_service destruction #167

Closed: dkl closed this 2 years ago

dkl commented 3 years ago

Previously, ops were tracked in boost::intrusive::lists and deleted by unlinking from the list and calling do_complete, which did the delete. However, any remaining ops in the queues during destruction of per_descriptor_data were not freed, because boost::intrusive::list destruction only unlinks and does not free the nodes.
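For illustration, here is a minimal standalone sketch (with a simplified `op` type, not azmq's real one) of why a `boost::intrusive::list` cannot clean up queued ops on its own: the list only links nodes, it never owns or deletes them.

```cpp
// Simplified sketch, not azmq's actual op type: boost::intrusive::list only
// links nodes that the caller allocated; destroying the list unlinks them at
// most, it never calls delete.
#include <boost/intrusive/list.hpp>

struct op : boost::intrusive::list_base_hook<> {
    // real ops also carry a completion handler, buffers, an error code, ...
};

int main() {
    {
        boost::intrusive::list<op> queue;
        queue.push_back(*new op);   // heap-allocated; the list just links it

        // Normal completion path (roughly what do_complete() used to do):
        //   op* o = &queue.front(); queue.pop_front(); delete o;

        // Shutdown path: the queue goes out of scope here with the op still
        // pending. Its destructor does not free the node, so the allocation
        // above is leaked.
    }
    return 0;
}
```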

This patch refactors the op/queue memory management to fix the leak, and separates the memory management from the do_complete() action. Now, ops are tracked with std::deque + std::unique_ptr everywhere. They no longer delete themselves in do_complete(), but instead we rely on automatic destructor calls to free the memory. do_complete() calls are still done in the same place as before, but now they only call the completion handlers and leave the freeing to the destructors.
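In outline (simplified, not the exact code in the patch), the new ownership model looks like this: the queue owns each op through `std::unique_ptr`, `do_complete()` only runs the handler, and popping or destroying the queue is what frees the memory.

```cpp
// Rough sketch of the new ownership model, not the patch itself: the deque
// owns the ops, do_complete() no longer does `delete this`, and destruction
// of the queue releases anything still pending.
#include <deque>
#include <memory>
#include <boost/system/error_code.hpp>

struct op {
    virtual ~op() = default;
    // Runs the user's completion handler; freeing is left to the owner.
    virtual void do_complete(boost::system::error_code const& ec) = 0;
};

using op_queue = std::deque<std::unique_ptr<op>>;

void complete_front(op_queue& q, boost::system::error_code const& ec) {
    std::unique_ptr<op> current = std::move(q.front());
    q.pop_front();
    current->do_complete(ec);   // handler only; unique_ptr frees the op here
}

// If per_descriptor_data is destroyed while ops are still queued, the deque's
// destructor destroys the remaining unique_ptrs, so nothing leaks.
```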

(this should fix https://github.com/zeromq/azmq/issues/165)

aboseley commented 3 years ago

Thank you for this PR and the test cases! I've also seen the valgrind issues your PR addresses, but I'm concerned about the performance implications of the proposed solution during normal operation. Do you have any thoughts on the performance effects?

Regardless, if no alternative fix is provided by the community, I'm inclined to merge this fix, since correctness > speed.

dkl commented 3 years ago

Normally I wouldn't expect this to make much of a difference for performance, since it only affects the management infrastructure for send/recv operations, not the data that's transferred. But who knows, maybe it matters for small packets...

I haven't done any benchmarks for this, but I'll try to do something, and report back.

dkl commented 2 years ago

Here are some benchmark results I got today (scripts are here: https://github.com/dkl/azmq-benchmark):

- buffer size 8: patch 7.3% slower than original
- buffer size 1 KiB: patch 6% slower than original
- buffer size 1 MiB: patch 1.7% faster than original
- buffer size 10 MiB: patch 1.6% faster than original

Tested by sending data from one socket to another over "inproc", with 100 x 10 s runs per buffer size and patch combination.

I'm honestly not sure how much to trust these results, especially the last two. It seems logical that the additional overhead would have a negative impact, which should be especially noticeable with small packets, but I don't know how it could make things faster with big packets.
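For context, a rough sketch of this kind of inproc throughput run might look like the following. This is not the scripts in the linked repo; it assumes azmq's buffer-based `async_send`/`async_receive` overloads and that two sockets on one `io_service` can talk over "inproc".

```cpp
// Rough sketch only, not the linked benchmark scripts. Sends fixed-size
// buffers over "inproc" for 10 seconds and reports throughput; the
// continuously queued async ops are exactly what the patch touches.
#include <azmq/socket.hpp>
#include <boost/asio.hpp>
#include <chrono>
#include <cstdint>
#include <functional>
#include <iostream>
#include <vector>

int main() {
    boost::asio::io_service ios;
    azmq::socket tx(ios, ZMQ_PAIR);
    azmq::socket rx(ios, ZMQ_PAIR);
    tx.bind("inproc://bench");
    rx.connect("inproc://bench");

    std::vector<char> out(1024, 'x'), in(1024);  // the "buffer size 1 KiB" case
    std::uint64_t bytes = 0;

    // Keep one send and one receive op pending at all times.
    std::function<void()> send_next = [&] {
        tx.async_send(boost::asio::buffer(out),
                      [&](boost::system::error_code const& ec, size_t) {
                          if (!ec) send_next();
                      });
    };
    std::function<void()> recv_next = [&] {
        rx.async_receive(boost::asio::buffer(in),
                         [&](boost::system::error_code const& ec, size_t n) {
                             if (!ec) { bytes += n; recv_next(); }
                         });
    };
    send_next();
    recv_next();

    // Stop after 10 seconds; stopping with ops still pending is the shutdown
    // path whose leak this PR fixes.
    boost::asio::steady_timer stop(ios, std::chrono::seconds(10));
    stop.async_wait([&](boost::system::error_code const&) { ios.stop(); });

    ios.run();
    std::cout << (bytes / 10.0 / (1 << 20)) << " MiB/s\n";
    return 0;
}
```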

aboseley commented 2 years ago

Running your benchmark on an oldish x86_64 Linux laptop, I'm getting a consistent 2.5% throughput reduction across all packet sizes.

aboseley commented 2 years ago

Thanks for putting these tests together. I'm OK with this minor performance hit to fix the memory issues; hopefully others agree.