Open dkl opened 1 year ago
Any plans to work on this finding? I have also been caught by this one.
This behaviour absolutely needs to be fixed because it violates [Asio guarantees](https://www.boost.org/doc/libs/1_83_0/doc/html/boost_asio/overview/core/threads.html#:~:text=Asynchronous%20completion%20handlers%20will%20only%20be%20called%20from%20threads%20that%20are%20currently%20calling%20io_context%3A%3Arun()):
_"Asynchronous completion handlers will only be called from threads that are currently calling `io_context::run()`."_
@Degoah conjectures here that simply skipping `cancel_ops()` might fix the surface symptom:
https://github.com/zeromq/azmq/blob/master/azmq/detail/socket_service.hpp#L483
However, I'm not qualified to judge whether this is okay w.r.t. lifetimes (as the descriptor is being unregistered - which may or may not be okay?).
Also, note that it doesn't solve the problem that the executor is not honored as the Asio specs require:
https://compiler-explorer.com/z/W83xMGqve, which (with the `cancel_ops()` call removed as described) outputs
That's still wrong, as the azmq socket callback should be invoked ON the strand.
> However, I'm not qualified to judge whether this is okay w.r.t. lifetimes (as the descriptor is being unregistered - which may or may not be okay?).
From what I understand of the implementation, de-registering the descriptor is OK, because it is implicitly registered again by the next call to `async_receive`/`async_send`, which internally calls `schedule(...)`.
Hi,
`socket::cancel()` appears to call pending completion handlers immediately, instead of delaying their execution until after the call returns. As a result, there are deadlocks when calling more socket operations (such as `async_send()`) from the completion handlers, because the `socket` uses a non-recursive mutex internally. This differs from other boost::asio objects such as `boost::asio::steady_timer`, which allow this case, so for example you can restart a timer from inside the `operation_aborted` completion handler.

An example to show the issue:
Actual output: the azmq socket completion handler is called during `cancel()`, instead of later like the others:
Expected output: All completion handlers are called later through the io_service.
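To make the expected behaviour concrete, here is a rough standard-library-only model of it. This is not azmq's code: `toy_context`, `toy_socket` and `demo_deferred_cancel` are hypothetical names, and the "context" is a single-threaded queue. The idea is that `cancel()` moves the pending handlers out while holding the mutex, releases it, and posts them instead of calling them inline, so a handler can start a new operation without hitting the non-recursive mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an io_context: a single-threaded queue of ready handlers.
class toy_context {
public:
    void post(std::function<void()> fn) { ready_.push_back(std::move(fn)); }

    // Runs handlers until the queue is empty; returns how many ran.
    std::size_t run() {
        std::size_t n = 0;
        while (!ready_.empty()) {
            auto fn = std::move(ready_.front());
            ready_.pop_front();
            fn();
            ++n;
        }
        return n;
    }

private:
    std::deque<std::function<void()>> ready_;
};

// Hypothetical stand-in for a socket guarded by a non-recursive mutex.
class toy_socket {
public:
    using handler = std::function<void(bool aborted)>;

    explicit toy_socket(toy_context& ctx) : ctx_(ctx) {}

    void async_op(handler h) {
        std::lock_guard<std::mutex> lock(mtx_);
        pending_.push_back(std::move(h));
    }

    // Deferred cancel: no user code runs while mtx_ is held,
    // and none runs before cancel() returns.
    void cancel() {
        std::vector<handler> aborted;
        {
            std::lock_guard<std::mutex> lock(mtx_);
            aborted.swap(pending_);
        }
        for (auto& h : aborted)
            ctx_.post([h = std::move(h)] { h(true); });
    }

private:
    toy_context& ctx_;
    std::mutex mtx_;
    std::vector<handler> pending_;
};

// Records the order of events around a cancel().
std::vector<std::string> demo_deferred_cancel() {
    std::vector<std::string> log;
    toy_context ctx;
    toy_socket sock(ctx);

    sock.async_op([&](bool aborted) {
        log.push_back(aborted ? "handler: aborted" : "handler: ok");
        // Starting a new operation from the handler takes mtx_ again;
        // that is safe because cancel() no longer holds it.
        sock.async_op([](bool) {});
    });

    sock.cancel();
    log.push_back("cancel returned");
    ctx.run();
    return log;
}
```

`demo_deferred_cancel()` returns the entries "cancel returned" and "handler: aborted", in that order. If `cancel()` instead invoked the handlers inline while still holding `mtx_`, the nested `async_op()` would try to lock the same `std::mutex` again, which is the deadlock described above.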