zeromq / azmq

C++ language binding library integrating ZeroMQ with Boost Asio
Boost Software License 1.0
318 stars 108 forks source link

SUB sockets receiving frequent multipart messages can crash upon disconnection #149

Open sissow2 opened 5 years ago

sissow2 commented 5 years ago

Issue description

This is in relation to https://github.com/zeromq/libzmq/issues/2163, where libzmq will not support disconnecting a socket while receiving a multipart message. This goes pretty deep into the library, and actually extends to any operation that reads a socket's internal queues. This includes zmq_getsockopt(..., ZMQ_EVENTS, ...).

To produce this:

  1. A subscriber is receiving a lot of multipart messages
  2. azmq::detail::socket_ops::reactor_handler::schedule executes (in my case, as a result of an async_receive), and due to the message volume triggers the case that posts the handler to the event queue.
  3. Before the event queue has a chance to handle the handler posted in step 2, the user decides to disconnect the socket. As per the issue in libzmq posted above, this puts the socket into an inconsistent state and it soon crashes.

The actual errors are assertions (!more in fq.cpp) or the next async operation actually just blocking.

I was able to prevent this from happening in my (single-threaded) application with the following patch:

diff --git a/azmq/detail/socket_service.hpp b/azmq/detail/socket_service.hpp
index e4557c8..c874bde 100644
--- a/azmq/detail/socket_service.hpp
+++ b/azmq/detail/socket_service.hpp
@@ -617,7 +617,7 @@ namespace detail {
                 auto evs = socket_ops::get_events(impl->socket_, ec) & impl->events_mask();

                 if (evs || ec) {
-                    impl->sd_->get_io_service().post([handler, ec] { handler(ec, 0); });
+                    impl->sd_->get_io_service().dispatch([handler, ec] { handler(ec, 0); });
                 } else {
                     impl->sd_->async_read_some(boost::asio::null_buffers(),
                                                 std::move(handler));

This needed to be done because get_events calls zmq_getsockopt with ZMQ_EVENTS, which can potentially read the first message of a multipart message- which means the rest of it needs to be handled ASAP. This is not a good solution because it can still break in multi-threaded applications (dispatch will post).

Environment

rodgert commented 5 years ago

dispatch() is probably the right thing to do here instead of post anyway. I'm putting together an update to azmq, I will include this change.