zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

BUG to monitor ZMQ_EVENT_CONNECTED #4736

Open IvanShipaev opened 2 months ago

IvanShipaev commented 2 months ago

Issue description

If you monitor a socket for connection events and send data immediately after the connection is reported, in most cases the data does not reach the recipient, even though the monitor installed with zmq_socket_monitor has already reported the ZMQ_EVENT_CONNECTED event.

If you wait at least 1 millisecond after receiving the ZMQ_EVENT_CONNECTED event, the data reaches the recipient.

Environment

Minimal test code / Steps to reproduce the issue

#include <zmq_addon.hpp>
#include <array>
#include <chrono>
#include <cstdio>
#include <string>
#include <thread>

class Monitor : private zmq::monitor_t {
    bool isConnect_ {false};
    void on_event_connected(const zmq_event_t& event, const char* addr) override
    {
        isConnect_ = true;
        // event.event is uint16_t, event.value is int32_t (the file descriptor here)
        printf("Got connection from %s %u:%d\n", addr, event.event, event.value);
    }
public:
    explicit Monitor(zmq::socket_t &socket, const std::string &addr)
        : zmq::monitor_t()
    {
        init(socket, addr, ZMQ_EVENT_CONNECTED);
    }

    bool wait(unsigned timeout)
    {
        check_event(timeout);
        return isConnect_;
    }
};

int main(int argc, char *argv[])
{
    printf("zmq_version %u.%u.%u\n", ZMQ_VERSION_MAJOR, ZMQ_VERSION_MINOR, ZMQ_VERSION_PATCH);
    zmq::context_t ctx;
    std::array<zmq::const_buffer, 2> send_msgs = {
        zmq::str_buffer("topic") ,
        zmq::str_buffer("hello")
    };
    zmq::socket_t sock(ctx, zmq::socket_type::pub);
    Monitor monitor(sock, "inproc://pub-socket");
    sock.connect("tcp://127.0.0.1:9210");
    if (monitor.wait(1000)) {
        //std::this_thread::sleep_for(std::chrono::milliseconds(1));  /// BUG to fixed
        auto ret = zmq::send_multipart(sock, send_msgs);
        if (!ret) {
            printf("Error ret\n");
            return -1;
        }
        printf("Send[%zu]\n", *ret);  // *ret is a size_t (number of parts sent)
    }
    return 0;
}

What's the actual result

  1. On the sender's side, the output is the same whether the “BUG to fixed” line is commented out or not:
    zmq_version 4.3.6
    Got connection from tcp://127.0.0.1:9210 1:15
    Send[2]
  2. On the recipient's side, which listens on a SUB socket, the data usually does not arrive when the “BUG to fixed” line is commented out. Only very rarely does it get through.

This looks like a race condition somewhere inside ZMQ: the monitor reports the ZMQ_EVENT_CONNECTED event while the socket itself is not yet in the connected state.

What's the expected result

Data should always be delivered, with no race condition. Otherwise it is unclear how the publisher is supposed to detect that a connection has been established, or why zmq_socket_monitor is needed at all if it cannot be trusted.