zeromq / zeromq4-1

ZeroMQ 4.1.x stable release branch - bug fixes only
GNU General Public License v3.0

Large traffic when waiting on zmq_poll #162

Open ludekvodicka opened 7 years ago

ludekvodicka commented 7 years ago

I'm not sure if this is a bug or intended behaviour, but I see a lot of traffic in the Windows Resource Monitor when calling zmq_poll in a loop (without any incoming connection).

    void * m_serverSocket{ nullptr };

    m_serverSocket = zmq_socket(zmq_ctx, ZMQ_REP);
    XASSERT(m_serverSocket != nullptr);

    //bind address to socket
    int rc = zmq_bind(m_serverSocket, "tcp://*:9990");
    XASSERT(rc == 0);

    zmq_msg_t msgRequest;
    rc = zmq_msg_init(&msgRequest);
    XASSERT(rc == 0);

    zmq_pollitem_t items[] = { { m_serverSocket, 0, ZMQ_POLLIN, 0 } };

    while ( true )
    {
        rc = zmq_poll(items, 1, 1000);
        if ( rc == 0 )
            continue;

        XASSERT(rc > 0 );

        if ( items[0].revents & ZMQ_POLLIN )
            break;
    }

This is what I see in Resource Monitor:

image

When I have more zmq_poll calls in the app, or a more complex app, I get much bigger numbers:

image

All this bandwidth is caused only by waiting for the incoming transaction.

ludekvodicka commented 7 years ago

It seems that the problem is caused by the timeout value. When I run the same code with -1 as the timeout value, there is almost no traffic.

image

rc = zmq_poll(items, 1, -1);

But now I'm not able to cleanly stop the zmq_poll call.

bluca commented 7 years ago

Are you really sure that's bandwidth and not Windows counting file descriptor polling as traffic, for whatever Windows-y reason?

ludekvodicka commented 7 years ago

To be honest I'm not sure. And unfortunately I'm not sure how to check it.

But if you compare the images for zmq_poll(items, 1, -1) and zmq_poll(items, 1, 1000), there are different numbers in Send B/sec, Receive B/sec, and also Bandwidth I/O, so I think it's some kind of communication.

Do you have any idea how to check it?

bluca commented 7 years ago

There is no communication happening in a poll, just poll or select (not sure which one is used on Windows, you can check in the configuration log), as you can see in the code: https://github.com/zeromq/zeromq4-1/blob/master/src/zmq.cpp#L678

The difference will most likely be because with -1 it blocks forever, while with a timeout it exits and re-enters continuously, thus calling select or poll all the time.

ludekvodicka commented 7 years ago

But a timeout of 1000 should re-enter only once per second, right? So this is a huge amount of traffic for that ;-)

I will try to implement the exit from zmq_poll() with a second control socket.

bluca commented 7 years ago

Ah now that I remember - on Windows, due to the lack of other APIs that support polling/select, the internal communication between threads uses TCP rather than IPC like on *NIX. So your resource monitor might be simply picking up traffic on localhost - in practice for any serious implementation this should make no difference, as the packets never leave the kernel networking loopback stack, so it should have just minor overheads compared to classic IPC.

In general again it's nothing to worry about.

See this old discussion for more details:

http://grokbase.com/t/zeromq/zeromq-dev/119vymt31j/zmq-occupies-random-tcp-ports-on-windows

ludekvodicka commented 7 years ago

Thanks for the links and the explanation. Still, the weird thing is that this behaviour happens only when zmq_poll has a timeout value. With a timeout of -1 everything is OK, even though in the -1 case there has to be some internal communication as well.

ludekvodicka commented 7 years ago

I just rewrote the stop logic to use a second socket, and I can confirm that in this case the numbers are much better.

image

These values are for three separate listeners (51Kbps now vs 48MBps before). In case anyone else is interested in this solution, here is a short example.

I'm using the following code to wait for the stop flag:


    m_serverStopSocket = zmq_socket(Atomix::CZeroMQSupport::Instance().GetContext(), ZMQ_PAIR);
    m_serverStopCommandSocketName = "inproc://" + axUuidHelper::GenerateUuid();
    rc = zmq_bind(m_serverStopSocket, m_serverStopCommandSocketName);

        ....

    zmq_pollitem_t items[] = {
        { m_serverSocket, 0, ZMQ_POLLIN, 0 },
        { m_serverStopSocket, 0, ZMQ_POLLIN, 0 }
    };
    rc = zmq_poll(items, 2 /*items*/, -1 /*infinite*/);

And this is how I trigger the stop:

    // STOP command
    void *stopSocket = zmq_socket(Atomix::CZeroMQSupport::Instance().GetContext(), ZMQ_PAIR);
    zmq_connect(stopSocket, m_serverStopCommandSocketName);
    zmq_send(stopSocket,"DONE",4,0);
    zmq_close(stopSocket);