zeromq / cppzmq

Header-only C++ binding for libzmq
http://www.zeromq.org
MIT License

is send() from the multipart_t intentionally broken? #117

Open barczynsky opened 7 years ago

barczynsky commented 7 years ago

Let's assume the fragment of code below was taken from the zmq_addon.hpp file:

    // Send multipart message to socket
    bool send(socket_t& socket, int flags = 0)
    {
        flags &= ~(ZMQ_SNDMORE);
        bool more = size() > 0;
        while (more)
        {
            message_t message = pop();
            more = size() > 0;
            if (!socket.send(message, (more ? ZMQ_SNDMORE : 0) | flags))
                return false;
        }
        clear();
        return true;
    }

First of all, this code works when there are no errors on the transmission channel itself and all sent messages/parts are properly received at the other end. Let's also assume socket.send() returns true when everything went fine, and that returning false means the message might have been sent, but was surely not received at the other endpoint. If an earlier part was already sent with the ZMQ_SNDMORE flag, this leaves the receiving endpoint, from the sender's perspective, in an ambiguous state for a short but undefined period of time (since send() returns immediately afterwards).

Did someone say undefined behavior on a higher layer of abstraction?

In case socket.send() fails, we have one popped message on the stack (destroyed when the function returns on the next line) and some leftover messages in m_parts. It's entirely possible to invoke multipart_t::send() again immediately after the return, on the leftover data, but there is absolutely no way to tell the C API that this is a "continuation" of a previously interrupted transmission (if that's even possible, mainly on the receiving side). That is simply because of the first line in the function body, which prevents the ZMQ_SNDMORE flag from propagating further.

The laziest, and also the worst (in my opinion), way to solve this is to break out of the while loop on failure and let the clear() call actually do something, since this would be the only case in which calling clear() here changes anything.

A more elegant solution would be to push the message from the local scope back to the front of m_parts and then return false. Of course, that's in case we want to keep any leftover message parts, as doing so gives the sender some feedback about which data might have actually been sent already (and we don't lose the message popped inside the while loop).
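The push-back idea can be sketched without libzmq at all. In the block below, `FakeSocket` and `FakeMultipart` are hypothetical stand-ins invented for illustration (they are not cppzmq types), frames are plain strings, and the "more" flag is a bool instead of ZMQ_SNDMORE. It only shows the control flow; whether this is the right fix for the real multipart_t is still the open question of this issue.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for zmq::socket_t (NOT part of cppzmq): records the
// frames it accepts and can be told to reject one particular send() call.
struct FakeSocket {
    int fail_on_call = -1;  // zero-based index of the call to reject; -1 = never
    int calls = 0;
    std::vector<std::pair<std::string, bool>> sent;  // (payload, more flag)

    bool send(const std::string& frame, bool more) {
        if (calls++ == fail_on_call)
            return false;
        sent.emplace_back(frame, more);
        return true;
    }
};

// Hypothetical stand-in for zmq::multipart_t, reduced to a deque of strings.
struct FakeMultipart {
    std::deque<std::string> parts;

    // Same loop shape as the quoted send(), except that a rejected frame is
    // pushed back to the front so nothing popped locally is lost.
    bool send(FakeSocket& socket) {
        while (!parts.empty()) {
            std::string frame = std::move(parts.front());
            parts.pop_front();
            const bool more = !parts.empty();
            if (!socket.send(frame, more)) {
                parts.push_front(std::move(frame));  // restore the unsent frame
                return false;
            }
        }
        return true;
    }
};

// Usage: the second of three frames is rejected, so "b" and "c" stay queued.
inline void example_push_back() {
    FakeSocket socket;
    socket.fail_on_call = 1;
    FakeMultipart msg;
    msg.parts.push_back("a");
    msg.parts.push_back("b");
    msg.parts.push_back("c");
    const bool ok = msg.send(socket);
    assert(!ok);
    assert(msg.parts.size() == 2 && msg.parts.front() == "b");
}
```

With this shape the caller can at least see which parts were never handed to the socket. It still cannot tell the receiving side that a retry continues an interrupted message.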

This may also be, of course, the canonical situation where the whole transmission has failed and the receiving socket has properly discarded all previously received message parts. But it may also be a situation where the communication channel was struck by unknown space radiation waves and, while the message was sent successfully, it vanished on the way, and a temporary quantum disturbance prevents any form of communication with the other endpoint to ask it for, e.g., an ACK. It's probably still worth choosing some solution rather than choosing none and leaving it as it is.

Also, these are just some minor notes on the what-ifs of the networking depths with zeromq. I hope this post has jogged someone's memory as to why the heck it was implemented in this particular way.

Cheers,

-- Emil

PS. Feel free to correct/scold me, in case I was entirely wrong on this one.

emount commented 7 years ago

Hi there - I ran across this comment while looking for some good examples of using zmq_addon.hpp for multipart messaging. I am not a 0MQ expert, but I've done plenty of sockets programming and have been steeping in the 0MQ guide.

From what I understand, I believe this is not actually an issue... because the underlying calls to socket.send() with the ZMQ_SNDMORE flag set do not actually send any data yet. Only at the final call, in which the "more" flag is deasserted, is the entire multipart message atomically sent to the underlying socket.
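To make that mental model concrete, here is a toy sketch of it. `AtomicPipe` is a hypothetical class made up for this comment; it models the behaviour as I described it (frames are held back while "more" is set and released together on the final frame) and is not a description of libzmq's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Message = std::vector<std::string>;  // one multipart message = list of frames

// Hypothetical model (NOT libzmq code): frames sent with more == true are only
// buffered; the whole message becomes visible once the final frame arrives.
class AtomicPipe {
public:
    void send(std::string frame, bool more) {
        pending_.push_back(std::move(frame));
        if (!more) {
            delivered_.push_back(std::move(pending_));
            pending_.clear();  // leave the moved-from buffer in a known empty state
        }
    }

    const std::vector<Message>& delivered() const { return delivered_; }
    std::size_t pending_frames() const { return pending_.size(); }

private:
    Message pending_;                 // frames of the message still being built
    std::vector<Message> delivered_;  // complete messages, released all at once
};

// Usage: nothing is delivered until the frame without the "more" flag is sent.
inline void example_atomic() {
    AtomicPipe pipe;
    pipe.send("header", true);
    assert(pipe.delivered().empty());
    pipe.send("body", false);
    assert(pipe.delivered().size() == 1);
}
```

Under this model a failure in the middle of the loop leaves the receiver with nothing at all rather than a half message, which is why I don't think the ambiguity described above arises.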

A cool side-effect of this, which I'm fairly certain is correct (I've inferred this myself, but I've built network interface and DMA logic for FPGAs, and had to write Linux drivers for them), is this: because of the way this works, when used in conjunction with zero-copy tactics, messages are sent down to the OS's network stack in such a way that, if the actual network adapter hardware has SGDMA at its disposal, it can implement true zero-copy. That is, all the various parts of the message are gathered from their disparate memory locations by DMA at the point at which the packet is actually being enqueued for presentation to the MAC layer... super efficient.

Feel free to poke holes, anyone. Just my two cents.