Open ANaablevy opened 2 years ago
https://github.com/zeromq/libzmq/blob/master/src/udp_engine.cpp#L474
Yeah, I guess someone should add that check but I don't know how that meshes with the zeromq documentation. I guess zeromq could also try to send the message and throw if it got errno EMSGSIZE as a result of a failed sendto call.
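For illustration, the "try to send and report EMSGSIZE" idea could look roughly like the sketch below. This is not libzmq code: `send_result`, `classify_sendto` and `send_datagram` are hypothetical names made up for this example, and it assumes a POSIX socket API.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical result type for this sketch; not part of libzmq.
enum class send_result { ok, too_large, would_block, failed };

// Map the return value of sendto() and the errno captured right after it
// to an outcome the caller can act on.
inline send_result classify_sendto(ssize_t rc, int err)
{
    if (rc >= 0)
        return send_result::ok;
    if (err == EMSGSIZE)
        return send_result::too_large; // datagram exceeds what the socket can send
    if (err == EAGAIN || err == EWOULDBLOCK)
        return send_result::would_block;
    return send_result::failed;
}

// Send one datagram and classify the result instead of asserting on failure.
inline send_result send_datagram(int fd, const void *data, std::size_t size,
                                 const sockaddr *addr, socklen_t addr_len)
{
    assert(data != nullptr || size == 0);
    const ssize_t rc = ::sendto(fd, data, size, 0, addr, addr_len);
    const int err = (rc < 0) ? errno : 0; // capture errno before anything else runs
    return classify_sendto(rc, err);
}
```

A caller that gets `send_result::too_large` could then surface an error to the user rather than abort. Note that this only helps if the message was never copied into a too-small buffer before the call, so it would complement a size check rather than replace it.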
Also related to issue #2009. I think the cause is just memory corruption due to the unchecked message size. The commenter in issue #2009 also mentioned assert errors, which I have also seen but not from the above toy example.
In real code that exercises this overflow condition, I have seen the following assert:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff79bc859 in __GI_abort () at abort.c:79
#2 0x00007ffff7f11d9d in zmq::zmq_abort (errmsg_=<optimized out>) at src/err.cpp:88
#3 0x00007ffff7f26b82 in zmq::epoll_t::reset_pollout (this=<optimized out>, handle_=<optimized out>) at src/epoll.cpp:154
#4 0x00007ffff7f27aed in zmq::io_object_t::reset_pollout (this=this@entry=0x7ffefc005140, handle_=<optimized out>) at src/io_object.cpp:90
#5 0x00007ffff7f622a7 in zmq::udp_engine_t::out_event (this=0x7ffefc005140) at src/udp_engine.cpp:509
#6 0x00007ffff7f26708 in zmq::epoll_t::loop (this=0x55555561b160) at src/epoll.cpp:202
#7 0x00007ffff7f5cf9f in thread_routine (arg_=0x55555561b1b8) at src/thread.cpp:401
#8 0x00007ffff7b92609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9 0x00007ffff7ab9293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
I have also seen an assert due to a failed msg.check() but I did not save the stack trace.
can you send a PR to fix it?
@bluca I can push a PR with the size check and increased buffer size (max is 65,507 bytes when accounting for network overhead) but I don't know if that is the real solution.
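To make the 65,507 figure concrete, here is a minimal sketch of the arithmetic and of the kind of size check being discussed. It assumes IPv4 without header options. The constant names and `fits_in_datagram` are hypothetical, and `header_size` stands for whatever framing bytes the engine prepends to the message body, which I have not verified against the source.

```cpp
#include <cstddef>

// IPv4 limits: the total-length field is 16 bits, and the IP header
// (20 bytes without options) and UDP header (8 bytes) count against it.
constexpr std::size_t ipv4_max_packet = 65535;
constexpr std::size_t ipv4_header_size = 20;
constexpr std::size_t udp_header_size = 8;
constexpr std::size_t max_udp_payload =
  ipv4_max_packet - ipv4_header_size - udp_header_size;

static_assert(max_udp_payload == 65507, "IPv4 UDP payload limit");

// Hypothetical pre-send check: true if header + body fit within `limit`.
// Written to avoid overflow when the sizes are large.
constexpr bool fits_in_datagram(std::size_t header_size, std::size_t body_size,
                                std::size_t limit = max_udp_payload)
{
    return header_size <= limit && body_size <= limit - header_size;
}
```

With a check like this run before the message is copied into the output buffer, a 32768 byte message would be rejected against an 8192 byte buffer (passing 8192 as `limit`) instead of overflowing it, and would be accepted once the buffer is sized to the UDP limit.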
I guess the main question is: should zeromq udp support arbitrarily large message sizes? I think the answer is "no" but I'm not sure.
To support arbitrarily large messages, zeromq would be forced to do some sort of internal message batching to account for the UDP size limitations and I'm not sure if that is possible with the thread safety guarantees of the radio/dish socket. There would also have to be additional overhead and error checking to make sure that the entire message was received so that the "all or nothing" behavior of other sockets also applies to radio/dish/udp or maybe that just won't apply to udp. I think the thread safety issue is still a problem but I am not familiar with the whole radio/dish/udp processing pipeline to make that judgement call.
If there is a message size limit, whether hardcoded or based on system limitations, I also don't know if the udp buffer size should be static. Should it be resizable in case UDP can support larger message sizes in the future? It would rarely be resized (i.e. only if the current message size exceeds the last largest message) during the life of the socket so I think that would be acceptable overhead. I think there is also a way to query the max size but I am honestly not familiar with that check. I also don't have a machine with windows or mac os so I might not be the best person to test this code...
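A rough sketch of the "rarely resized" buffer idea, assuming a hard upper cap such as the UDP payload limit. `grow_only_buffer` is a hypothetical class written for this comment, not something that exists in libzmq, and it ignores the threading questions raised above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical buffer that only ever grows, up to a fixed cap.
class grow_only_buffer
{
  public:
    grow_only_buffer(std::size_t initial, std::size_t cap) :
        _cap(cap), _storage(initial < cap ? initial : cap)
    {
    }

    // Returns a pointer to at least `needed` bytes, growing the storage only
    // if `needed` exceeds the largest size seen so far. Returns nullptr when
    // `needed` is above the cap, so the caller can report an error instead
    // of writing past the end.
    unsigned char *reserve(std::size_t needed)
    {
        if (needed > _cap)
            return nullptr;
        if (needed > _storage.size())
            _storage.resize(needed);
        return _storage.data();
    }

    std::size_t size() const { return _storage.size(); }

  private:
    std::size_t _cap;
    std::vector<unsigned char> _storage;
};
```

For example, a buffer created with an initial size of 8192 and a cap of 65507 would stay at 8192 bytes for small messages, grow once to 32768 on the first 32768 byte message, and refuse a 65508 byte request.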
Hi any news?
Issue description
Sorry for all of my edits. It is actually on the send side. I've updated my ticket.
It appears that zeromq will segfault when sending large messages across UDP using the RADIO/DISH protocol. valgrind reports a buffer overflow. I have replicated the problem by sending 32768 byte messages, which isn't that large in my opinion. The code seems to be fine for smaller sizes (e.g. 8192). I'm not sure if that is related to MAX_UDP_MSG. Is it documented that there is a message size cap or should zeromq be able to support (i.e. split) larger messages? 8kb or 32kb seems pretty small.
Environment
Minimal test code / Steps to reproduce the issue
What's the actual result? (include assertion message & call stack if applicable)
What's the expected result?
The code doesn't crash or zeromq throws an error indicating that the message is too large.