zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

Crash in `zmq::dist_t::distribute`; `_matching` > # pipes #4638

Closed: rfdhaptx closed this issue 6 months ago

rfdhaptx commented 7 months ago

Issue description

Crash in zmq::dist_t::distribute. Attempting to index into array _pipes with index i=1, but _pipes only has 1 element in it.

At the crash point:

Environment

What's the actual result? (include assertion message & call stack if applicable)

ucrtbased.dll!_CrtDbgReport(int report_type=2, const char * file_name=0x00007ff7b727ed20, int line_number=1553, const char * module_name=0x0000000000000000, const char * format=0x00007ff7b727615c, ...) Line 263
std::vector<zmq::pipe_t *,std::allocator<zmq::pipe_t *>>::operator[](const unsigned __int64 _Pos=1) Line 1552
zmq::array_t<zmq::pipe_t,2>::operator[](unsigned __int64 index_=1) Line 87
zmq::dist_t::distribute(zmq::msg_t * msg_=0x000000054b1fe818) Line 175
zmq::dist_t::send_to_matching(zmq::msg_t * msg_=0x000000054b1fe818) Line 154
zmq::xpub_t::xsend(zmq::msg_t * msg_=0x000000054b1fe818) Line 321
zmq::socket_base_t::send(zmq::msg_t * msg_=0x000000054b1fe818, int flags_=2) Line 1259
s_sendmsg(zmq::socket_base_t * s_=0x000001797171cec0, zmq_msg_t * msg_=0x000000054b1fe818, int flags_=2) Line 381
zmq_send(void * s_=0x000001797171cec0, const void * buf_=0x00000179746eef30, unsigned __int64 len_=16, int flags_=2) Line 409
zmq::detail::socket_base::send(zmq::const_buffer buf={...}, zmq::send_flags flags=sndmore) Line 1969

axelriet commented 6 months ago

Can you try and repro with the latest code? Not saying it's fixed but it's a little challenging to go back in time.

ijprest-haptx commented 6 months ago

FWIW, I think we were holding it wrong. Turns out we had multiple threads PUB-ing messages. We serialized them onto a single thread, and the issue hasn't repro'd since. It has been >10 days without a repro, so I'm confident in saying that this was our problem, rather than yours... so go ahead and close this. Thanks!
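
For anyone landing here with the same crash, the "serialize onto a single thread" fix can be sketched roughly as below. This is a minimal sketch, not the reporter's actual code: `SerializedSender`, `post`, and the `sink` callback are hypothetical names, and `sink` stands in for whatever real send call is used (for example a wrapper around `zmq_send`) on a socket that only the worker thread ever touches.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper (not part of libzmq): funnels messages from any number
// of producer threads onto one dedicated sender thread.
class SerializedSender {
public:
    // `sink` is an assumption: it represents the real send call, made on a
    // socket that is used exclusively by the worker thread.
    explicit SerializedSender(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink)), worker_([this] { run(); }) {}

    // Drains any queued messages, then stops and joins the worker.
    ~SerializedSender() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    SerializedSender(const SerializedSender&) = delete;
    SerializedSender& operator=(const SerializedSender&) = delete;

    // Safe to call from any thread: it only enqueues, it never sends.
    void post(std::string message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!stopping_);
            queue_.push(std::move(message));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping was requested and queue is drained
            std::string message = std::move(queue_.front());
            queue_.pop();
            lock.unlock();
            sink_(message);  // only the worker thread ever calls the sink
            lock.lock();
        }
    }

    std::function<void(const std::string&)> sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so every other member exists before it starts
};
```

Producers call `sender.post("...")` from wherever they like; the socket itself never leaves the worker thread, which matches the documented rule that ZeroMQ sockets are not thread safe. A multipart message would need to be enqueued as one unit so that its parts are not interleaved with another producer's.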

mfortin-adata commented 3 months ago

> FWIW, I think we were holding it wrong. Turns out we had multiple threads PUB-ing messages. We serialized them onto a single thread, and the issue hasn't repro'd since. It has been >10 days without a repro, so I'm confident in saying that this was our problem, rather than yours... so go ahead and close this. Thanks!

Thank you, your ticket helped me find a similar problem on my end. I thought I had a single thread PUB-ing messages, but the send method was actually being called from different threads: most of them accessed the object through a synchronized weak link, while the main thread held a direct pointer to it and bypassed that synchronization. I wouldn't have found it without this ticket :)
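
The "one access path skips the lock" trap can be avoided by putting the lock inside the object instead of in the handle used to reach it. A minimal sketch under that assumption follows; `LockedPublisher`, `publish`, and `send_part` are hypothetical names, not libzmq or cppzmq API, and `send_part` stands in for the real send call with a "more parts follow" flag (the stack trace above shows a `sndmore` send).

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical wrapper (not part of libzmq or cppzmq). The mutex lives inside
// the object, so a caller holding a direct pointer cannot bypass it.
class LockedPublisher {
public:
    // `send_part(data, more)` is an assumption standing in for the real send
    // call; `more` plays the role of a "send more" flag.
    explicit LockedPublisher(std::function<void(const std::string&, bool)> send_part)
        : send_part_(std::move(send_part)) {
        assert(send_part_ && "a send callback is required");
    }

    // Sends a two-part message (topic, then payload) under a single lock, so
    // parts from different callers can never interleave.
    void publish(const std::string& topic, const std::string& payload) {
        std::lock_guard<std::mutex> lock(mutex_);
        send_part_(topic, true);     // more parts follow
        send_part_(payload, false);  // last part
    }

private:
    std::mutex mutex_;
    std::function<void(const std::string&, bool)> send_part_;
};
```

Every caller, including the one with the raw pointer, goes through `publish` and therefore through the same mutex. The libzmq documentation still advises against using one socket from several threads, so a dedicated sender thread is probably the safer design where it is practical.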