zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

Assert in `zmq::object_t::process_command` (src/object.cpp:170) #4649

Open Joacchim opened 6 months ago


Hello,

First of all, thanks for the great work on this library. I know it's not new, and quite popular, but you guys deserve the thanks nonetheless :gift_heart:. This issue is, depending on your answers, either a call for help (understanding what we do wrong), or a bug report :shrug:.

My team uses ZMQ through its Python 3 binding to handle communication between multiple processes (Python's `multiprocessing.Process`). It has worked fine most of the time since we set it up in 2023, but we recently hit an odd assertion that I believe comes from the ZMQ code: the file path seems to match, and one of the remaining processes logged ZMQ errors right after the process that hit the assert went down.

Connection model: multiple 1-to-1 Dealer/Router connections between multiple processes.

Usage of threads: Yes

Alas, we did not have core dumps enabled on that server, and we have not seen the issue again since. It happened fairly recently, though, so it may still recur.
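A minimal sketch of how more information could be captured on the next occurrence, assuming a POSIX host and that libzmq's assertion path ends in `abort()` (SIGABRT). The function name `enable_crash_diagnostics` and its `trace_file` parameter are hypothetical; only the standard-library `resource` and `faulthandler` modules are used.

```python
import faulthandler
import resource
import sys


def enable_crash_diagnostics(trace_file=None):
    """Raise the core-dump soft limit to the hard limit and enable faulthandler.

    Returns the (soft, hard) RLIMIT_CORE pair in effect afterwards.
    """
    # A process may always raise its soft limit up to its hard limit.
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

    # On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, faulthandler writes the
    # Python traceback of every thread to the given file before the process dies.
    target = trace_file if trace_file is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)

    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print(enable_crash_diagnostics())
```

Calling this early in each worker process would at least put the Python-side stack of every thread into the journal. Whether a core file is actually written still depends on the hard limit (for a systemd unit, the `LimitCORE=` setting) and on the kernel's core pattern.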

From what I understand of the code, the assert could be related to the `command_t.type` field being set to `command_t::done`. As I am clearly not an experienced ZMQ user, I lack the context to understand what could have happened.
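To make that reading concrete, here is a simplified Python model of a command dispatcher with a catch-all assertion. It is not libzmq's source: the `CommandType` members are a hypothetical subset, and the handlers are placeholders. It only illustrates how a `done` value, or any value outside the handled set, would reach the assertion.

```python
from enum import Enum, auto


class CommandType(Enum):
    # Hypothetical subset of command types, for illustration only.
    STOP = auto()
    PLUG = auto()
    BIND = auto()
    DONE = auto()


def process_command(cmd_type):
    """Dispatch a command to its handler; fail hard if none exists."""
    handlers = {
        CommandType.STOP: lambda: "stop handled",
        CommandType.PLUG: lambda: "plug handled",
        CommandType.BIND: lambda: "bind handled",
    }
    handler = handlers.get(cmd_type)
    if handler is None:
        # Models the catch-all branch: DONE has no handler, and neither does
        # a value outside the known set (for example a corrupted type field).
        raise AssertionError(f"unhandled command type: {cmd_type!r}")
    return handler()
```

In this model both `process_command(CommandType.DONE)` and `process_command(42)` raise, which is why the assertion message alone cannot tell the two situations apart.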

Environment

Minimal test code / Steps to reproduce the issue

Sorry, we've only encountered the issue once, and I don't have enough information about the failure to put together a reproduction case.

What's the actual result? (include assertion message & call stack if applicable)

Message caught by our systemd unit's journal:

Assertion failed: false (src/object.cpp:170)

What's the expected result?

Probably a ZMQ error that we could handle in the Python code somehow?
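Until something like that exists, the failing process itself cannot catch the assertion with `try`/`except`, assuming it ends in `abort()`. A parent process can at least detect it afterwards. The sketch below is an assumption-laden illustration: `crashing_worker` stands in for a worker that hits the assert (it simply calls `os.abort()`), `run_and_classify` is a hypothetical helper, and the `fork` start method makes it POSIX-only.

```python
import multiprocessing as mp
import os
import signal


def crashing_worker():
    # Stand-in for a worker in which the native library aborts:
    # os.abort() raises SIGABRT in the current process.
    os.abort()


def run_and_classify(target):
    """Run target in a child process and describe how it ended."""
    # "fork" is assumed here (POSIX only).
    ctx = mp.get_context("fork")
    proc = ctx.Process(target=target)
    proc.start()
    proc.join()
    if proc.exitcode == 0:
        return "ok"
    if proc.exitcode is not None and proc.exitcode < 0:
        # multiprocessing reports death by signal as the negated signal number.
        return f"killed by {signal.Signals(-proc.exitcode).name}"
    return f"exited with code {proc.exitcode}"


if __name__ == "__main__":
    print(run_and_classify(crashing_worker))
```

A supervisor built this way could log the abort and restart the worker, which is a workaround rather than the catchable error asked for above.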