zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

Router does not reject duplicate identity #4503

Open BenediktBurger opened 1 year ago

BenediktBurger commented 1 year ago

Issue description

If I set the same identity on two DEALER sockets and connect both to one ROUTER socket, the first DEALER to connect works fine. The second one also connects, but none of its messages arrive at the ROUTER socket. According to the manual, the second connection should be refused:

"If two clients use the same routing id when connecting to a ROUTER, the results shall depend on the ZMQ_ROUTER_HANDOVER option setting. If that is not set (or set to the default of zero), the ROUTER socket shall reject clients trying to connect with an already-used routing id."

However, no error is raised.

See also #2010

Environment

Minimal test code / Steps to reproduce the issue

You can use https://github.com/zeromq/pyzmq/issues/1646

  1. start a server (parameters: "s")
  2. start a client (parameters: "c T1")
  3. start a second client (parameters: "c T2")
  4. send a message via the second client; it hangs

What's the actual result? (include assertion message & call stack if applicable)

The second client's message does not arrive.

What's the expected result?

An error message stating that the second client cannot connect.

evojkollar commented 3 months ago

@BenediktBurger I have the same issue. Did you manage to figure this out by any chance?

BenediktBurger commented 3 months ago

In LECO (and its Python implementation, PyLECO), we use the default random IDs. Instead of setting identities, the clients tell the server their name (via a JSON-RPC encoded message). From that moment on, the server associates the ID with the name, so that it can route messages based on names.

So no, I did not solve this problem, but wrote an alternative.
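
The name-based workaround described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual LECO or PyLECO code: the class `NameRegistry` and its methods `sign_in`, `sign_out`, and `resolve` are hypothetical names. The server would keep such a table next to its ROUTER socket, keyed by the name each client announces, and store the random routing ID that arrived with the sign-in message. Because the check happens in the application, a duplicate name can be refused with an explicit error reply.

```python
from typing import Dict, Optional


class NameRegistry:
    """Maps application-level names to ROUTER routing IDs (hypothetical helper)."""

    def __init__(self) -> None:
        self._id_by_name: Dict[str, bytes] = {}

    def sign_in(self, name: str, routing_id: bytes) -> bool:
        """Bind name to routing_id; refuse if another peer already owns the name."""
        owner = self._id_by_name.get(name)
        if owner is not None and owner != routing_id:
            # Duplicate name: the server can answer this peer with an error message.
            return False
        self._id_by_name[name] = routing_id
        return True

    def sign_out(self, name: str, routing_id: bytes) -> None:
        """Release the name, but only if the caller is its current owner."""
        if self._id_by_name.get(name) == routing_id:
            del self._id_by_name[name]

    def resolve(self, name: str) -> Optional[bytes]:
        """Return the routing ID to use as the first frame when sending to name."""
        return self._id_by_name.get(name)
```

When forwarding a message, the server would look up the recipient with `resolve` and use the returned bytes as the routing frame of the ROUTER message.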

evojkollar commented 3 months ago

Thank you for the quick reply. Might have to do the same. Strange that ZMQ_ROUTER_HANDOVER does not work.

BenediktBurger commented 3 months ago

Maybe it's relevant for you: here is our protocol definition: https://github.com/pymeasure/leco-protocol and here is the Python implementation of that protocol definition: https://github.com/pymeasure/pyleco

chiefnoah commented 1 week ago

I'm here to confirm that ZMQ_ROUTER_HANDOVER does not appear to work at all in either configured state. I'm getting unrecoverable hangs on my REQ->ROUTER sockets. Changing to other IDs is not an option for us, so we have to figure out how to implement a timeout or something. How is this not considered a critical bug?

chiefnoah commented 1 week ago

Or rather, if you have ZMQ_ROUTER_HANDOVER set to 1, it will at least time out the old client, but we don't want the handover; we want the new connection to be rejected/terminated.
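
For the client-side timeout mentioned above, one possible approach is to poll for the reply instead of blocking in `recv`. The sketch below assumes pyzmq; the helper name `request_with_timeout` and the endpoint in the usage comment are placeholders, not something from this thread. It relies on pyzmq's `Socket.poll`, which waits for `POLLIN` by default and returns 0 when the timeout expires.

```python
from typing import Optional


def request_with_timeout(sock, payload: bytes, timeout_ms: int) -> Optional[bytes]:
    """Send payload and wait at most timeout_ms for a reply.

    `sock` is expected to behave like a pyzmq REQ socket (send, poll, recv).
    Returns the reply bytes, or None if nothing arrived in time.
    """
    sock.send(payload)
    # poll() returns a non-zero event mask if a message is ready, 0 on timeout.
    if sock.poll(timeout_ms):
        return sock.recv()
    return None


# Usage with pyzmq (not executed here; the endpoint is a placeholder):
#   import zmq
#   ctx = zmq.Context.instance()
#   sock = ctx.socket(zmq.REQ)
#   sock.setsockopt(zmq.LINGER, 0)   # do not block on close after a timeout
#   sock.connect("tcp://localhost:5555")
#   reply = request_with_timeout(sock, b"ping", 2000)
#   if reply is None:
#       sock.close()                 # a plain REQ socket cannot send again
```

Note that after a timeout a plain REQ socket is still waiting for its reply and will refuse a new `send`, so it has to be closed and recreated, or configured with `ZMQ_REQ_RELAXED` (usually together with `ZMQ_REQ_CORRELATE`). This only turns the hang into a detectable failure on the client; it does not make the ROUTER reject the duplicate identity.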