zeromq / zmqpp

0mq 'highlevel' C++ bindings
http://zeromq.github.io/zmqpp
Mozilla Public License 2.0

Thread hangs while creating ZMQ context #221

Open yogeshj83 opened 6 years ago

yogeshj83 commented 6 years ago

In my project I'm using ZMQ 4.2.2 and ZMQPP for communication between two daemons. The platforms are RHEL 6 and RHEL 7. When one of the daemons tries to create the zmqpp::context object, it goes into a sleep state. Below is the gstack output:

#0  0x00007eff26c26650 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x00007eff26c26504 in sleep () from /lib64/libc.so.6
#2  0x00007eff2545f4f1 in randombytes (x=0x7ffda14c8570 "\020(h\001", xlen=4) at src/tweetnacl.c:928
#3  0x00007eff253ea8dc in zmq::ctx_t::ctx_t (this=0x16ee230) at src/ctx.cpp:98
#4  0x00007eff25456260 in zmq_ctx_new () at src/zmq.cpp:163
#5  0x00007eff2653fbf4 in context (this=0x7eff267e0508 <IPCCommunication::getInstance()::Instance+8>) at /home/yogesh_joshi/workspaces/680/build/tools/zmq/4.2.2/include/zmqpp/context.hpp:79
#6  IPCCommunication::IPCCommunication (this=0x7eff267e0500 <IPCCommunication::getInstance()::Instance>) at ../tools/ipc/IPCCommunication.cpp:6

It remains in this state for a long time, and I'm not sure what is happening. The daemon loads a shared object, which in turn links to libzmq.so and libzmqpp.so.

Any help will be appreciated.

-Yogesh

benjamg commented 6 years ago

At a guess, you don't have enough entropy for the randombytes function to return. I don't know enough about tweetnacl, but I do know that this can happen when reading from /dev/random.

I'm not sure what to suggest as a solution here though, sorry.
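The trace shows sleep() being called directly from randombytes, which would fit a loop that sleeps and retries whenever the random source cannot be opened or read. The following is a minimal sketch of that pattern, not the actual tweetnacl.c code. The function name read_random_bytes and its parameters are hypothetical, and unlike an unbounded loop it gives up after a fixed number of failures so the caller can report an error instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <thread>

// Hypothetical helper, not part of libzmq or zmqpp.
// Reads `len` bytes from the file at `path`, sleeping for `pause` and
// retrying after each failed open or empty read. Returns false once more
// than `max_retries` failures have occurred, instead of looping forever.
bool read_random_bytes(const char *path, unsigned char *out, std::size_t len,
                       int max_retries,
                       std::chrono::milliseconds pause = std::chrono::milliseconds(100))
{
    assert(path != nullptr && out != nullptr);
    int failures = 0;

    // Phase 1: keep trying to open the random source.
    std::FILE *f = nullptr;
    while ((f = std::fopen(path, "rb")) == nullptr) {
        if (++failures > max_retries)
            return false;
        std::this_thread::sleep_for(pause);
    }

    // Phase 2: keep reading until the buffer is full.
    std::size_t done = 0;
    while (done < len) {
        std::size_t n = std::fread(out + done, 1, len - done, f);
        if (n == 0) {
            if (++failures > max_retries) {
                std::fclose(f);
                return false;
            }
            std::clearerr(f);
            std::this_thread::sleep_for(pause);
            continue;
        }
        done += n;
    }

    std::fclose(f);
    return true;
}
```

Calling read_random_bytes("/dev/urandom", buf, 4, 10) from a small test program inside the affected daemon's environment would show whether the random source itself is reachable there.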

yogeshj83 commented 6 years ago

Actually, I doubt this is an entropy issue. As I mentioned, I'm using ZMQ for communication between two daemons on the same machine. The other daemon gets its zmqpp::context successfully, whereas this daemon hangs while doing the same. The only noticeable difference between the two is how they link: the working daemon links libzmq and libzmqpp directly into its executable, while the failing daemon loads a shared library, which in turn links to libzmq.so and libzmqpp.so. I'm not sure whether this is relevant, but it is the only difference I can see. Could this issue be related to the GCC version?

-Yogesh
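One way to test the entropy theory directly on Linux is to read the kernel's entropy estimate from /proc/sys/kernel/random/entropy_avail while the daemon is stuck. The sketch below assumes that procfs file is available (it is Linux-specific), and the helper name read_entropy_avail is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads the kernel's entropy estimate (in bits).
// Returns true and stores the value in `bits` on success, or returns false
// if the file is missing or does not contain a number.
bool read_entropy_avail(long &bits,
                        const std::string &path = "/proc/sys/kernel/random/entropy_avail")
{
    std::ifstream in(path);
    long value = 0;
    if (in >> value) {
        bits = value;
        return true;
    }
    return false;
}
```

If the reported value stays comfortably above zero while the thread is sleeping, low entropy becomes a less likely explanation for the hang.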

yogeshj83 commented 6 years ago

While going through the ZMQ site, I came across their latest release, 4.2.3. The release notes (https://github.com/zeromq/libzmq/releases/tag/v4.2.3) mention a fix for a race condition related to tweetnacl:

Fixed #2632 - Fix file descriptor leak when using Tweetnacl (internal NACL implementation) instead of Libsodium, and fix race condition when using multiple ZMQ contexts with Tweetnacl

My issue was resolved when I compiled my code against ZMQ 4.2.3.

Regards, Yogesh
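Because the affected daemon picks up libzmq through a shared object, the version loaded at runtime may not be the one the code was compiled against. A startup check along these lines could catch that. This is only a sketch: has_tweetnacl_fix is a hypothetical helper, and the 4.2.3 threshold is taken from the release notes quoted above.

```cpp
#include <tuple>

// (major, minor, patch) of a libzmq build.
using Version = std::tuple<int, int, int>;

// Hypothetical helper: true if `loaded` is at least 4.2.3, the release whose
// notes list the Tweetnacl fix (#2632). Tuples compare lexicographically, so
// this orders by major first, then minor, then patch.
bool has_tweetnacl_fix(const Version &loaded)
{
    return loaded >= Version(4, 2, 3);
}

// In a real program the triple would come from libzmq itself, for example:
//   int major_v, minor_v, patch_v;
//   zmq_version(&major_v, &minor_v, &patch_v);   // declared in <zmq.h>
//   bool ok = has_tweetnacl_fix(Version(major_v, minor_v, patch_v));
```

Logging the result just before the zmqpp::context is constructed would make a mismatch between the build-time and runtime library visible.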