zeromq / libzmq

ZeroMQ core engine in C++, implements ZMTP/3.1
https://www.zeromq.org
Mozilla Public License 2.0

BUG: Zmq abort() due to Bad file descriptor (zeromq-4.3.3/src/epoll.cpp:100) #4472

Open jeff277 opened 1 year ago

jeff277 commented 1 year ago


Issue description

When the ZeroMQ server is not running, the ZeroMQ client may crash.

Similar problem: https://github.com/zeromq/libzmq/issues/2103

It crashed 3 times in one day, which I think is quite frequent. I can provide more information or an independent test case.

Environment

root@OpenWrt:~# opkg info libzmq-nc
Package: libzmq-nc
Version: 4.3.3-2
Depends: libc, libuuid1, libpthread, librt, libstdcpp6
Provides: libzmq
Status: install user installed
Architecture: x86_64
Installed-Time: 1670919771

root@OpenWrt:~# uname -a
Linux OpenWrt 4.14.241 #0 SMP Thu Jul 29 19:50:28 2021 x86_64 GNU/Linux


Excerpt from the OpenWrt Makefile for ZeroMQ:

PKG_NAME:=zeromq
PKG_VERSION:=4.3.3
PKG_RELEASE:=2

PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.gz
PKG_SOURCE_URL:=https://github.com/zeromq/libzmq/releases/download/v$(PKG_VERSION)
PKG_HASH:=9d9285db37ae942ed0780c016da87060497877af45094ff9e1a1ca736e3875a2

Minimal test code / Steps to reproduce the issue

  1. My initialization code:

void zmq_fd_create(xxx_zmq_fd_t *zmq_fd)
{
    int snd_hwm = 10;  // keep at most 10 outgoing messages queued

    zmq_fd->ctx = new zmq::context_t(1);  // context with one I/O thread
    zmq_fd->client = new zmq::socket_t(*zmq_fd->ctx, ZMQ_PUB);
    zmq_fd->client->setsockopt(ZMQ_SNDHWM, &snd_hwm, sizeof(snd_hwm));
    zmq_fd->client->setsockopt(ZMQ_LINGER, 0);  // drop pending messages on close
    zmq_fd->client->connect("tcp://localhost:50000");
}

  2. Note that the ZeroMQ server is not started. No program is listening on TCP port 50000.

  3. The ZeroMQ client sends data once per second. The code is as follows:

zmq_fd->client->send(msg, zmq::send_flags::none);

  4. The program crashes after running continuously for 1 to 8 hours.

What's the actual result? (include assertion message & call stack if applicable)

#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
#1  abort () at src/exit/abort.c:29
#2  0x00007f64bf3315a1 in zmq::zmq_abort(char const*) () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#3  0x00007f64bf330f88 in zmq::epoll_t::add_fd(int, zmq::i_poll_events*) () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#4  0x00007f64bf3521e5 in zmq::tcp_connecter_t::start_connecting() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#5  0x00007f64bf33fa3b in zmq::poller_base_t::execute_timers() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#6  0x00007f64bf330d59 in zmq::epoll_t::loop() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#7  0x00007f64bf352d74 in thread_routine () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#8  0x00007f64bf641a3e in start (p=0x7f64bf056aa8) at src/thread/pthread_create.c:192
#9  0x00007f64bf643bf0 in __clone () at src/thread/x86_64/clone.s:22
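
The abort in this trace comes from `zmq::epoll_t::add_fd`, and the issue title reports "Bad file descriptor" (EBADF). The sketch below is not libzmq code. It is a minimal Linux-only illustration of that error condition, under the assumption that the socket being registered was already closed by the time `epoll_ctl` ran. The function name `epoll_add_closed_fd_errno` is hypothetical.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>

// Registers an already-closed socket with epoll and returns the resulting
// errno (0 if the call unexpectedly succeeds).
// Assumption for illustration: the descriptor was closed before registration.
int epoll_add_closed_fd_errno()
{
    int ep = epoll_create1(0);
    if (ep == -1)
        return errno;

    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1) {
        int err = errno;
        close(ep);
        return err;
    }

    close(s);  // the descriptor is invalid from this point on

    epoll_event ev{};
    ev.events = EPOLLOUT;
    ev.data.fd = s;

    // Same kind of call a poller makes when it starts watching a socket.
    int rc = epoll_ctl(ep, EPOLL_CTL_ADD, s, &ev);
    int err = (rc == -1) ? errno : 0;

    close(ep);
    return err;
}
```

Called from a single-threaded `main`, the function returns `EBADF`. A library that treats a failed `epoll_ctl` as fatal would abort at this point instead of returning an error to the caller.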

What's the expected result?

The program should run continuously for 7 days without crashing.

jeff277 commented 1 year ago

Here is another call stack from the same environment:

(gdb) bt
#0  a_crash () at ./arch/x86_64/atomic_arch.h:108
#1  abort () at src/exit/abort.c:29
#2  0x00007f3e5db0b5a1 in zmq::zmq_abort(char const*) () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#3  0x00007f3e5db3f745 in zmq::stream_connecter_base_t::close() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#4  0x00007f3e5db2c252 in zmq::tcp_connecter_t::start_connecting() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#5  0x00007f3e5db19a3b in zmq::poller_base_t::execute_timers() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#6  0x00007f3e5db0ad59 in zmq::epoll_t::loop() () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#7  0x00007f3e5db2cd74 in thread_routine () from /home/lifei/Documents/x86/slab_sys_2.0/staging_dir/target-x86_64_musl/root-x86/usr/lib/libzmq.so.5
#8  0x00007f3e5de1ba3e in start (p=0x7f3e5d80daa8) at src/thread/pthread_create.c:192
#9  0x00007f3e5de1dbf0 in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC
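
Both backtraces in this report abort inside `tcp_connecter_t::start_connecting()`, once in `epoll_t::add_fd` and once in `stream_connecter_base_t::close()`. One possible explanation, offered only as a hypothesis and not confirmed by anything in this report, is that other code in the same process closes a stale descriptor number that the kernel has since reassigned to the library's socket. The sketch below simulates that sequence with plain POSIX/Linux calls. `StaleCloseResult` and `simulate_stale_close` are hypothetical names, and the "application" and "library" roles are played by the same function.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>

struct StaleCloseResult {
    bool fd_number_reused;  // new socket received the number of the closed fd
    int epoll_add_errno;    // errno from epoll_ctl(EPOLL_CTL_ADD), 0 on success
    int close_errno;        // errno from the owner's close(), 0 on success
};

// Simulates a stale close() hitting a descriptor owned by someone else.
StaleCloseResult simulate_stale_close()
{
    StaleCloseResult r{false, 0, 0};
    int ep = epoll_create1(0);

    // "Application" opens and closes a socket but keeps the number around.
    int app_fd = socket(AF_INET, SOCK_STREAM, 0);
    close(app_fd);

    // "Library" opens its own socket; the kernel hands out the lowest
    // free descriptor number, which is the one just released.
    int lib_fd = socket(AF_INET, SOCK_STREAM, 0);
    r.fd_number_reused = (lib_fd == app_fd);

    // Bug being simulated: the application closes its stale number again,
    // which now silently closes the library's socket.
    close(app_fd);

    // The library registers its socket with epoll (compare first backtrace).
    epoll_event ev{};
    ev.events = EPOLLOUT;
    ev.data.fd = lib_fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, lib_fd, &ev) == -1)
        r.epoll_add_errno = errno;

    // The library closes its socket (compare second backtrace).
    if (close(lib_fd) == -1)
        r.close_errno = errno;

    close(ep);
    return r;
}
```

In a single-threaded run, the descriptor number is reused and both the `epoll_ctl` call and the final `close` fail with `EBADF`. If this hypothesis applies, checking the application for double `close()` calls, or tracing `close` with `strace -f -e trace=close`, may narrow down which code closes the descriptor.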