Closed by umitanuki 8 years ago
Another instance.
(gdb) bt
#0 0x00007f0b4eb06cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f0b4eb0a0d8 in __GI_abort () at abort.c:89
#2 0x00007f0b428b0949 in zmq::zmq_abort (errmsg_=errmsg_@entry=0x7f0b428e81ee "check ()")
at bundled/zeromq/src/err.cpp:84
#3 0x00007f0b428a71a7 in zmq::msg_t::size (this=0x7f0ae8a050f0) at bundled/zeromq/src/msg.cpp:248
#4 0x00007f0b428e5d62 in zmq::v2_encoder_t::message_ready (this=0x7f0ae8a7a2c0)
at bundled/zeromq/src/v2_encoder.cpp:53
#5 0x00007f0b428abf21 in zmq::stream_engine_t::out_event (this=0x7f0ae8a050d0)
at bundled/zeromq/src/stream_engine.cpp:363
#6 0x00007f0b428bd6b3 in zmq::session_base_t::read_activated (this=0x7f0ae8a05540, pipe_=0x7f0ae8a10cb0)
at bundled/zeromq/src/session_base.cpp:264
#7 0x00007f0b428cf40c in zmq::io_thread_t::in_event (this=0x3dc7b60) at bundled/zeromq/src/io_thread.cpp:83
#8 0x00007f0b428df82e in zmq::epoll_t::loop (this=0x3dc7de0) at bundled/zeromq/src/epoll.cpp:176
#9 0x00007f0b428c47a0 in thread_routine (arg_=0x3dc7e60) at bundled/zeromq/src/thread.cpp:96
#10 0x00007f0b4ee9d182 in start_thread (arg=0x7f0b28a00700) at pthread_create.c:312
#11 0x00007f0b4ebca47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) info threads
Id Target Id Frame
20 Thread 0x7f0b039b0700 (LWP 30296) 0x00007f0b4eea43bd in read () at ../sysdeps/unix/syscall-template.S:81
19 Thread 0x7f0b459c3700 (LWP 30282) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
18 Thread 0x7f0b451c2700 (LWP 30283) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
17 Thread 0x7f0b449c1700 (LWP 30284) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
16 Thread 0x7f0b34e6e700 (LWP 30287) pthread_barrier_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
15 Thread 0x7f0b29201700 (LWP 30289) 0x00007f0b4ebcab13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
14 Thread 0x7f0b061b5700 (LWP 30291) sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
13 Thread 0x7f0b471c6700 (LWP 30279) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
12 Thread 0x7f0b3466d700 (LWP 30288) pthread_barrier_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
11 Thread 0x7f0b481c8700 (LWP 30277) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
10 Thread 0x7f0b461c4700 (LWP 30281) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
9 Thread 0x7f0b4f2c7740 (LWP 30273) 0x00007f0b4ebbd12d in poll () at ../sysdeps/unix/syscall-template.S:81
8 Thread 0x7f0b479c7700 (LWP 30278) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
7 Thread 0x7f0b469c5700 (LWP 30280) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
6 Thread 0x7f0b031af700 (LWP 30297) 0x00007f0b4ebc1da3 in select () at ../sysdeps/unix/syscall-template.S:81
5 Thread 0x7f0b051b3700 (LWP 30293) 0x00007f0b4ebc1da3 in select () at ../sysdeps/unix/syscall-template.S:81
4 Thread 0x7f0b049b2700 (LWP 30294) 0x00007f0b49250606 in __pyx_pf_6pandas_5algos_196is_monotonic_int64 (
__pyx_self=<optimized out>, __pyx_v_timelike=1, __pyx_v_arr=<optimized out>) at pandas/algos.c:74482
3 Thread 0x7f0b041b1700 (LWP 30295) 0x00007f0b4ebc1da3 in select () at ../sysdeps/unix/syscall-template.S:81
2 Thread 0x7f0b059b4700 (LWP 30292) sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
* 1 Thread 0x7f0b28a00700 (LWP 30290) 0x00007f0b4eb06cc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
(gdb) p u
$1 = {base = {metadata = 0x0, unused = '\000' <repeats 45 times>, type = 0 '\000', flags = 0 '\000'}, vsm = {
metadata = 0x0, data = '\000' <repeats 44 times>, size = 0 '\000', type = 0 '\000', flags = 0 '\000'}, lmsg = {
metadata = 0x0, content = 0x0, unused = '\000' <repeats 37 times>, type = 0 '\000', flags = 0 '\000'}, cmsg = {
metadata = 0x0, data = 0x0, size = 0, unused = '\000' <repeats 29 times>, type = 0 '\000', flags = 0 '\000'},
delimiter = {metadata = 0x0, unused = '\000' <repeats 45 times>, type = 0 '\000', flags = 0 '\000'}}
I found that the socket is not thread-safe, as described in the docs. After switching to a forwarder device, I no longer see this issue.
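For anyone hitting the same abort, here is a minimal sketch of the underlying rule: only one thread ever touches a given socket. It is not the forwarder-device setup I ended up with, but a simpler stdlib-only variant of the same idea, where worker threads hand messages to a single owner thread through a queue.Queue. The names socket_owner, worker, RecordingSocket and run_demo are made up for illustration, and RecordingSocket is just a stand-in for a real PUB socket so the example runs without ZMQ installed.

```python
import queue
import threading

_STOP = object()  # sentinel telling the owner thread to exit


def socket_owner(make_socket, outbox):
    """Create the socket in this thread and be the only caller of send()."""
    sock = make_socket()
    while True:
        msg = outbox.get()
        if msg is _STOP:
            break
        sock.send(msg)


def worker(worker_id, count, outbox):
    """Worker threads never see the socket; they only enqueue bytes."""
    for i in range(count):
        outbox.put(f"{worker_id}:{i}".encode())


class RecordingSocket:
    """Hypothetical stand-in for a PUB socket (assumption, not a ZMQ class).

    Records each payload and the id of the thread that called send().
    """

    def __init__(self):
        self.sent = []
        self.sender_threads = set()

    def send(self, data):
        self.sender_threads.add(threading.get_ident())
        self.sent.append(data)


def run_demo(num_workers=8, per_worker=5):
    outbox = queue.Queue()
    fake = RecordingSocket()
    owner = threading.Thread(target=socket_owner, args=(lambda: fake, outbox))
    owner.start()
    workers = [
        threading.Thread(target=worker, args=(w, per_worker, outbox))
        for w in range(num_workers)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    outbox.put(_STOP)  # all workers are done, so nothing follows the sentinel
    owner.join()
    return fake


if __name__ == "__main__":
    result = run_demo()
    print(len(result.sent), "messages sent from", len(result.sender_threads), "thread(s)")
```

With pyzmq, make_socket would presumably create and bind the real PUB socket inside the owner thread, so no other thread ever holds a reference to it. The REP socket would need its own single owner in the same way.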
I'm running a Python server that uses two ZMQ sockets, one PUB and one REP. The server runs up to 8 Python threads and uses numpy, etc. Intermittently, it aborts with a failed check () in zmq::msg_t::size, as in the backtrace in this report.
It seems the msg is corrupted: in the p u output, every field of the union is zeroed.
Here is the thread info.
I suspected thread 2, which is freeing something, but it is completely unrelated to the main issue.
Any ideas?