zeromq / malamute

The ZeroMQ Enterprise Messaging Broker
Mozilla Public License 2.0

Malamute crashes when two clients connect with same address #316

Open isra17 opened 6 years ago

isra17 commented 6 years ago

Simple Python POC

import time
from malamute import MalamuteClient

c1 = MalamuteClient()
c1.connect(b'tcp://localhost:9999', 5000, b'a')
c2 = MalamuteClient()
c2.connect(b'tcp://localhost:9999', 5000, b'a')
input('Still alive')
c2 = None  # Setting c1 = None instead gives the same result.
input("Now it's dead")

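It appears that when two connections are made with the same address, the broker crashes as soon as one of them closes. The crash is a bit racy: remove the input calls and it crashes pretty much every time, while changing the timing can make it more or less frequent. In the GDB output below, the segfault is at src/mlm_server.c:717 in client_closed_connection, where self->address is dereferenced while NULL (fault address 0x0).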
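For experimenting with the timing, the POC without the input calls can be wrapped in a small helper. This is only a sketch: the function name, its parameters and the `delay_s` knob are hypothetical and not part of Malamute or its Python bindings. The only binding call it relies on is `connect(endpoint, timeout, address)`, used exactly as in the POC above.

```python
import time


def reproduce_duplicate_address_close(client_factory,
                                      endpoint=b'tcp://localhost:9999',
                                      address=b'a',
                                      timeout_ms=5000,
                                      delay_s=0.0):
    """Connect two clients under the same address, then drop one of them.

    client_factory is any callable returning an object with a
    connect(endpoint, timeout, address) method (assumption: the
    MalamuteClient class from the POC is passed in here).
    delay_s is a hypothetical knob for varying the timing between the
    second connect and the disconnect.
    """
    c1 = client_factory()
    c1.connect(endpoint, timeout_ms, address)
    c2 = client_factory()
    c2.connect(endpoint, timeout_ms, address)

    # Optional pause before the second client goes away.
    time.sleep(delay_s)

    # Dropping the last reference closes the second client, as in the POC.
    c2 = None

    # Return the first client so the caller keeps it connected.
    return c1
```

With the bindings from the POC this would presumably be called as `reproduce_duplicate_address_close(MalamuteClient)`. Any crash shows up in the broker process, not in the script itself.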

GDB Output:

I: 18-04-19 15:02:28 loading configuration from '/tmp/malamute.cfg'...
[New Thread 0x7ffff4f09700 (LWP 313)]
[New Thread 0x7ffff4708700 (LWP 314)]
[New Thread 0x7ffff3f07700 (LWP 315)]
[New Thread 0x7ffff3706700 (LWP 316)]
[New Thread 0x7ffff2f05700 (LWP 317)]
N: 18-04-19 15:02:28 server is using NULL security
N: 18-04-19 15:02:28 binding Malamute service to 'tcp://*:9999'
D: 18-04-19 15:02:32    795:Malamute                         : start:
D: 18-04-19 15:02:32    795:Malamute                         :     CONNECTION_OPEN
D: 18-04-19 15:02:32    795:Malamute                         :         $ register new client
I: 18-04-19 15:02:32 client 795 address='a' - registering
D: 18-04-19 15:02:32    795:Malamute                         :         $ send OK
D: 18-04-19 15:02:32    795:Malamute                         :         $ check for mailbox messages
D: 18-04-19 15:02:32    795:Malamute                         :         > connected
D: 18-04-19 15:02:32    796:Malamute                         : start:
D: 18-04-19 15:02:32    796:Malamute                         :     CONNECTION_OPEN
D: 18-04-19 15:02:32    796:Malamute                         :         $ register new client
D: 18-04-19 15:02:32    795:Malamute                         : connected:
D: 18-04-19 15:02:32    795:Malamute                         :     expired
D: 18-04-19 15:02:32    795:Malamute                         :         $ client expired
I: 18-04-19 15:02:32 client 795 address='a' - expired
D: 18-04-19 15:02:32    795:Malamute                         :         $ deregister the client
I: 18-04-19 15:02:32 client 795 address='a' - de-registering
D: 18-04-19 15:02:32    795:Malamute                         :         $ terminate
I: 18-04-19 15:02:32 client 796 address='a' - registering
D: 18-04-19 15:02:32    796:Malamute                         :         $ send OK
D: 18-04-19 15:02:32    796:Malamute                         :         $ check for mailbox messages
D: 18-04-19 15:02:32    796:Malamute                         :         > connected
D: 18-04-19 15:02:32    796:Malamute                         : connected:
D: 18-04-19 15:02:32    796:Malamute                         :     CONNECTION_CLOSE
D: 18-04-19 15:02:32    796:Malamute                         :         $ send OK
D: 18-04-19 15:02:32    796:Malamute                         :         $ client closed connection
I: 18-04-19 15:02:32 client 796 address='a' - closed connection
D: 18-04-19 15:02:32    796:Malamute                         :         $ deregister the client
I: 18-04-19 15:02:32 client 796 address='a' - de-registering
D: 18-04-19 15:02:32    796:Malamute                         :         $ terminate
D: 18-04-19 15:02:32    797:Malamute                         : start:
D: 18-04-19 15:02:32    797:Malamute                         :     CONNECTION_CLOSE
D: 18-04-19 15:02:32    797:Malamute                         :         $ send OK
D: 18-04-19 15:02:32    797:Malamute                         :         $ client closed connection

Thread 5 "malamute" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff3706700 (LWP 316)]
0x00007ffff7bbd6ef in client_closed_connection (self=0x7fffe400eee0) at src/mlm_server.c:717
717         if (*self->address)
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
[────────────────────────────────────REGISTERS────────────────────────────────────]
 RAX  0x0
 RBX  0x0
 RCX  0xf
 RDX  0x0
 RDI  0x7fffe400eee0 —▸ 0x7fffe4000b20 —▸ 0x5555557725e0 ◂— 0xcafe0004
 RSI  0x7fffe40008d0 ◂— 0x70104070705
 R8   0x0
 R9   0x7ffff3706700 ◂— 0x7ffff3706700
 R10  0x7ffff744b640 (step4_jumps) ◂— add    byte ptr [rax], al
 R11  0x0
 R12  0x7fffffffe07e ◂— 0x7fffffffe2900000
 R13  0x7fffffffe07f ◂— 0x7fffffffe29000
 R14  0x7fffffffe080 —▸ 0x7fffffffe290 ◂— 0x2
 R15  0x0
 RBP  0x7ffff37058a0 —▸ 0x7ffff37058c0 —▸ 0x7ffff3705910 —▸ 0x7ffff3705990 ◂— ...
 RSP  0x7ffff3705890 —▸ 0x7fffe400eb40 ◂— 0xcafe0002
 RIP  0x7ffff7bbd6ef (client_closed_connection+20) ◂— movzx  eax, byte ptr [rax]
[─────────────────────────────────────DISASM──────────────────────────────────────]
 ► 0x7ffff7bbd6ef <client_closed_connection+20>    movzx  eax, byte ptr [rax]
   0x7ffff7bbd6f2 <client_closed_connection+23>    test   al, al
   0x7ffff7bbd6f4 <client_closed_connection+25>    je     client_closed_connection+61   <0x7ffff7bbd718>
    ↓
   0x7ffff7bbd718 <client_closed_connection+61>    nop
   0x7ffff7bbd719 <client_closed_connection+62>    leave
   0x7ffff7bbd71a <client_closed_connection+63>    ret

   0x7ffff7bbd71b <client_had_exception>           push   rbp
   0x7ffff7bbd71c <client_had_exception+1>         mov    rbp, rsp
   0x7ffff7bbd71f <client_had_exception+4>         sub    rsp, 0x10
   0x7ffff7bbd723 <client_had_exception+8>         mov    qword ptr [rbp - 8], rdi
   0x7ffff7bbd727 <client_had_exception+12>        mov    rax, qword ptr [rbp - 8]
[─────────────────────────────────────SOURCE──────────────────────────────────────]
712     //
713
714     static void
715     client_closed_connection (client_t *self)
716     {
717         if (*self->address)
718             zsys_info ("client %u address='%s' - closed connection", self->unique_id, self->address);
719     }
720
721
[──────────────────────────────────────STACK──────────────────────────────────────]
00:0000│   0x7ffff2f04820 —▸ 0x7fffe8001480 —▸ 0x7fffe4003440 —▸ 0x7ffff79148d8 ◂— ...
01:0008│   0x7ffff2f04828 ◂— 0xffffffffffffffff
02:0010│   0x7ffff2f04830 —▸ 0x7fffe80016b0 ◂— 0x11cafebabe
03:0018│   0x7ffff2f04838 —▸ 0x7fffe8001580 ◂— 0x0
04:0020│   0x7ffff2f04840 —▸ 0x7fffe80014a0 ◂— 0x0
05:0028│   0x7ffff2f04848 —▸ 0x7ffff76faeb6 ◂— cmp    eax, 0xff
06:0030│   0x7ffff2f04850 —▸ 0x7ffff2f04888 ◂— 0x4
07:0038│   0x7ffff2f04858 —▸ 0x7ffff2f04884 ◂— 0x400000002
[────────────────────────────────────BACKTRACE────────────────────────────────────]
 ► f 0     7ffff7bbd6ef client_closed_connection+20
   f 1     7ffff7bb90eb s_client_execute+891
   f 2     7ffff7bbbf38 s_server_handle_protocol+328
   f 3     7ffff7950e65 zloop_start+1169
   f 4     7ffff7bbc0e6 mlm_server+235
   f 5     7ffff792d377
   f 6     7ffff637308c start_thread+220
Program received signal SIGSEGV (fault address 0x0)
pwndbg> bt
#0  0x00007ffff7bbd6ef in client_closed_connection (self=0x7fffe400eee0) at src/mlm_server.c:717
#1  0x00007ffff7bb90eb in s_client_execute (self=0x7fffe400eee0, event=connection_close_event) at src/mlm_server_engine.inc:602
#2  0x00007ffff7bbbf38 in s_server_handle_protocol (loop=0x7fffe4003120, reader=0x7fffe4000bc0, argument=0x7fffe4000b20) at src/mlm_server_engine.inc:1564
#3  0x00007ffff7950e65 in zloop_start () from /usr/lib/libczmq.so.4
#4  0x00007ffff7bbc0e6 in mlm_server (pipe=0x5555557725e0, args=0x5555555558d5) at src/mlm_server_engine.inc:1611
#5  0x00007ffff792d377 in ?? () from /usr/lib/libczmq.so.4
#6  0x00007ffff637308c in start_thread () from /usr/lib/libpthread.so.0
#7  0x00007ffff73cce7f in clone () from /usr/lib/libc.so.6