Closed isra17 closed 6 years ago
Some more logs:
I: 18-04-15 14:12:01 loading configuration from '/tmp/malamute.cfg'...
N: 18-04-15 14:12:01 server is using NULL security
N: 18-04-15 14:12:01 binding Malamute service to 'tcp://*:9999'
D: 18-04-15 14:12:03 597:Malamute : start:
D: 18-04-15 14:12:03 597:Malamute : CONNECTION_OPEN
D: 18-04-15 14:12:03 597:Malamute : $ register new client
I: 18-04-15 14:12:03 client 597 address='client' - registering
D: 18-04-15 14:12:03 597:Malamute : $ send OK
D: 18-04-15 14:12:03 597:Malamute : $ check for mailbox messages
D: 18-04-15 14:12:03 597:Malamute : > connected
D: 18-04-15 14:12:03 597:Malamute : connected:
D: 18-04-15 14:12:03 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:03 597:Malamute : $ write message to service
D: 18-04-15 14:12:03 597:Malamute : > connected
D: 18-04-15 14:12:04 597:Malamute : connected:
D: 18-04-15 14:12:04 597:Malamute : CONNECTION_PING
D: 18-04-15 14:12:04 597:Malamute : $ send CONNECTION_PONG
D: 18-04-15 14:12:04 597:Malamute : > connected
D: 18-04-15 14:12:04 597:Malamute : connected:
D: 18-04-15 14:12:04 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:04 597:Malamute : $ write message to service
D: 18-04-15 14:12:04 597:Malamute : > connected
D: 18-04-15 14:12:05 597:Malamute : connected:
D: 18-04-15 14:12:05 597:Malamute : CONNECTION_PING
D: 18-04-15 14:12:05 597:Malamute : $ send CONNECTION_PONG
D: 18-04-15 14:12:05 597:Malamute : > connected
D: 18-04-15 14:12:05 597:Malamute : connected:
D: 18-04-15 14:12:05 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:05 597:Malamute : $ write message to service
D: 18-04-15 14:12:05 597:Malamute : > connected
D: 18-04-15 14:12:06 597:Malamute : connected:
D: 18-04-15 14:12:06 597:Malamute : CONNECTION_PING
D: 18-04-15 14:12:06 597:Malamute : $ send CONNECTION_PONG
D: 18-04-15 14:12:06 597:Malamute : > connected
D: 18-04-15 14:12:06 597:Malamute : connected:
D: 18-04-15 14:12:06 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:06 597:Malamute : $ write message to service
D: 18-04-15 14:12:06 597:Malamute : > connected
D: 18-04-15 14:12:07 597:Malamute : connected:
D: 18-04-15 14:12:07 597:Malamute : CONNECTION_PING
D: 18-04-15 14:12:07 597:Malamute : $ send CONNECTION_PONG
D: 18-04-15 14:12:07 597:Malamute : > connected
D: 18-04-15 14:12:07 597:Malamute : connected:
D: 18-04-15 14:12:07 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:07 597:Malamute : $ write message to service
D: 18-04-15 14:12:07 597:Malamute : > connected
D: 18-04-15 14:12:08 597:Malamute : connected:
D: 18-04-15 14:12:08 597:Malamute : CONNECTION_PING
D: 18-04-15 14:12:08 597:Malamute : $ send CONNECTION_PONG
D: 18-04-15 14:12:08 597:Malamute : > connected
D: 18-04-15 14:12:08 597:Malamute : connected:
D: 18-04-15 14:12:08 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:08 597:Malamute : $ write message to service
D: 18-04-15 14:12:08 597:Malamute : > connected
D: 18-04-15 14:12:09 597:Malamute : connected:
D: 18-04-15 14:12:09 597:Malamute : CONNECTION_PING
D: 18-04-15 14:12:09 597:Malamute : $ send CONNECTION_PONG
D: 18-04-15 14:12:09 597:Malamute : > connected
D: 18-04-15 14:12:09 597:Malamute : connected:
D: 18-04-15 14:12:09 597:Malamute : SERVICE_SEND
D: 18-04-15 14:12:09 597:Malamute : $ write message to service
D: 18-04-15 14:12:09 597:Malamute : > connected
D: 18-04-15 14:12:09 598:Malamute : start:
D: 18-04-15 14:12:09 598:Malamute : CONNECTION_OPEN
D: 18-04-15 14:12:09 598:Malamute : $ register new client
I: 18-04-15 14:12:09 client 598 address='service' - registering
D: 18-04-15 14:12:09 598:Malamute : $ send OK
D: 18-04-15 14:12:09 598:Malamute : $ check for mailbox messages
D: 18-04-15 14:12:09 598:Malamute : > connected
D: 18-04-15 14:12:09 598:Malamute : connected:
D: 18-04-15 14:12:09 598:Malamute : SERVICE_OFFER
D: 18-04-15 14:12:09 598:Malamute : $ store service offer
D: 18-04-15 14:12:09 598:Malamute : $ send OK
D: 18-04-15 14:12:09 598:Malamute : $ dispatch the service
D: 18-04-15 14:12:09 598:Malamute : connected:
D: 18-04-15 14:12:09 598:Malamute : service_message
D: 18-04-15 14:12:09 598:Malamute : $ get message to deliver
D: 18-04-15 14:12:09 598:Malamute : $ send SERVICE_DELIVER
D: 18-04-15 14:12:09 598:Malamute : > connected
lt-malamute: src/zmsg.c:213: zmsg_content_size: Assertion `self' failed.
Aborted (core dumped)
Sorry for the lots of comments, first time diffing into zmq/malamute sources, trying to pin point the issue as I would love to run this in production with Python :| Looking at the msg triggering the assert, it looks like a use after free issue:
$14 = {
sender = 0x7fffe4010420 "",
address = 0x7fffe4010520 "\260\v\001\344\377\177",
subject = 0x7fffe400e820 " \005\001\344\377\177",
tracker = 0x7fffe400df80 <incomplete sequence \350>,
expiry = 1523816818792,
content = 0x0,
refcount = 0x0
}
Valgrind report confirming the issue:
==19750== Thread 5:
==19750== Invalid read of size 8
==19750== at 0x4E60CBC: mlm_msg_content (mlm_msg.c:106)
==19750== by 0x4E641E7: s_dequeue (mlm_msgq.c:124)
==19750== by 0x4E6428D: mlm_msgq_dequeue_cursor (mlm_msgq.c:144)
==19750== by 0x4E51943: s_service_dispatch (mlm_server.c:299)
==19750== by 0x4E52527: dispatch_the_service (mlm_server.c:623)
==19750== by 0x4E4EDAB: s_client_execute (mlm_server_engine.inc:862)
==19750== by 0x4E50F37: s_server_handle_protocol (mlm_server_engine.inc:1564)
==19750== by 0x50A4E64: zloop_start (in /usr/lib/libczmq.so.4.1.1)
==19750== by 0x4E510E5: mlm_server (mlm_server_engine.inc:1611)
==19750== by 0x5081376: ??? (in /usr/lib/libczmq.so.4.1.1)
==19750== by 0x668E08B: start_thread (in /usr/lib/libpthread-2.26.so)
==19750== by 0x5678E7E: clone (in /usr/lib/libc-2.26.so)
==19750== Address 0x7b61408 is 40 bytes inside a block of size 56 free'd
==19750== at 0x4C2E10B: free (vg_replace_malloc.c:530)
==19750== by 0x4E60C7D: mlm_msg_destroy (mlm_msg.c:74)
==19750== by 0x4E60E71: mlm_msg_unlink (mlm_msg.c:153)
==19750== by 0x4E525F5: get_message_to_deliver (mlm_server.c:668)
==19750== by 0x4E4F34F: s_client_execute (mlm_server_engine.inc:977)
==19750== by 0x4E4D580: engine_send_event (mlm_server_engine.inc:278)
==19750== by 0x4E51892: s_service_dispatch_message (mlm_server.c:281)
==19750== by 0x4E5192F: s_service_dispatch (mlm_server.c:298)
==19750== by 0x4E52527: dispatch_the_service (mlm_server.c:623)
==19750== by 0x4E4EDAB: s_client_execute (mlm_server_engine.inc:862)
==19750== by 0x4E50F37: s_server_handle_protocol (mlm_server_engine.inc:1564)
==19750== by 0x50A4E64: zloop_start (in /usr/lib/libczmq.so.4.1.1)
==19750== Block was alloc'd at
==19750== at 0x4C2EEF5: calloc (vg_replace_malloc.c:711)
==19750== by 0x4E60A0A: safe_malloc (czmq_prelude.h:635)
==19750== by 0x4E60AD6: mlm_msg_new (mlm_msg.c:44)
==19750== by 0x4E52376: write_message_to_service (mlm_server.c:584)
==19750== by 0x4E4EC6D: s_client_execute (mlm_server_engine.inc:838)
==19750== by 0x4E50F37: s_server_handle_protocol (mlm_server_engine.inc:1564)
==19750== by 0x50A4E64: zloop_start (in /usr/lib/libczmq.so.4.1.1)
==19750== by 0x4E510E5: mlm_server (mlm_server_engine.inc:1611)
==19750== by 0x5081376: ??? (in /usr/lib/libczmq.so.4.1.1)
==19750== by 0x668E08B: start_thread (in /usr/lib/libpthread-2.26.so)
==19750== by 0x5678E7E: clone (in /usr/lib/libc-2.26.so)
==19750==
Using malamute built from
master
.How to reproduce:
Service provider:
Publisher:
malamute.cfg
Run malamute first:
malamute malamute.cfg
Then simply start the service provider and publisher. If it doesn't crash right away, simply restart the service provider until it does. Eventually, malamute will abort with a failed assertion:malamute: src/zmsg.c:213: zmsg_content_size: Assertion
self' failed.`Stacktrace:
Note that interrupting the service worker also result in an assert abort from the Python process:
The publisher doesn't abort on SIGINT (It actually require multiples SIGINT before it actually exit).