eclipse / paho.mqtt.c

An Eclipse Paho C client library for MQTT for Windows, Linux and MacOS. API documentation: https://eclipse.github.io/paho.mqtt.c/
https://eclipse.org/paho

SIGPIPE broken pipe crash in mqtt library #1346

Open smalik007 opened 1 year ago

smalik007 commented 1 year ago

Describe the bug

I am using the paho.mqtt.cpp library, which is built on the C SDK, and I am seeing crashes that originate in the C SDK.

paho.mqtt.cpp version: v1.2.0
paho.mqtt.c version: v1.3.12

I am using the MQTTAsync publisher/subscriber. The library is dynamically linked with the application.

More Details

The application uses two clients:

  1. A remote asynchronous client, connected to the cloud with SSL/TLS encryption.
  2. A local asynchronous client, without any encryption.

Platform details

Machine: aarch64
OS: Ubuntu 18.04

To Reproduce

The issue occurs at random, usually when the MQTT client has lost its connection and tries to reconnect.

Log files

Crash 1: SIGPIPE, Broken pipe

Thread 3 "MQTTAsync_rcv" received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7fadfcb1a0 (LWP 16664)]
0x0000007fb754776c in __GI___writev (fd=<optimized out>, iov=0x7fadfca398, iovcnt=<optimized out>)
    at ../sysdeps/unix/sysv/linux/writev.c:26
26    ../sysdeps/unix/sysv/linux/writev.c: No such file or directory.
(gdb) bt
#0  0x0000007fb754776c in __GI___writev (fd=<optimized out>, iov=0x7fadfca398, iovcnt=<optimized out>)
    at ../sysdeps/unix/sysv/linux/writev.c:26
#1  0x0000007fb7b781e8 in Socket_writev () at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#2  0x0000007fb7b783ec in Socket_putdatas ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#3  0x0000007fb7b84e94 in WebSocket_putdatas ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#4  0x0000007fb7b731fc in MQTTPacket_send ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#5  0x0000007fb7b73b08 in MQTTPacket_send_disconnect ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#6  0x0000007fb7b6b898 in MQTTAsync_closeOnly ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#7  0x0000007fb7b6b958 in MQTTAsync_closeSession ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#8  0x0000007fb7b6972c in nextOrClose () at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#9  0x0000007fb7b6c3f4 in MQTTProtocol_closeSession ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#10 0x0000007fb7b7197c in MQTTProtocol_keepalive ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#11 0x0000007fb7b6c694 in MQTTAsync_retry ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#12 0x0000007fb7b6da08 in MQTTAsync_cycle ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#13 0x0000007fb7b6a6c0 in MQTTAsync_receiveThread ()
    at /home/user/app/target/lib/libpaho-mqtt3as.so.1
#14 0x0000007fb7f69088 in start_thread (arg=0x7fffffd06f) at pthread_create.c:463
#15 0x0000007fb754f0cc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb)

icraggs commented 1 year ago

You shouldn't get SIGPIPEs as there is a call to

 signal(SIGPIPE, SIG_IGN);

in Socket_outInitialize. The return value of this call isn't currently checked, so it may have failed on your (ARM?) platform. There is another call to ignore these signals in Socket_new, which is compiled in when NOSIGPIPE is defined.

smalik007 commented 1 year ago

Yes, I thought so too: it should be ignored, but sadly it isn't. So if I define NOSIGPIPE, it should be ignored correctly, right?

icraggs commented 1 year ago

If the system call works, yes. I would check the error return from the signal(SIGPIPE, SIG_IGN); call to see if that provides any useful information.
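A minimal sketch of that check, assuming a POSIX system. The helper name `ignore_sigpipe_checked` is hypothetical and not part of paho.mqtt.c. Besides testing the return value of `signal`, it reads the disposition back with `sigaction`, which can show whether SIGPIPE is still ignored at the time of the call:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#endif

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of paho.mqtt.c: ask for SIGPIPE to be
 * ignored and verify that the request took effect.
 * Returns 0 on success, -1 on failure. */
int ignore_sigpipe_checked(void)
{
    struct sigaction current;

    /* signal() reports failure by returning SIG_ERR and setting errno. */
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR) {
        fprintf(stderr, "signal(SIGPIPE, SIG_IGN) failed: %s\n", strerror(errno));
        return -1;
    }

    /* Passing NULL as the new action only queries the current disposition. */
    if (sigaction(SIGPIPE, NULL, &current) != 0) {
        fprintf(stderr, "sigaction query failed: %s\n", strerror(errno));
        return -1;
    }
    if (current.sa_handler != SIG_IGN) {
        fprintf(stderr, "SIGPIPE is not ignored\n");
        return -1;
    }
    return 0;
}
```

The signal disposition is shared by the whole process, so running the same query later (for example just before a reconnect) could reveal whether other code in the application has installed a different SIGPIPE handler in the meantime.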

smalik007 commented 1 year ago

I tried defining NOSIGPIPE, but the crash is still reproducible.

icraggs commented 1 year ago

You need to look at the return values from

signal(SIGPIPE, SIG_IGN);

and

setsockopt(sock, SOL_SOCKET, SO_NOSIGPIPE, (void*)&opt, sizeof(opt))

and what the errno codes are if a call fails (signal returns SIG_ERR; setsockopt returns -1). I think they must be failing for this behaviour to occur. I don't have your environment to try it out.
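The `setsockopt` half of that check could look like the sketch below. The helper name `set_nosigpipe_checked` is hypothetical. One assumption worth stating: `SO_NOSIGPIPE` is not available on every platform (as far as I know it is a BSD/macOS option that Linux headers do not define), so the sketch guards it with `#ifdef` and reports `ENOPROTOOPT` when the option is missing:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of paho.mqtt.c: set SO_NOSIGPIPE on a
 * socket and report why it failed, if it did.
 * Returns 0 on success, -1 with errno set on failure. */
int set_nosigpipe_checked(int sock)
{
#ifdef SO_NOSIGPIPE
    int opt = 1;

    if (setsockopt(sock, SOL_SOCKET, SO_NOSIGPIPE, (void*)&opt, sizeof(opt)) == -1) {
        int saved = errno;  /* fprintf may overwrite errno */
        fprintf(stderr, "setsockopt(SO_NOSIGPIPE) failed: %s (errno %d)\n",
                strerror(saved), saved);
        errno = saved;
        return -1;
    }
    return 0;
#else
    /* The option does not exist in this platform's headers. */
    (void)sock;
    fprintf(stderr, "SO_NOSIGPIPE is not defined on this platform\n");
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

If the `#else` branch is the one that compiles on the affected machine, the socket option cannot be the mechanism that suppresses the signal there, which leaves the process-wide `signal(SIGPIPE, SIG_IGN)` call as the only protection.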

bulpper commented 2 months ago

I ran into the same problem with paho.mqtt.c version 1.3.13.

It looks like signal(SIGPIPE, SIG_IGN) doesn't actually prevent the broken pipe signal when write or send is called on a socket whose connection is closing or already closed.

A workaround might be to use send with the MSG_NOSIGNAL flag instead. I have already made a fork, for testing purposes, to fix this on our system: https://github.com/eclipse/paho.mqtt.c/compare/master...bulpper:paho.mqtt.c:sigpipefix
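To illustrate the idea (this is a sketch under stated assumptions, not the code from the linked fork): the backtrace above shows the write going through `writev`, and `sendmsg` is the call in the `send` family that accepts the same `iovec` array together with flags. The wrapper name `writev_nosignal` is hypothetical, and `MSG_NOSIGNAL` is assumed to be available, as it is on Linux:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose MSG_NOSIGNAL under strict -std= modes */
#endif

#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical stand-in for writev() on a socket, not part of paho.mqtt.c.
 * With MSG_NOSIGNAL, writing to a closed connection fails with
 * -1 / errno == EPIPE instead of raising SIGPIPE. */
ssize_t writev_nosignal(int sock, struct iovec *iov, int iovcnt)
{
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = iovcnt;

#ifdef MSG_NOSIGNAL
    return sendmsg(sock, &msg, MSG_NOSIGNAL);
#else
    /* No per-call flag on this platform; SIGPIPE must be handled elsewhere. */
    return sendmsg(sock, &msg, 0);
#endif
}
```

Unlike `signal(SIGPIPE, SIG_IGN)`, the flag applies per call, so it does not depend on the process-wide signal disposition staying unchanged. The caller still has to treat `EPIPE` as a normal "connection closed" error.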