Closed stintel closed 9 months ago
Ah, OK. That's something I can track down. Perhaps I will make a small example that uses the callbacks in a similar manner, and see if that recreates the problem. If not, it would still be a good example!
Quick update... I was able to recreate the crash. I started with the _dynsubscribe.rs example, then trimmed it down to the basics of using callbacks for reconnecting and sure enough, upon successful reconnect, it segfaults. I'll see if I can get it fixed and hopefully released this week.
Just a quick update... It seems like the C library is being thrown off by the call to re-connect coming from within the connection-lost handler.
From what I can tell, the C lib fires the "lost" callback early in the process of unwinding its connection status. By trying to make a new connection at that point seems to mess up its internal client data structure so that it erroneously calls an onFailure callback for the old connection using the Token
for the new connection attempt, thus consuming the new token. When it later fails the new connection attempt, it fires the failure callback a second time using the address of the token that was just consumed, but now that address is now invalid. Which will result in either a segfault (as the token is accessed) or double free (when it is free'd a second time).
I assume this is something that the C lib should be allowing us to do - calling connect() from within a connection-lost callback. So for now, I'm assuming it's a bug and opened an issue in that repo. https://github.com/eclipse/paho.mqtt.c/issues/1341
I'm waiting on Paho C, so will need to put this off to the next release, which I will put out as soon as that library is updated, or I can figure a workaround.
I'm waiting on Paho C, so will need to put this off to the next release, which I will put out as soon as that library is updated, or I can figure a workaround.
Understood. Meanwhile I can use v0.11 so it's not urgent for me.
If I rollback to v0.11 the subscriber gives me ERROR [-16] Wrong MQTT version
It looks like this is fixed with the upcoming v0.12.3 release. At least I can't recreate it in testing. If you still see it with that version, please feel free to reopen.
I'm hitting
free(): double free detected in tcache 2
with 0.12 after the set_connection_lost_callback, when reconnecting to a non-SSL broker. When reconnecting to an SSL broker, I'm getting a segfault. Neither of these crashes occur with 0.11.This happens with https://codeberg.org/stintel/auditd-forwarder on Gentoo, both using glibc and musl. I have downgraded paho-mqtt to 0.11 as described in https://codeberg.org/stintel/auditd-forwarder/commit/39d269646107a5b5ed326fcf46809f43e4f20268. To reproduce it, changing the dependency to 0.12.0, connecting to an MQTT broker, then restarting that MQTT broker should trigger the issue. It's hardcoded to use SSL but you could change that in https://codeberg.org/stintel/auditd-forwarder/src/branch/main/src/main.rs#L223. Example config for the app: https://codeberg.org/stintel/auditd-forwarder/src/branch/main/src/config.rs#L71-L77 - this should go in /etc/auditd-forwarder.toml. You don't need to be running auditd, you can just
nc -lkU /tmp/unix.sock
and use that path in the config.