eclipse / paho.mqtt.python

subscriptions stop receiving data #737

Open MarcEngrie opened 12 months ago

MarcEngrie commented 12 months ago

I have been struggling with a problem for over 2 months now, debugging and changing code to work around it, but so far without success. That is why I am posting it here.

Setting: I am running a Python (3.9.2) application on an RPi 4B 2GB running Debian GNU/Linux 11 (bullseye), Linux aarch64 V6.1.21-v8+, where I use the package paho-mqtt V1.6.1. My mosquitto server runs on a Windows PC. I have 6 devices, each publishing some data every (60 sec + random(60 sec)). Graphing this data with Grafana+Telegraf (subscribing to the topics) works like a charm. No data is lost.

My Python app connects and subscribes to these topics using e.g. mqtt_subscribe(co2_topicbase + "/#") to capture all (at the moment 5) CO2 sensors sending data. I use # because CO2 sensors might come and go, so I do not need to change the application at every change in sensors. (NB: the other sensor is a waterflow sensor, but the setup is the same.)

I configure the paho client this way: create the instance with ID, clean_session = False, login and password; set the will (will_set); set the on_... callbacks; loop_start(); connect. In on_connect, I subscribe to the topics. In on_message, I figure out which topic came in and store the payload data in a dictionary, then set a flag that data came in. I have an infinite while loop where I display some data and, when the flag is true, handle the data, which includes publishing it to the MQTT server on another topic.

This will work for 1 hour, 5 hours, 12 hours, until suddenly the app stops receiving data. There is no disconnection from the MQTT server, as I see no notification of disconnection from on_disconnect in my log. My last program change was to monitor the time between 2 data arrivals. When it is over 5 minutes, I reconnect to the server. But even this does not cure the problem. I also reboot every midnight (to clear memory in case of any memory leak), but this brings no improvement either.

What am I doing wrong? What could be the cause of this 'behaviour'? Are there any settings I need to change?
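For readers following along, the "monitor the time between 2 data arrivals" idea could look roughly like the sketch below. The class name `StaleDataWatchdog` and its parameters are hypothetical, not taken from the original application; only the 5-minute threshold comes from the post.

```python
import time


class StaleDataWatchdog:
    """Tracks the time since the last message and reports when it exceeds a limit.

    Hypothetical helper for illustration; not part of paho-mqtt.
    """

    def __init__(self, max_silence_s=300.0, clock=time.monotonic):
        self._max_silence_s = max_silence_s
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_seen = clock()

    def touch(self):
        """Call from on_message whenever data arrives."""
        self._last_seen = self._clock()

    def is_stale(self):
        """True when no data has arrived for longer than max_silence_s."""
        return self._clock() - self._last_seen > self._max_silence_s
```

In the main loop this would be used as `if watchdog.is_stale(): client.reconnect(); watchdog.touch()`. As the post notes, a reconnect alone did not cure the problem here, so this is a detection aid rather than a fix.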

CamDavidsonPilon commented 12 months ago

An idea: is it possible that your on_message is freezing up / failing and blocking? How simple is the on_message callback?
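One way to rule this out is to keep on_message as small as possible and do the real work elsewhere. The following is a minimal sketch, assuming the paho-mqtt 1.x callback signature; `inbox`, `drain` and `handle` are hypothetical names, not from the original app.

```python
import queue

# Hypothetical hand-off queue between paho's network thread and the main loop.
inbox = queue.Queue()


def on_message(client, userdata, msg):
    """Runs on paho's network thread, so it only enqueues and returns."""
    inbox.put((msg.topic, msg.payload))


def drain(handle):
    """Process all queued messages on the caller's thread.

    `handle(topic, payload)` is the application's own processing function.
    Returns the number of messages taken off the queue.
    """
    taken = 0
    while True:
        try:
            topic, payload = inbox.get_nowait()
        except queue.Empty:
            return taken
        try:
            handle(topic, payload)
        except Exception as exc:
            # A failing handler is reported here instead of inside the callback.
            print(f"handler failed for {topic}: {exc!r}")
        taken += 1
```

Calling `drain(...)` from the main while loop means a slow or failing handler cannot stall the thread that reads from the socket.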

skorokithakis commented 11 months ago

I have exactly the same problem, on every one of my services. They will eventually stop receiving data without any sign of disconnection.

Eventually I captured a log, and this is what the on_log callback prints (userdata, level, buf, in order):

None 16 Sending PINGREQ
None 16 Received PINGRESP
None 16 Sending PINGREQ
None 16 Received PINGRESP
None 16 Sending PINGREQ
None 16 Received PINGRESP
None 16 Sending PINGREQ
None 16 Received PINGRESP
None 16 Sending PINGREQ
None 16 Received PINGRESP

It's still sending PINGREQ and receiving PINGRESP, so the connection is alive, but no MQTT messages arrive at all.
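For anyone reading the log output: the `16` is paho's debug level (`MQTT_LOG_DEBUG`, 0x10). A small sketch of an on_log callback that prints the level name instead of the number is below. The level values mirror the constants in `paho.mqtt.client`; `format_log` and the logger name `"mqtt"` are hypothetical.

```python
import logging

# Level values as defined in paho.mqtt.client (MQTT_LOG_INFO ... MQTT_LOG_DEBUG).
PAHO_LEVEL_NAMES = {
    0x01: "INFO",
    0x02: "NOTICE",
    0x04: "WARNING",
    0x08: "ERR",
    0x10: "DEBUG",
}


def format_log(level, buf):
    """Turn paho's numeric level into a readable prefix; unknown levels pass through."""
    return f"{PAHO_LEVEL_NAMES.get(level, level)} {buf}"


def on_log(client, userdata, level, buf):
    """paho-mqtt 1.x on_log signature; forwards to the standard logging module."""
    logging.getLogger("mqtt").debug(format_log(level, buf))
```

Assign it with `client.on_log = on_log`. Logging with timestamps makes it easier to line up the last received PUBLISH with the broker log.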

A thought occurred just now: When MQTT reconnects, does it resubscribe? I just noticed that the subscription message is in on_connect of one of my apps, but in the constructor of the other.

I've had this issue for years now.

MattBrittan commented 6 months ago

"MQTT reconnects, does it resubscribe"

The subscription may be part of the session state (if various criteria, e.g. clean_session, are met; these differ between v3 and v5); however even in this case it may not always survive (e.g. broker not storing session to disk). This is why we recommend subscribing in on_connect (as per the examples).
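A minimal sketch of that recommendation, assuming paho-mqtt 1.x with MQTT v3.1.1 (the callback signature differs for v5 and for the 2.x API). The topic filter and the `build_client` helper are hypothetical; the thread only names the variable `co2_topicbase`.

```python
# Hypothetical topic filter and QoS, for illustration only.
SUBSCRIPTIONS = [("sensors/co2/#", 1)]


def on_connect(client, userdata, flags, rc):
    """Called on every successful or failed (re)connect, so subscriptions are renewed."""
    if rc != 0:
        print(f"connect failed, rc={rc}")
        return
    for topic_filter, qos in SUBSCRIPTIONS:
        client.subscribe(topic_filter, qos)


def build_client(host, client_id):
    """Wire the callback up. Assumes paho-mqtt 1.x is installed."""
    # Imported here so on_connect above can be exercised without the library.
    import paho.mqtt.client as mqtt

    client = mqtt.Client(client_id=client_id, clean_session=False)
    client.on_connect = on_connect
    client.connect(host, 1883, keepalive=60)
    client.loop_start()
    return client
```

Because on_connect fires again after an automatic reconnect, the subscription is restored even when the broker has lost the session state.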

@MarcEngrie are you still seeing this issue with the current release? Is there any chance you could share some of your code (or logs). Broker logs (log_type all) for around the time the messages stop would also be very useful. Unfortunately this kind of issue can be very difficult to track down and the more info you can provide the better (it's quite possible that one of the fixes in the upcoming release will help).

danclimasevschi commented 5 months ago

@MarcEngrie @skorokithakis what version of paho-mqtt are you using?

I am also having the same randomly occurring problem, but apparently it happens less often in 1.5.0 than in 1.6.1.

FWIW: It happens only on topics with "higher" traffic, while other subscriptions continue to receive data as normal.

MarcEngrie commented 5 months ago

@skorokithakis I am using 1.6.1, and downgrading is currently not an option. How do you define 'higher traffic'? FYI: I 'only' post data (5 topics) every minute. I do not consider this 'high traffic'.

danclimasevschi commented 5 months ago

@skorokithakis tens, if not hundreds, of messages per second