eclipse / paho.mqtt.python

Client stops receiving messages, even though the broker is alive #626

Closed BasKloet closed 8 months ago

BasKloet commented 2 years ago

Hello, I have a Python script that listens for MQTT messages from my thermostat (using paho) and updates my home automation accordingly. Both the program that creates the MQTT messages and the script that handles them run as Linux services on the same Raspberry Pi. They work very well together in most situations, but I have noticed the following problems:

  1. Every day at around midnight my listener script stops receiving MQTT messages, even though the other program keeps publishing them and the broker does not restart.
  2. If I restart the service that sends the MQTT messages, the receiving script will not receive the MQTT messages that are published after the restart, even though the messages are published to the same topics after the restart and the broker is unchanged.

I have a simple workaround that restarts the listener service at midnight and whenever the sending service restarts, but I feel that there is probably an underlying problem in my script that I should actually fix. My script is as follows:

#!/usr/bin/env python3.8
# -*- coding: utf-8 -*-
#
import random
import requests
import signal
import sys
from datetime import datetime

from paho.mqtt import client as mqtt_client

broker = 'localhost'
port = 1883
topic = "evohome/evogateway/ctl_controller/#"
# generate client ID with pub prefix randomly
client_id = f'python-mqtt-{random.randint(0, 100)}'
domoticz_url = "http://192.168.2.50:8080"
system_mode_dict = {
  "heat_off": 0,
  "auto": 10,
  "eco_boost": 20,
  "away": 30,
  "day_off": 40,
  "day_off_eco": 50,
  "auto_with_reset": 60,
  "custom": 70, 
}

def sigterm_handler(_signo, _stack_frame):
    # Raises SystemExit(0):
    sys.exit(0)

def connect_mqtt() -> mqtt_client.Client:
    def on_connect(client, userdata, flags, rc):
        if rc == 0:
            print("Connected to MQTT Broker!")
        else:
            print(f"Failed to connect, return code {rc}")

    client = mqtt_client.Client(client_id)
    client.on_connect = on_connect
    client.connect(broker, port)
    return client

def subscribe(client: mqtt_client.Client):
    def on_setpoint_message(client, userdata, msg):
        dt = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
        #print(f"{dt} - Setpoint: `{msg.payload.decode()}` from `{msg.topic}` topic")
        response = requests.get(f"{domoticz_url}/json.htm?type=command&param=setsetpoint&idx=245&setpoint={msg.payload.decode()}")  

    def on_temp_message(client, userdata, msg):
        dt = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
        #print(f"{dt} - Temperature: `{msg.payload.decode()}` from `{msg.topic}` topic")
        response = requests.get(f"{domoticz_url}/json.htm?type=command&param=udevice&idx=246&nvalue=0&svalue={msg.payload.decode()}")   

    def on_system_mode_message(client, userdata, msg):
        dt = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
        print(f"{dt} - System mode: `{msg.payload.decode()}` from `{msg.topic}` topic")
        url = f"{domoticz_url}/json.htm?type=command&param=switchlight&idx=247&switchcmd=Set%20Level&level={system_mode_dict[msg.payload.decode()]}"
        print(url)
        response = requests.get(url)    
        print(response)

    client.subscribe(topic)
    client.message_callback_add("evohome/evogateway/ctl_controller/temperature/temperature", on_temp_message)
    client.message_callback_add("evohome/evogateway/ctl_controller/setpoint/setpoint", on_setpoint_message)
    client.message_callback_add("evohome/evogateway/ctl_controller/system_mode/system_mode", on_system_mode_message)

def run():
    client = connect_mqtt()
    subscribe(client)
    client.loop_forever()

if __name__ == '__main__':
    signal.signal(signal.SIGTERM, sigterm_handler)
    try:
        run()
    except KeyboardInterrupt:
        print('Interrupted')
        sys.exit(0)

It's not the prettiest script in the world, but I'd like to get it fully functioning and robust before I clean it up. Can anyone help me point out what I'm doing wrong or give me tips on how to analyze this problem further?

Thanks!

schef commented 2 years ago

Hi guys, I have a similar issue: my client continues to publish messages but stops receiving them. It is started through systemd, and a restart fixes the problem.

Here is my source code.

The only logs I get look like this:

Dec 19 21:53:48.463784 alarmpi python[292]: [19.12 21:53:48.462] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:53:50.104132 alarmpi python[292]: [19.12 21:53:50.102] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:53:51.855444 alarmpi python[292]: [19.12 21:53:51.846] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:53:53.584828 alarmpi python[292]: [19.12 21:53:53.583] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:53:55.373861 alarmpi python[292]: [19.12 21:53:55.372] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:53:57.115493 alarmpi python[292]: [19.12 21:53:57.106] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:53:58.885446 alarmpi python[292]: [19.12 21:53:58.875] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:00.594539 alarmpi python[292]: [19.12 21:54:00.593] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:02.342644 alarmpi python[292]: [19.12 21:54:02.341] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:04.064335 alarmpi python[292]: [19.12 21:54:04.063] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:05.784226 alarmpi python[292]: [19.12 21:54:05.782] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:07.484318 alarmpi python[292]: [19.12 21:54:07.483] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:09.265547 alarmpi python[292]: [19.12 21:54:09.253] [INF] [MQTT]: CONNACK received with code Success.
Dec 19 21:54:10.994314 alarmpi python[292]: [19.12 21:54:10.993] [INF] [MQTT]: CONNACK received with code Success.

MattBrittan commented 2 years ago

I'd suggest moving the call to subscribe into on_connect (see the example in the readme).

If the connection goes down and the client reconnects then any existing subscriptions will be lost (clean_session defaults to True so subscriptions will not survive a loss of connection). Calling subscribe from the on_connect callback means that the subscription will be reinstated following reconnection. See this answer from Roger.
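A minimal sketch of that suggestion, assuming the paho-mqtt 1.x callback signature used in the original script (paho-mqtt 2.x additionally expects a callback_api_version argument when constructing the client). The topic is taken from the script above; the run() helper and its defaults are only illustrative.

```python
TOPIC = "evohome/evogateway/ctl_controller/#"  # topic from the original script


def on_connect(client, userdata, flags, rc):
    """Called after every successful (re)connection, not only the first one."""
    if rc == 0:
        print("Connected to MQTT broker")
        # Subscribing here restores the subscription after each reconnect.
        client.subscribe(TOPIC)
    else:
        print(f"Failed to connect, return code {rc}")


def run(broker="localhost", port=1883):
    # Imported lazily so the callback above can be exercised without paho installed.
    from paho.mqtt import client as mqtt_client

    client = mqtt_client.Client()  # assumption: paho-mqtt 1.x constructor
    client.on_connect = on_connect
    client.connect(broker, port)
    client.loop_forever()  # reconnects automatically after a lost connection
```

Calling run() starts the listener. The per-topic handlers registered with message_callback_add are stored on the client object, so they should not need to be registered again after a reconnect; only the subscription itself does.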

schef commented 2 years ago

Thanks. I have updated the code and will see what happens. :)

fffonceca commented 10 months ago

@MattBrittan Hi! Keeping clean_session=False should do the trick of maintaining previous subscriptions. So why, after a long period of time, does the client stop receiving messages while the broker appears to be OK?

MattBrittan commented 10 months ago

So why, after a long period of time, does the client stop receiving messages while the broker appears to be OK?

In theory a QoS 1+ subscription can last forever (MQTT v3) if you always connect with clean_session=False. However, real life does not always match up to theory: things go wrong (e.g. the broker is restarted and fails to load sessions from storage) and the spec is not always followed (sometimes for good reason; many brokers limit how long session state is retained).
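As a rough illustration of the point above, here is a sketch that requests a persistent session but still subscribes again whenever the broker reports that no stored session exists. It assumes paho-mqtt 1.x with MQTT v3.1.1, where flags is a dict containing the key "session present". The fixed client ID is a hypothetical name; a stable ID is needed because the broker stores session state per client ID, so a randomly generated ID would never match an earlier session.

```python
TOPIC = "evohome/evogateway/ctl_controller/#"  # topic from the original script
CLIENT_ID = "domoticz-listener"  # hypothetical fixed ID, not from the thread


def session_was_restored(flags):
    """True if the broker reports that it still holds this client's session."""
    return bool(flags.get("session present"))


def on_connect(client, userdata, flags, rc):
    if rc != 0:
        print(f"Failed to connect, return code {rc}")
        return
    if session_was_restored(flags):
        print("Broker restored the previous session")
    else:
        # The session was lost or expired, so the subscription is gone too.
        print("No stored session, subscribing again")
        client.subscribe(TOPIC, qos=1)


def run(broker="localhost", port=1883):
    # Imported lazily so the callbacks can be exercised without paho installed.
    from paho.mqtt import client as mqtt_client

    # assumption: paho-mqtt 1.x constructor
    client = mqtt_client.Client(client_id=CLIENT_ID, clean_session=False)
    client.on_connect = on_connect
    client.connect(broker, port)
    client.loop_forever()
```

Subscribing unconditionally in on_connect, as suggested earlier in the thread, is the simpler option; the check here mainly makes it visible in the logs when a broker has dropped the session.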

If you have a specific problem, I'd suggest raising it in a separate issue (or ideally somewhere like Stack Overflow if it's not a bug in this client).

MattBrittan commented 8 months ago

Closing as it looks like an answer was provided (and the question has been inactive for some time).