Drolla / WavePlus_Bridge

Airthings Wave Plus Bridge to Wifi/LAN and MQTT
MIT License

MQTT publish makes the WavePlus bridge hanging #3

Closed by Drolla 2 years ago

Drolla commented 2 years ago

Since the MQTT publishing feature was added and enabled, the WavePlus bridge has hung a few times. After disabling the feature again (in the YAML configuration file), the issue no longer occurred, which strongly suggests that the problem is related to the MQTT publishing feature.

rkoshak commented 2 years ago

I've observed similar behavior. If it's related to MQTT, I wonder if it has something to do with the network loop. I didn't see anywhere in the code where the client is created or where loop, loop_start or loop_forever is called. It's been a while since I've used paho-mqtt, but the last time I did, I had to manage the MQTT pub/sub loop thread to ensure that messages got sent. I can't say whether that has anything to do with the problem. When I have time I'll run some experiments.

rkoshak commented 2 years ago

I threw in some logging and it appears to get stuck when mqtt_msgs is empty by the time you get down to the call to publish the messages. It appears paho-mqtt's publish.multiple can't handle an empty list of messages to send. Adding a test to skip the publish when there are no messages to send appears to avoid the hanging.
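A minimal sketch of the guard described above, assuming the messages are collected in a list named `mqtt_msgs` as in the comment. The helper name `publish_if_any` and its injected `publish_multiple` parameter are hypothetical; they only keep the example runnable without a broker. In the bridge, `paho.mqtt.publish.multiple` would be the function passed in.

```python
def publish_if_any(mqtt_msgs, publish_multiple, **broker_kwargs):
    """Publish mqtt_msgs unless the list is empty.

    Returns True if a publish was attempted, False if it was skipped.
    """
    if not mqtt_msgs:
        # Nothing to send: skip the call that was observed to hang.
        return False
    publish_multiple(mqtt_msgs, **broker_kwargs)
    return True
```

With paho-mqtt installed, a call might look like `publish_if_any(mqtt_msgs, paho.mqtt.publish.multiple, hostname="localhost")`, where the hostname is a placeholder.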

Drolla commented 2 years ago

Thanks for debugging this issue and committing a solution, @rkoshak. The fix turned out to be much simpler than I thought. I had already explored using a threaded version of the MQTT publishing procedure, to keep some control in case that thread hangs. But your solution is of course a much better one! Please give me a few days to validate whether your fix resolves all the issues I have seen. Afterwards I will merge your pull request into the master tree.

rkoshak commented 2 years ago

It's been running well for me for about three hours now, so hopefully it's all good.

Personally, ultimately I'd like to see an actual persistent connection made to the broker so we can take advantage of things like the LWT and such. But for now this seems to meet my needs and I can monitor whether the program is up and reporting in other ways.

I've a pretty self contained MQTT communicator class in one of my repos if you ever want to look at an example (or reuse it). https://github.com/rkoshak/sensorReporter/blob/main/mqtt/mqtt_conn.py

Drolla commented 2 years ago

Yep, it has now been running on my side for several days without any problems too. Thanks again for your contribution, @rkoshak - I have merged your updates into the main branch. Sorry for my ignorance of the details of the MQTT protocol, but I see that the publish.multiple method also has an optional 'will' argument. Is this different from the LWT you mention? What other advantages would a persistent connection to a broker have in our present case, where publishing is only required every minute and no subscription is needed? Thanks for helping me understand this better.

rkoshak commented 2 years ago

Honestly, I'm not sure what that LWT option will do. The way LWT works is that when the client connects to the broker, it registers an LWT topic and message. The broker then sets up a heartbeat (the MQTT keep-alive) between itself and the client. If the client fails to respond to the heartbeat, the broker publishes the LWT message to the LWT topic.

Thus the LWT message indicates whether the client is online or offline.

With the fire and forget approach currently used I don't see how the LWT message can do anything useful because there is no persistent connection between the client and the broker. The offline message would be published every time a set of messages are sent.

The advantage of the LWT is that other clients get pretty immediate information as to whether the bridge is online or not. In my case I can program some remedial actions to take place to restore connectivity.

Without this I'd just have to assume that if it doesn't publish for too long it's offline. But that's more work as I have to create a watchdog process instead of using something already built into MQTT.

To do this a persistent MQTT connection would have to be set up though. As part of that I'd also publish an online message to the LWT topic and both the online and offline messages are retained. That way a client can know the online status even if it wasn't connected to the broker when those messages were sent.
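The persistent connection described above could be sketched roughly as follows. This is not the bridge's code: the topic name, the payloads `online`/`offline`, and the helper `start_with_lwt` are assumptions for illustration. The `client` argument is expected to be a `paho.mqtt.client.Client` created by the caller, since the constructor arguments differ between paho-mqtt versions.

```python
STATUS_TOPIC = "waveplus_bridge/status"  # hypothetical topic name


def start_with_lwt(client, hostname, port=1883, keepalive=60,
                   status_topic=STATUS_TOPIC):
    """Connect `client` persistently with a retained online/offline status."""

    def on_connect(client, userdata, *args):
        # Runs on every (re)connect; *args absorbs the extra callback
        # arguments, which differ between paho-mqtt API versions.
        client.publish(status_topic, payload="online", qos=1, retain=True)

    client.on_connect = on_connect
    # The will must be registered before connect(); the broker publishes it
    # only if the connection is lost without a clean disconnect.
    client.will_set(status_topic, payload="offline", qos=1, retain=True)
    client.connect(hostname, port, keepalive)
    # Background network thread: handles keep-alive pings and reconnects.
    client.loop_start()
```

Because both status messages are retained, a client that subscribes later still receives the most recent one. The sensor readings could then be sent with `client.publish` on the same connection instead of opening a new one every minute.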

On a related note, I'd also publish a timestamp at the time messages are published and publish all the messages with retained true. That way every client subscribing to the topics will always get the last readings and know how old they are.
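As a rough illustration of the timestamp idea, the message list handed to paho-mqtt's `publish.multiple` could be built like this. The function name `build_retained_msgs`, the topic layout, and the `timestamp` sub-topic are assumptions; the dictionary keys (`topic`, `payload`, `qos`, `retain`) follow the message format that `publish.multiple` accepts.

```python
import time


def build_retained_msgs(readings, base_topic, timestamp=None):
    """Return retained messages for each reading plus one timestamp message."""
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    msgs = [
        {"topic": f"{base_topic}/{name}", "payload": str(value),
         "qos": 0, "retain": True}
        for name, value in readings.items()
    ]
    # Retained as well, so late subscribers can tell how old the readings are.
    msgs.append({"topic": f"{base_topic}/timestamp",
                 "payload": str(timestamp), "qos": 0, "retain": True})
    return msgs
```

A subscriber that connects between two publishing cycles would then receive the last readings together with the time they were taken.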

There is a new feature in the MQTT spec (the message expiry interval, introduced in MQTT 5) that lets one time out a retained message, which would be even better than a timestamp.

If I were really ambitious, I'd use the Homie library and specification. That's a standard convention for how a device uses MQTT, which means that other software, such as openHAB or Home Assistant, can automatically detect and configure the device.

Drolla commented 2 years ago

Yep, the LWT option in combination with the publish.single and publish.multiple methods of the Paho MQTT library may not make any sense. I tried to get the broker to publish the LWT message, without any success. So I don't know why these methods support the Will/LWT parameter. Publishing the timestamp as well is a good idea. Regarding the Homie library/specification, I will have a look into that. I have captured your inputs in a new improvement request (https://github.com/Drolla/WavePlus_Bridge/issues/6).