icanos / hassio-plejd

Hass.io add-on for Plejd home automation devices
Apache License 2.0

[Feature Request/Discussion] Healthcheck/ping #186

Open JohnLindahlTech opened 3 years ago

JohnLindahlTech commented 3 years ago

TL;DR: Can we have some kind of MQTT-based health-check to see if everything is up and connected?

I am hardening my monitoring of my home network and home automations, because I've had problems with services going down without prior notice.

My optimal flow would in this case be:

I have found getAvailabilityTopic, but I see a few problems with using it for my use case:

  1. Sending is initiated by hassio-plejd, which makes it a passive check, and passive checks carry less credibility with the monitoring service.
  2. To get the actual topic name I need to know a device ID, which is not really feasible for a system-wide monitoring endpoint.
  3. (I have not determined whether this topic actually reflects whether Plejd and BLE are working, or whether messages are only sent when the MQTT connection goes online/offline.)
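To make the difference between a passive and an active check concrete, here is a minimal sketch of a ping/pong exchange initiated by the monitoring side. Everything in it is an assumption for illustration: the topic names `plejd/health/ping` and `plejd/health/pong`, the `nonce:status` payload format, and the in-memory bus that stands in for a real MQTT broker. None of this exists in hassio-plejd today.

```python
import uuid
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical topic names; nothing like them exists in hassio-plejd today.
PING_TOPIC = "plejd/health/ping"
PONG_TOPIC = "plejd/health/pong"


class InMemoryBus:
    """Tiny synchronous stand-in for an MQTT broker (no retain, no QoS)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def unsubscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].remove(handler)

    def publish(self, topic: str, payload: str) -> None:
        # Iterate over a copy so handlers may subscribe/unsubscribe safely.
        for handler in list(self._subscribers[topic]):
            handler(payload)


def install_responder(bus: InMemoryBus, ble_is_connected: Callable[[], bool]) -> None:
    """Add-on side: answer each ping with the same nonce plus the BLE state."""

    def on_ping(nonce: str) -> None:
        status = "ok" if ble_is_connected() else "ble_down"
        bus.publish(PONG_TOPIC, f"{nonce}:{status}")

    bus.subscribe(PING_TOPIC, on_ping)


def active_check(bus: InMemoryBus) -> str:
    """Monitoring side: send a ping and report the matching reply, if any."""
    nonce = uuid.uuid4().hex
    replies: List[str] = []
    bus.subscribe(PONG_TOPIC, replies.append)
    try:
        # The fake bus delivers synchronously; with a real broker the caller
        # would wait here with a timeout before giving up.
        bus.publish(PING_TOPIC, nonce)
    finally:
        bus.unsubscribe(PONG_TOPIC, replies.append)
    for reply in replies:
        reply_nonce, _, status = reply.partition(":")
        if reply_nonce == nonce:
            return status
    return "no_reply"
```

Because the check uses one fixed pair of topics, the monitoring service does not need to know any device ID, and a missing reply is itself a signal that the add-on or the broker is down.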

See this issue as a discussion of whether the feature would fit this repo (it would be nice for me not to have to maintain a diverging fork 😉).

JohnLindahlTech commented 3 years ago

Just saw @SweVictor's issue #170 which I think goes hand in hand with the train of thought I am having here.

SweVictor commented 3 years ago

I agree the current availability feature is a bit broken (it only switches to unavailable when the add-on shuts down). To me the best way would be to link it to the periodic ping of the BLE network that already occurs. Whenever the BLE ping fails, the write queue loop is stopped; we could then also send unavailable.

With the latest PRs, available/unavailable messages should be retained, so you could get them at any time from the MQTT broker.
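A rough sketch of that idea, under stated assumptions: the tracker below is fed the result of each BLE ping and publishes a retained message only when the state changes. The class name, the system-wide topic `plejd/availability`, the `online`/`offline` payloads, and the `publish(topic, payload, retain)` callable are all hypothetical and are not taken from the add-on's code.

```python
from typing import Callable, Optional

# Hypothetical system-wide topic; not an existing hassio-plejd topic.
AVAILABILITY_TOPIC = "plejd/availability"


class AvailabilityTracker:
    """Turns a stream of BLE ping results into retained availability messages."""

    def __init__(self, publish: Callable[[str, str, bool], None]) -> None:
        # publish(topic, payload, retain) is injected so any MQTT client fits.
        self._publish = publish
        self._available: Optional[bool] = None  # unknown until the first ping

    def on_ping_result(self, success: bool) -> None:
        if success == self._available:
            return  # no state change, so nothing to publish
        self._available = success
        payload = "online" if success else "offline"
        self._publish(AVAILABILITY_TOPIC, payload, True)  # retain=True


# Usage: record what would be sent for a short run of ping results.
sent = []
tracker = AvailabilityTracker(lambda topic, payload, retain: sent.append((topic, payload, retain)))
for result in [True, True, False, False, True]:
    tracker.on_ping_result(result)
print([payload for _, payload, _ in sent])
```

Publishing only on transitions keeps broker traffic low, and the retain flag means a monitoring client that connects later still receives the most recent state.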

This would mean, though, that from the HA side you cannot "queue up" commands while the BLE network is down (at least not through the UI). That is probably more intuitive, but might be bad for automations.

Thoughts?

JohnLindahlTech commented 3 years ago

Availability in my head is very much connected to the BLE ping loop.

The extreme asynchronicity of queueing up commands from HA does not make sense to me at all. That should be handled at the source (HA), because in all my cases it would make no sense to give a command in HA (manual or automated) and then not know whether it will be executed now or later. We never know how long the Plejd/BLE outage will be, and turning on a light in 5 hours just because it is the latest queued event is a lot more problematic than the alternative of knowing the status "instantly".
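The "do not queue during an outage" behaviour argued for above could look roughly like this. It is only a sketch: `CommandGate`, its `send` and `is_available` callables, and the command strings are hypothetical and do not reflect how the add-on's write queue is actually implemented.

```python
from typing import Callable, List


class CommandGate:
    """Forwards commands only while the BLE link is reported as available."""

    def __init__(self, send: Callable[[str], None], is_available: Callable[[], bool]) -> None:
        self._send = send
        self._is_available = is_available

    def handle(self, command: str) -> bool:
        """Return True if the command was forwarded, False if it was dropped."""
        if not self._is_available():
            return False  # dropped on purpose: nothing is queued for later
        self._send(command)
        return True


# Usage: the second command arrives during an outage and is never replayed.
forwarded: List[str] = []
link = {"up": True}
gate = CommandGate(forwarded.append, lambda: link["up"])
gate.handle("kitchen light on")
link["up"] = False
gate.handle("hall light on")
link["up"] = True
print(forwarded)  # ['kitchen light on']
```

Returning a boolean lets the caller surface the failure immediately, which leaves any retry decision to the source (HA) rather than to a hidden queue.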

edit: also, to reiterate, it would be preferable for my use case to have a system-wide topic for availability and not just one per device. Or have I misunderstood the availability logic?

Disclaimer: this comment was written on a phone while watching over a playing infant.