kpetremann / mqtt-exporter

Simple generic MQTT Prometheus exporter for IoT working out of the box
https://hub.docker.com/r/kpetrem/mqtt-exporter
MIT License

Once unavailable always unavailable even after becoming available in zigbee2mqtt #48

Closed hfreire closed 1 year ago

hfreire commented 1 year ago

Describe the bug

Devices that zigbee2mqtt has marked as unavailable never become available again in the exporter, even after zigbee2mqtt deems them available.

If the bug is related to parsing, please provide the original MQTT message.

To Reproduce

Steps to reproduce the behavior:

  1. Let a device be declared unavailable in zigbee2mqtt.
  2. Verify that the exporter correctly shows it as unavailable (0).
  3. Have the device declared available again in zigbee2mqtt.
  4. Observe that the exporter does not pick up the change and keeps reporting the device as unavailable (0) indefinitely.

Expected behavior

The exporter should handle the unavailable-to-available transition and correctly report the device as available (1).

Screenshots

zigbee2mqtt_zigbee_availability{topic="zigbee2mqtt_Isabel's room multi sensor"} 0.0
[Screenshot: 2023-03-01 at 12 03 18]

kpetremann commented 1 year ago

Hi @hfreire, thanks for reporting the issue.

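I have some questions:

  • which version of zigbee2mqtt do you use?
  • when you restart the exporter, is there an availability metric? if so, with which value?
  • did you see a message in MQTT when the availability comes back up?

I'll test this today, as I have a sensor which ran out of battery recently.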

kpetremann commented 1 year ago

Also, can you confirm that the legacy availability payload is still disabled in zigbee2mqtt? See here.
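
For context, here is a minimal sketch of how the two availability payload formats could be mapped to the 0/1 value seen in the metric. It assumes the non-legacy payload is a JSON object such as `{"state": "online"}` and the legacy payload is the bare string `online` or `offline`. `availability_to_metric` is a hypothetical helper for illustration, not mqtt-exporter's actual parser.

```python
import json


def availability_to_metric(payload: str) -> float:
    """Map an availability payload to 1.0 (online) or 0.0 (offline).

    Hypothetical helper: accepts both the assumed JSON form
    {"state": "online"} and the assumed legacy bare string "online".
    """
    text = payload.strip()
    try:
        decoded = json.loads(text)
    except json.JSONDecodeError:
        # A bare string such as online is not valid JSON; keep it as text.
        decoded = text

    # JSON form carries the state in a "state" key; legacy form is the state itself.
    state = decoded.get("state") if isinstance(decoded, dict) else decoded

    if state == "online":
        return 1.0
    if state == "offline":
        return 0.0
    raise ValueError(f"unrecognised availability payload: {payload!r}")
```

If the exporter only understood one of the two formats, a switch between legacy and non-legacy mode could plausibly leave the metric stuck, which is why the setting is worth confirming.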

kpetremann commented 1 year ago

I was not able to reproduce the issue with zigbee2mqtt 1.30.1.

before replacing the battery: sensor_zigbee_availability{sensor="zigbee2mqtt_chambre"} 0.0

immediately after replacing the battery: sensor_zigbee_availability{sensor="zigbee2mqtt_chambre"} 1.0

I wonder what the differences are.

hfreire commented 1 year ago

> Hi @hfreire, thanks for reporting the issue.
>
> I have some questions:
>
>   • which version of zigbee2mqtt do you use?

1.29.2

>   • when you restart the exporter, is there an availability metric? if so, with which value?

Yes, after a restart it seems to pick up the correct value:

zigbee2mqtt_zigbee_availability{topic="zigbee2mqtt_Isabel's room multi sensor"} 1.0

>   • did you see a message in MQTT when the availability comes back up?

I did not. Besides that, I can confirm that the retained message is there:

[Screenshots: 2023-03-01 at 13 01 09, 2023-03-01 at 13 01 34]

Legacy is disabled in zigbee2mqtt, as you can verify from the payload of the retained message.

> I'll test this today, as I have a sensor which ran out of battery recently.

kpetremann commented 1 year ago

interesting.

As restarting the exporter fixes the value, I think it might have missed the "back online" message. Do you have a high number of messages per second in MQTT?

Could you please share the logs in DEBUG mode while reproducing the issue?

kpetremann commented 1 year ago

hello @hfreire,

do you have any update?

hfreire commented 1 year ago

> interesting.
>
> As restarting the exporter fixes the value, I think it might have missed the "back online" message. Do you have a high number of messages per second in MQTT?
>
> Could you please share the logs in DEBUG mode while reproducing the issue?

I don't believe I have a high number of messages per second. I can also say that the number of dropped messages is normally zero; the only spike I see is when I restart home-assistant.

Also, I've noticed another situation, unrelated to the original issue: if I delete an unavailable zigbee2mqtt device, the exporter will continue to advertise it as unavailable.

kpetremann commented 1 year ago

> I don't believe I have a high number of messages per second, I can also say that number of dropped messages is normally zero, the only spike I see is when I restart home-assistant.

Acked. I saw that after my last message, when I had another look at the screenshots you provided.

Have you been able to reproduce the issue? I was not able to reproduce it and I never encountered it. So debug logs would be very helpful.

> Also, I've noticed another situation, not related with original issue: if I delete an unavailable zigbee2mqtt device, the exporter will continue to advertise it as unavailable.

mqtt-exporter is generic and stateless. It cannot know that a device has been removed. It simply exposes the metrics it sees in the MQTT queue. Even if there were a deletion message in MQTT, the Python Prometheus library does not allow removing metrics, at least not without doing something very hacky. For this use case, it is not perfect, but a simple restart of mqtt-exporter will do the trick.
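
The behaviour discussed in this thread can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not the exporter's real code: `LastValueStore`, `start_exporter`, and the topic name are hypothetical, and the store simply keeps the last value seen per topic with no way to forget one.

```python
class LastValueStore:
    """Toy stand-in for an exporter's in-memory metrics (hypothetical)."""

    def __init__(self):
        self._values = {}

    def on_message(self, topic, value):
        # Last value wins; there is deliberately no removal path.
        self._values[topic] = value

    def scrape(self):
        # Return a copy of what a scrape would expose right now.
        return dict(self._values)


def start_exporter(retained_messages):
    """Simulate a fresh start: the broker replays its retained messages."""
    store = LastValueStore()
    for topic, value in retained_messages.items():
        store.on_message(topic, value)
    return store


# Hypothetical topic name, for illustration only.
TOPIC = "zigbee2mqtt/sensor/availability"

# The device is offline when the exporter starts.
running = start_exporter({TOPIC: 0.0})
# The device comes back, but this process never receives the update,
# so a scrape keeps returning the stale value.
stale = running.scrape()

# After a restart the broker replays the retained "online" message.
fresh = start_exporter({TOPIC: 1.0}).scrape()

# If the device was deleted and nothing is retained, a restart drops it.
after_removal = start_exporter({}).scrape()
```

In this model a missed message and a deleted device look the same to the running process: the old value stays until the process starts again from whatever the broker still retains.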

kpetremann commented 1 year ago

I am closing this issue. Do not hesitate to re-open it if the issue happens again.