berezhinskiy / ecoflow_exporter

Prometheus exporter for EcoFlow portable power stations
GNU General Public License v3.0

Make idle_reconnect More Resilient #45

Closed aauren closed 9 months ago

aauren commented 10 months ago

First off, thanks so much for creating this exporter for Prometheus. I was having a really hard time understanding the performance of my EcoFlow before I found this project.

Over time, I found that I would lose metrics from the exporter. When I looked into it more, I found that most of the problems happened in the idle_reconnect() method. If anything inside this method causes the idle_timer thread to raise an error or lock up, the exporter no longer keeps the MQTT client alive and no new metrics are exported.

At first, this happened because of an SSL TimeoutError raised in the MQTT client.connect() function. Easy enough: I added some exception handling around that call and it worked again.
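A minimal sketch of that kind of exception handling, not the code from this PR. The helper name safe_connect is hypothetical, and client is assumed to be any object exposing a paho-mqtt style connect(host, port, keepalive) method. In Python, TimeoutError and ssl.SSLError are both subclasses of OSError, so a single except clause covers them.

```python
import logging

log = logging.getLogger(__name__)


def safe_connect(client, host, port, keepalive=60):
    """Attempt one connection; report failure instead of raising.

    Hypothetical helper: `client` only needs a connect(host, port, keepalive)
    method, as paho-mqtt clients provide.
    """
    try:
        client.connect(host, port, keepalive)
        return True
    except OSError as exc:
        # OSError also covers TimeoutError and ssl.SSLError, so a failed TLS
        # handshake or a socket timeout ends up here rather than killing the
        # thread that called us.
        log.error("MQTT connect failed, will retry later: %s", exc)
        return False
```

The caller can then keep its timer loop alive and try again on the next tick whenever safe_connect() returns False. This only helps when connect() actually raises; it does nothing for a call that never returns.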

However, after letting it run for long enough, I noticed a somewhat frequent case where something inside client.connect() deadlocks and never returns. After playing around with it for a bit, I decided to fork the connection handling inside idle_reconnect() into a separate process. This handles both cases and ensures that, no matter what, the idle_timer thread never stops its reconnection attempts.
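The general technique can be sketched as follows: run the risky call in a child process, wait for it with a timeout, and kill it if it hangs. This illustrates the idea rather than reproducing the code in this PR. The name run_with_timeout is hypothetical, and the sketch assumes a POSIX system where the "fork" start method is available. How the resulting connection state is handed back to the parent is outside the scope of the sketch.

```python
import multiprocessing

# Assumption: POSIX only. The "fork" start method is not available on Windows.
_ctx = multiprocessing.get_context("fork")


def run_with_timeout(target, timeout_seconds):
    """Run target() in a child process.

    Returns True only if the child exits cleanly within timeout_seconds.
    A child that raises exits with a non-zero code; a child that hangs is
    terminated. Both cases return False, so the caller can retry.
    """
    proc = _ctx.Process(target=target)
    proc.start()
    proc.join(timeout_seconds)
    if proc.is_alive():
        # The call never returned: kill the child and reap it.
        proc.terminate()
        proc.join()
        return False
    return proc.exitcode == 0
```

A timer thread could call run_with_timeout() in a loop until it returns True. Since the thread itself never runs the blocking call, neither an exception nor a deadlock in the child can stop the loop.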

I also found a bug where idle_reconnect() never activates if the EcoFlow hasn't been online since the exporter started: with no messages received, self.last_message_time was never initialized. I have fixed that as well in this update.
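One way to avoid that class of bug is to start the idle clock when the object is created rather than when the first message arrives. The class below is a hypothetical illustration, not the exporter's actual code. IdleTracker, on_message, is_idle and the injectable now parameter are all assumed names.

```python
import time


class IdleTracker:
    """Tracks how long it has been since the last message (hypothetical sketch)."""

    def __init__(self, timeout_seconds, now=time.time):
        self.timeout_seconds = timeout_seconds
        self._now = now
        # Initialize at construction so the idle check also fires when no
        # message has ever been received.
        self.last_message_time = self._now()

    def on_message(self):
        # Call this from the message callback to reset the idle clock.
        self.last_message_time = self._now()

    def is_idle(self):
        return self._now() - self.last_message_time > self.timeout_seconds
```

With this shape, a device that is offline when the exporter starts still trips is_idle() after timeout_seconds, so a reconnect attempt is triggered.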

I've been running the exporter this way for several weeks without any problems.

tarik02 commented 9 months ago

Just tested it, and it works well. Nice work, thank you!