inuits / mqttgateway

Prometheus meets the Internet of Things
Other
76 stars 20 forks source link

does not reconnect to mqtt on connection closed #5

Open rtoma opened 4 years ago

rtoma commented 4 years ago

First off, I like your work. I am using it to integrate node-red flows with prometheus.

I noticed metrics in prometheus go stale when the mqtt server gets restarted. Even with --log.level=debug enabled, mqttgateway does not seem to notice its mqtt connection got closed so it does not reconnect and its cache of metrics do not get updated anymore, while prometheus keeps scraping. My golang skills are pretty dusty/absent, so I hope someone can look into this? Thanks.

roidelapluie commented 4 years ago

Thanks for you report! I'll look into it.

roidelapluie commented 4 years ago

Alternatively there seems to be a more popular option for mqtt here: https://github.com/hikhvar/mqtt2prometheus

rtoma commented 4 years ago

I looked at hikhvar/mqtt2prometheus, but I like yours better because it requires (nearly) zero configuration. Obviously it depends on the use case.

roidelapluie commented 4 years ago

Ok, I will look into your issue.

roidelapluie commented 4 years ago

I could not reproduce but I have refreshed the exporter with an update of the libraries + mqtt logging. Can you have a look?

roidelapluie commented 4 years ago

FYI there is a v0.0.1 release with binaries and docker images.

2m commented 3 years ago

I think I also faced this issue where the container was still running, but no new messages were being consumed. Here are the last logs:

F time="2021-03-14T09:25:14Z" level=warning msg="[net] logic stopped" source="logger.go:41"
F time="2021-03-14T09:25:14Z" level=error msg="[net] incoming stopped with errorread tcp 100.64.13.142:40008->34.251.215.118:19689: read: connection reset by peer" source="logger.go:51"
F time="2021-03-14T09:25:14Z" level=error msg="[net] error triggered, stopping" source="logger.go:51"

Not much interesting information here. :)

I was using the latest mqttgateway image.

2m commented 3 years ago

Just got the same issue, but this time with debug logging turned on. Seems like the reconnection is successful but the subscriptions are not recreated after reconnection?

Here are the logs (note that latest log is at the top):

F time="2021-03-22T04:37:44Z" level=debug msg="[net] received pingresp" source="logger.go:31"
F time="2021-03-22T04:37:44Z" level=debug msg="[net] logic waiting for msg on ibound" source="logger.go:31"
F time="2021-03-22T04:37:44Z" level=debug msg="[net] logic got msg on ibound" source="logger.go:31"
F time="2021-03-22T04:37:44Z" level=debug msg="[net] Received Message" source="logger.go:31"
F time="2021-03-22T04:37:44Z" level=debug msg="[pinger] keepalive sending ping" source="logger.go:31"
F time="2021-03-22T04:37:44Z" level=debug msg="[pinger] ping check30.00331268" source="logger.go:31"
F time="2021-03-22T04:37:39Z" level=debug msg="[pinger] ping check25.003170847" source="logger.go:31"
F time="2021-03-22T04:37:34Z" level=debug msg="[pinger] ping check20.003400522" source="logger.go:31"
F time="2021-03-22T04:37:29Z" level=debug msg="[pinger] ping check15.003060249" source="logger.go:31"
F time="2021-03-22T04:37:24Z" level=debug msg="[pinger] ping check10.00320704" source="logger.go:31"
F time="2021-03-22T04:37:19Z" level=debug msg="[pinger] ping check5.003530949" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] outgoing waiting for an outbound message" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] outgoing started" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] logic waiting for msg on ibound" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] logic started" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[pinger] keepalive starting" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] incoming started" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[client] client is reconnected" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] received connack" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[client] socket connected to broker" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] connect started" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[client] Using MQTT 3.1.1 protocol" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=error msg="[net] incoming stopped with errorread tcp 100.64.11.135:60626->34.251.215.118:19689: read: connection reset by peer" source="logger.go:51"
F time="2021-03-22T04:37:14Z" level=error msg="[net] error triggered, stopping" source="logger.go:51"
F time="2021-03-22T04:37:14Z" level=debug msg="[net] outgoing stopped" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=warning msg="[net] logic stopped" source="logger.go:41"
F time="2021-03-22T04:37:14Z" level=debug msg="Connection lost:read tcp 100.64.11.135:60626->34.251.215.118:19689: read: connection reset by peer" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[client] about to write new connect msg" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[pinger] keepalive stopped" source="logger.go:31"
F time="2021-03-22T04:37:14Z" level=debug msg="[client] enter reconnect" source="logger.go:31"
F time="2021-03-22T04:37:12Z" level=debug msg="[pinger] ping check24.999003351" source="logger.go:31"
F time="2021-03-22T04:37:07Z" level=debug msg="[pinger] ping check19.99908629" source="logger.go:31"
F time="2021-03-22T04:37:02Z" level=debug msg="[pinger] ping check14.998967307000001" source="logger.go:31"
roidelapluie commented 3 years ago

Thanks for these logs! I will investigate.