jamiebegin / metrics2mqtt

Publish cross-platform system performance metrics to an MQTT broker.
MIT License

Reconnection logic needs improvement #7

Open bachya opened 4 years ago

bachya commented 4 years ago

I daemonize metrics2mqtt via the suggested method (using supervisor). I'm finding that when I restart my MQTT broker, metrics2mqtt errors out quite rapidly – so rapidly, in fact, that at some point, supervisor gives up. Example:

2020-08-12 21:52:20,388 - metrics2mqtt - ERROR - Error while trying to connect to MQTT broker.
2020-08-12 21:52:20,389 - metrics2mqtt - ERROR - [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/bin/metrics2mqtt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 201, in main
    stats.connect()
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 40, in connect
    self.client.connect(self.broker_host)
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 937, in connect
    return self.reconnect()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 1071, in reconnect
    sock = self._create_socket_connection()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 3522, in _create_socket_connection
    return socket.create_connection(addr, source_address=source, timeout=self._keepalive)
  File "/usr/local/lib/python3.8/socket.py", line 808, in create_connection
    raise err
  File "/usr/local/lib/python3.8/socket.py", line 796, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
2020-08-12 21:52:20,432 INFO exited: metrics2mqtt (exit status 1; not expected)
2020-08-12 21:52:23,438 INFO spawned: 'metrics2mqtt' with pid 36
2020-08-12 21:52:23,863 - metrics2mqtt - ERROR - Error while trying to connect to MQTT broker.
2020-08-12 21:52:23,863 - metrics2mqtt - ERROR - [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/bin/metrics2mqtt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 201, in main
    stats.connect()
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 40, in connect
    self.client.connect(self.broker_host)
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 937, in connect
    return self.reconnect()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 1071, in reconnect
    sock = self._create_socket_connection()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 3522, in _create_socket_connection
    return socket.create_connection(addr, source_address=source, timeout=self._keepalive)
  File "/usr/local/lib/python3.8/socket.py", line 808, in create_connection
    raise err
  File "/usr/local/lib/python3.8/socket.py", line 796, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
2020-08-12 21:52:23,921 INFO exited: metrics2mqtt (exit status 1; not expected)
2020-08-12 21:52:24,923 INFO gave up: metrics2mqtt entered FATAL state, too many start retries too quickly

The only way to fix this is to restart supervisor.

After examining the relevant section of the code, I think the problem is that you raise an exception after logging an error message; too many exceptions too quickly will choke supervisor. I don't see that exception being caught anywhere?
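
One way to avoid the crash loop described above is to catch the connection error and retry with a growing delay, so the process stays alive while the broker is down instead of exiting and being respawned. The following is a minimal sketch under stated assumptions, not the project's actual code: `connect_with_backoff` and its parameters are hypothetical names, and `connect` stands for any zero-argument callable that raises `OSError` (of which `ConnectionRefusedError` is a subclass) on failure.

```python
import logging
import random
import time

logger = logging.getLogger("metrics2mqtt")


def connect_with_backoff(connect, base_delay=1.0, max_delay=60.0,
                         max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, waiting longer after each failure.

    Returns the number of attempts made. The last OSError is re-raised only
    when max_attempts is set and has been exhausted.
    """
    delay = base_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            connect()
            return attempt
        except OSError as err:  # includes ConnectionRefusedError (Errno 111)
            if max_attempts is not None and attempt >= max_attempts:
                raise
            # Add up to 10% jitter so several clients do not retry in lockstep.
            wait = delay + random.uniform(0, delay * 0.1)
            logger.warning("Broker unavailable (%s); retrying in %.1fs", err, wait)
            sleep(wait)
            # Double the delay for the next failure, capped at max_delay.
            delay = min(max_delay, delay * 2)
```

Applied to the call shown in the traceback, this might look like `connect_with_backoff(lambda: self.client.connect(self.broker_host))`. With `max_attempts` left at `None` the process never exits on a refused connection, so supervisor should not see the rapid exits that push it into the FATAL state. This sketch only covers the initial connection; reconnecting after a mid-run disconnect is a separate concern.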

bachya commented 4 years ago

FYI, ran into this again today. Any thoughts?

lipoja commented 3 years ago

I have the same issue when trying to run it manually. However, in my case it is failing correctly, because my MQTT server is not listening on the default port and you cannot set the port. Are you sure you can access the MQTT server?
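
For illustration, a configurable port could be exposed roughly as follows. This is a sketch only: the `--broker-host` and `--broker-port` flags and the `parse_broker_args` helper are hypothetical and are not part of metrics2mqtt's actual command line. The default of 1883 is the standard unencrypted MQTT port.

```python
import argparse

DEFAULT_MQTT_PORT = 1883  # standard unencrypted MQTT port


def parse_broker_args(argv=None):
    """Parse hypothetical broker options; argv=None reads from sys.argv."""
    parser = argparse.ArgumentParser(description="Hypothetical broker options")
    parser.add_argument("--broker-host", default="localhost",
                        help="hostname or IP address of the MQTT broker")
    parser.add_argument("--broker-port", type=int, default=DEFAULT_MQTT_PORT,
                        help="TCP port the MQTT broker listens on")
    return parser.parse_args(argv)
```

The parsed values could then be handed to paho's `Client.connect`, which accepts the port as its second argument, for example `client.connect(args.broker_host, args.broker_port)`.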

bachya commented 3 years ago

@lipoja Definitely. And even after the MQTT server has been up and accessible for a while, this library never recovers.

lipoja commented 3 years ago

@bachya Do you have a patch for this issue? I wanted to give this a try, since I am currently running everything through MQTT at home. Next time I should also check all the forks first: I started refactoring the code and then saw that you've already made some changes as well.

Are you using it? Would you recommend it, or should I look for something else?

bachya commented 3 years ago

@lipoja I don't personally have a patch for this; given this issue's stagnancy, I've gone back to using Glances.

lipoja commented 3 years ago

@bachya Oh, I'll have to check out Glances. It looks pretty good. Yes, you are right, it seems that this project has stalled. Thank you for the suggestion. Have a nice day :]

seaniedan commented 3 years ago

The main reason I wanted to use this repo was that it seemed lightweight and used MQTT, but it seems Glances now also has an option to use MQTT. If anyone can help with setting that up in Home Assistant, preferably with auto 'discovery', I'd appreciate the tip!