FergusInLondon / Tapo-P110-Prometheus-Exporter

Prometheus Exposition for TP-Link TAPO P110 devices.

Does not resume after timeout/disconnect #3

Open PovilasID opened 2 years ago

PovilasID commented 2 years ago

Hey,

So I noticed that after I disconnect the P110 and reconnect it somewhere else, the exporter does not resume sending data but gets stuck in an exception loop:

2022-09-28 18:40:43.736 | INFO     | collector:collect:117 - recieving prometheus metrics scrape: collecting observations
2022-09-28 18:40:43.736 | INFO     | collector:collect:123 - performing observations for device
2022-09-28 18:40:43.737 | DEBUG    | collector:get_device_data:111 - retrieving energy usage statistics for device
2022-09-28 18:40:45.739 | DEBUG    | collector:time_observation:81 - observation completed
2022-09-28 18:40:45.739 | ERROR    | collector:collect:137 - encountered exception during observation!
Traceback (most recent call last):

  File "urllib3/connection.py", line 174, in _new_conn

  File "urllib3/util/connection.py", line 95, in create_connection

  File "urllib3/util/connection.py", line 85, in create_connection

TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "urllib3/connectionpool.py", line 703, in urlopen

  File "urllib3/connectionpool.py", line 398, in _make_request

  File "urllib3/connection.py", line 239, in request

  File "http/client.py", line 1282, in request

  File "http/client.py", line 1328, in _send_request

  File "http/client.py", line 1277, in endheaders

  File "http/client.py", line 1037, in _send_output

  File "http/client.py", line 975, in send

  File "urllib3/connection.py", line 205, in connect

  File "urllib3/connection.py", line 179, in _new_conn

urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0xffffb741c910>, 'Connection to 192.168.1.111 timed out. (connect timeout=2)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "requests/adapters.py", line 489, in send

  File "urllib3/connectionpool.py", line 787, in urlopen

  File "urllib3/util/retry.py", line 592, in increment

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.1.111', port=80): Max retries exceeded with url: /app?token=03C932481BA0B24D722CB74D7E18374F (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xffffb741c910>, 'Connection to 192.168.1.111 timed out. (connect timeout=2)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "threading.py", line 973, in _bootstrap

  File "threading.py", line 1016, in _bootstrap_inner

  File "threading.py", line 953, in run

  File "socketserver.py", line 683, in process_request_thread

  File "socketserver.py", line 360, in finish_request

  File "socketserver.py", line 747, in __init__

  File "wsgiref/simple_server.py", line 134, in handle

  File "wsgiref/handlers.py", line 137, in run

  File "prometheus_client/exposition.py", line 128, in prometheus_app

  File "prometheus_client/exposition.py", line 104, in _bake_output

  File "prometheus_client/openmetrics/exposition.py", line 21, in generate_latest

  File "prometheus_client/registry.py", line 97, in collect

> File "collector.py", line 128, in collect
    data = self.get_device_data(device, ip_addr, room)['result']
           │    │               │       │        └ 'home'
           │    │               │       └ '192.168.1.111'
           │    │               └ <PyP100.PyP110.P110 object at 0xffffb737d5a0>
           │    └ <function Collector.get_device_data at 0xffffb7389750>
           └ <collector.Collector object at 0xffffb737d240>

  File "collector.py", line 114, in get_device_data
    return device.getEnergyUsage()
           │      └ <function P110.getEnergyUsage at 0xffffb7388dc0>
           └ <PyP100.PyP110.P110 object at 0xffffb737d5a0>

  File "PyP100/PyP110.py", line 31, in getEnergyUsage

  File "requests/sessions.py", line 635, in post

  File "requests/sessions.py", line 587, in request

  File "requests/sessions.py", line 701, in send

  File "requests/adapters.py", line 553, in send

requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='192.168.1.111', port=80): Max retries exceeded with url: /app?token=03C932481BA0B24D722CB74D7E18374F (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xffffb741c910>, 'Connection to 192.168.1.111 timed out. (connect timeout=2)'))

I think that after a timeout it should run the login routine again to re-establish the session. Restarting the service works.
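The re-login idea could look roughly like the sketch below. This is not the exporter's actual code: `ReloginOnFailure` and `FakePlug` are hypothetical names, and the sketch assumes the PyP100 client authenticates through `handshake()` followed by `login()`. Only `getEnergyUsage()` is confirmed by the traceback. The fake plug and its `current_power` value exist purely so the example runs without a real device.

```python
class ReloginOnFailure:
    """Hypothetical wrapper: re-authenticates once when a device call fails."""

    def __init__(self, device):
        self.device = device

    def _login(self):
        # Assumption: the client logs in via handshake() then login().
        self.device.handshake()
        self.device.login()

    def get_energy_usage(self):
        try:
            return self.device.getEnergyUsage()
        except OSError:
            # requests' exceptions derive from OSError, so this also covers
            # the requests.exceptions.ConnectTimeout seen in the traceback.
            self._login()
            # Single retry; a second failure propagates to the caller.
            return self.device.getEnergyUsage()


class FakePlug:
    """Stand-in for a P110 whose session is no longer valid (illustration only)."""

    def __init__(self):
        self.logged_in = False
        self.logins = 0

    def handshake(self):
        pass

    def login(self):
        self.logged_in = True
        self.logins += 1

    def getEnergyUsage(self):
        if not self.logged_in:
            raise TimeoutError("timed out")
        return {"result": {"current_power": 1234}}  # made-up reading


plug = FakePlug()
reading = ReloginOnFailure(plug).get_energy_usage()
print(reading["result"]["current_power"], plug.logins)
```

Retrying only once keeps a scrape from blocking for long while the plug is still unplugged; the next Prometheus scrape gets another chance.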

AlexXZero commented 10 months ago

Hi @PovilasID, I noticed that you forked your repo from the same source as I did. I ran into a similar issue and fixed it in my repo. You are welcome to backport any changes you find useful: https://github.com/AlexXZero/Tapo-P110-Prometheus-Exporter/tree/master Feel free to ask me anything else; for example, I can share my Grafana configuration or my Docker Compose configuration. I didn't add them to the repo because I use several types of smart plugs (Kasa HS110 and Tapo P110) in the same Grafana dashboard.

PovilasID commented 10 months ago

@AlexXZero Cool! I have received a pull request that probably fixes it: https://github.com/PovilasID/P110-Exporter/pull/2 I'm not sure, though; I will need to test it :) In general, I think this exporter needs to migrate to a different Python library for Tapo devices, because this one is not well supported...

AlexXZero commented 10 months ago

@PovilasID I saw your fix, but I'm not sure it is the best solution, since it adds reconnection only during initialisation. I found that it is better to allow reconnecting to the device at runtime: for example, I sometimes need to move the smart plug from one room to another, so it loses its connection and needs to re-establish it. So I would suggest not retrying during initialisation (just try to connect once), and then, in the data getter, checking whether the connection is established and trying to reconnect if it is not. (Here is an example of how I did it in my fork: https://github.com/AlexXZero/Tapo-P110-Prometheus-Exporter/commit/1ad6f6c90375151e3c3c5b62c7963bad58ee52e5)
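A minimal sketch of that runtime-reconnect pattern follows. It is not the code from the linked commit: `LazyDevice`, `FakeClient` and `connect` are hypothetical names, and `connect` stands for whatever callable creates and logs in a client. The wrapper tries to connect once at start-up, and the getter reconnects on demand, returning `None` when the plug is unreachable so the collector can skip that scrape.

```python
class LazyDevice:
    """Hypothetical wrapper: connect once at start-up, reconnect inside the getter."""

    def __init__(self, connect):
        self._connect = connect  # callable returning a logged-in client
        self._client = None
        self._try_connect()      # single attempt, no retry loop

    def _try_connect(self):
        try:
            self._client = self._connect()
        except OSError:
            self._client = None
        return self._client is not None

    def read(self):
        # Not connected: try to reconnect, give up for this scrape on failure.
        if self._client is None and not self._try_connect():
            return None
        try:
            return self._client.getEnergyUsage()
        except OSError:
            self._client = None  # force a reconnect on the next scrape
            return None


class FakeClient:
    """Stand-in for a P110 client (illustration only)."""

    def __init__(self, state):
        self.state = state

    def getEnergyUsage(self):
        if not self.state["online"]:
            raise TimeoutError("timed out")
        return {"result": {"current_power": 1234}}  # made-up reading


state = {"online": True}


def connect():
    if not state["online"]:
        raise TimeoutError("timed out")
    return FakeClient(state)


dev = LazyDevice(connect)
first = dev.read()        # plug reachable
state["online"] = False   # plug unplugged
second = dev.read()       # call fails, client is dropped
third = dev.read()        # reconnect attempt fails
state["online"] = True    # plug plugged in elsewhere
fourth = dev.read()       # reconnects and reads again
print(first, second, third, fourth)
```

Returning `None` rather than raising keeps the exception out of the Prometheus scrape handler; the collector would need to skip emitting metrics for that device when it sees `None`.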

As for migrating to another Python library, I would also like to do that, but right now I don't have enough time to get to grips with a new library's API, so I just found a fixed version of the PyP100 library that supports the new protocol and use that. I would be interested in backporting a migration to another library if someone does it before me.