vitaliy-sk / keenetic-grafana-monitoring

Monitor Keenetic router with Grafana and InfluxDB
Apache License 2.0

Running on Keenetic router - stops after some time #28

Open ilker-aktuna opened 6 days ago

ilker-aktuna commented 6 days ago

I am running the exporter on a Keenetic router, and I see that the process stops after some time.

How can I make the service check itself and restart if it is gone?

ilker-aktuna commented 5 days ago

I'd appreciate any help. Thanks.

ilker-aktuna commented 5 days ago

Also, is there a way to understand why the process stops? How can I troubleshoot it?
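
One way to see why a Python process stops is to have it write any uncaught exception, with its traceback, to a file before it exits. The following is only a minimal sketch of that idea using the standard `logging` module and `sys.excepthook`; the function name `install_crash_logger` and the log path in the usage note are hypothetical and not part of the exporter.

```python
import logging
import sys


def install_crash_logger(log_path):
    """Write every uncaught exception, with its traceback, to log_path.

    Hypothetical helper for illustration; not part of the exporter.
    """
    logger = logging.getLogger("crash_logger")
    logger.setLevel(logging.ERROR)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)

    def log_uncaught(exc_type, exc_value, exc_traceback):
        # Record the full traceback in the log file ...
        logger.error(
            "Uncaught exception",
            exc_info=(exc_type, exc_value, exc_traceback),
        )
        # ... then let Python print it to stderr as it normally would.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)

    sys.excepthook = log_uncaught
    return log_uncaught
```

Calling `install_crash_logger("/opt/var/log/exporter_crash.log")` (an assumed path) near the top of the script would leave the last traceback in that file, so the cause is still available after the process has gone.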

ilker-aktuna commented 4 days ago

I got the following output on the command line. It occurs randomly. My Grafana server is at another location. As I understand it, the script crashes when it cannot reach the server. How can we handle this situation and stop it from crashing?


2024-11-12 11:47:03,805 - keentic_influxdb_exporter.py - INFO - Configuration done. Start collecting with interval: 30 sec
Traceback (most recent call last):
  File "/opt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/connection.py", line 507, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/http/client.py", line 1386, in getresponse
  File "/opt/lib/python3.11/http/client.py", line 325, in begin
  File "/opt/lib/python3.11/http/client.py", line 286, in _read_status
  File "/opt/lib/python3.11/socket.py", line 706, in readinto
TimeoutError: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/home/keenetic-grafana-monitoring/keentic_influxdb_exporter.py", line 126, in <module>
    infuxdb_writer.write_metrics(metrics)
  File "/opt/home/keenetic-grafana-monitoring/influxdb_writter.py", line 18, in write_metrics
    self._write_api.write(bucket=self._configuration['bucket'], org=self._configuration['org'], record=metrics)
  File "/opt/lib/python3.11/site-packages/influxdb_client/client/write_api.py", line 371, in write
    results = list(map(write_payload, payloads.items()))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/client/write_api.py", line 369, in write_payload
    return self._post_write(_async_req, bucket, org, final_string, payload[0])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/client/write_api.py", line 517, in _post_write
    return self._write_service.post_write(org=org, bucket=bucket, body=body, precision=precision,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/service/write_service.py", line 62, in post_write
    (data) = self.post_write_with_http_info(org, bucket, body, **kwargs)  # noqa: E501
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/service/write_service.py", line 166, in post_write_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/api_client.py", line 341, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/api_client.py", line 171, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/api_client.py", line 386, in request
    return self.rest_client.POST(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/rest.py", line 304, in POST
    return self.request("POST", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/influxdb_client/rest.py", line 217, in request
    r = self.pool_manager.request(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/_request_methods.py", line 143, in request
    return self.request_encode_body(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/_request_methods.py", line 278, in request_encode_body
    return self.urlopen(method, url, **extra_kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/poolmanager.py", line 443, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/util/retry.py", line 449, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/util/util.py", line 39, in reraise
    raise value
  File "/opt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 538, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/opt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 369, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='192.168.254.20', port=8086): Read timed out. (read timeout=5.997026729994104)
/opt/home/keenetic-grafana-monitoring #

ilker-aktuna commented 4 days ago

I know you don't like to be rushed, but I've been waiting for 3 days and I bet this is very easy for you to fix. If I knew Python, I'd do it myself, but I don't.

Is there someone else with Python knowledge who could address this problem?

ilker-aktuna commented 2 days ago

Thanks to ChatGPT, here is a fix:

# Inside influxdb_writter.py (the file name as it appears in the traceback)
from urllib3.exceptions import ReadTimeoutError, MaxRetryError, NewConnectionError

# Replaces the existing write_metrics method of the writer class.
def write_metrics(self, metrics):
    try:
        self._write_api.write(bucket=self._configuration['bucket'], org=self._configuration['org'], record=metrics)
    except ReadTimeoutError:
        # The request was sent, but no response arrived within the read timeout.
        print("Error: The connection to InfluxDB timed out.")
    except MaxRetryError:
        print("Error: Max retries exceeded while connecting to InfluxDB.")
    except NewConnectionError:
        print("Error: Unable to establish a connection to InfluxDB.")
    except Exception as e:
        # Catch-all so that a failed write does not propagate to the caller.
        print("An unexpected error occurred:", str(e))

ilker-aktuna commented 2 days ago

Still, I would like to make sure the service restarts when it stops. How can I do that on a Keenetic router? Is there anything like cron?
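
One option that does not depend on a scheduler is a small supervisor script that starts the exporter and starts it again whenever it exits. The sketch below is only an illustration, assuming Python is available on the router (the traceback shows Python 3.11 under `/opt`); the function name `supervise` and its parameters `restart_delay` and `max_restarts` are hypothetical.

```python
import subprocess
import sys
import time


def supervise(command, restart_delay=10, max_restarts=None):
    """Run command and start it again each time it exits.

    Hypothetical helper for illustration. Returns the number of restarts
    performed once max_restarts is reached; with max_restarts=None it
    loops forever.
    """
    restarts = 0
    while True:
        # Blocks until the child process exits, for whatever reason.
        exit_code = subprocess.call(command)
        print(f"Process exited with code {exit_code}", file=sys.stderr)
        if max_restarts is not None and restarts >= max_restarts:
            return restarts
        restarts += 1
        # Short pause so a process that fails immediately is not restarted in a tight loop.
        time.sleep(restart_delay)
```

Run from the exporter's directory, a call such as `supervise([sys.executable, "keentic_influxdb_exporter.py"])` would keep the script from the traceback running. The supervisor itself still has to be started once, for example at boot, and how to do that depends on the router's firmware.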