MiguelNdeCarvalho / speedtest-exporter

Speedtest Exporter made in Python using the official Speedtest CLI binary
https://docs.miguelndecarvalho.pt/projects/speedtest-exporter/
GNU General Public License v3.0

Configuration Error #186

Closed. stelgenhof closed this issue 1 year ago.

stelgenhof commented 1 year ago

My Speedtest Exporter suddenly stopped working and started spawning error messages. Here is a snippet of my Docker container log:

21/06/2023 20:49:42 - Server: 48463 | Jitter: 2.347 ms | Ping: 13.94 ms | Download: 234.19 Mb/s | Upload:9.4 Mb/s
WARNING:waitress.queue:Task queue depth is 1
21/06/2023 20:54:47 - Server: 48463 | Jitter: 3.139 ms | Ping: 16.955 ms | Download: 245.58 Mb/s | Upload:9.72 Mb/s
21/06/2023 20:59:51 - Server: 21569 | Jitter: 3.272 ms | Ping: 16.475 ms | Download: 244.45 Mb/s | Upload:9.26 Mb/s
[2023-06-21 21:04:28.047] [error] Configuration - SSL connect error (UnknownException)
[2023-06-21 21:04:28.047] [error] Configuration - Cannot retrieve configuration document (0)
[2023-06-21 21:04:28.048] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
[2023-06-21 21:04:28.082] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
{
    "type": "log",
    "timestamp": "2023-06-21T21:04:28Z",
    "message": "Configuration - Could not retrieve or read configuration (ConfigurationError)",
    "level": "error"
}
21/06/2023 21:04:28 - Server: 0 | Jitter: 0 ms | Ping: 0 ms | Download: 0.0 Mb/s | Upload:0.0 Mb/s
[2023-06-21 21:09:28.310] [error] Configuration - SSL connect error (UnknownException)
[2023-06-21 21:09:28.310] [error] Configuration - Cannot retrieve configuration document (0)
[2023-06-21 21:09:28.311] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
[2023-06-21 21:09:28.412] [error] ConfigurationError - Could not retrieve or read configuration (Configuration)
{
    "type": "log",
    "timestamp": "2023-06-21T21:09:28Z",
    "message": "Configuration - Could not retrieve or read configuration (ConfigurationError)",
    "level": "error"
}

Everything had been running smoothly until 2023-06-21 21:04:28. After that, it never completed a successful test again. I restarted the container, but to no avail. I am using the default configuration (no SERVER_ID), and I also tried a specific server ID in case the issue was with the selected server.

Any idea what this could be?
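
For reference, the server pinning in my docker-compose.yml looked roughly like this (a sketch only: 48463 is just the server ID from my logs, and the environment variable name should be double-checked against the project docs):

speedtest-exporter:
  image: miguelndecarvalho/speedtest-exporter
  environment:
    - SERVER_ID=48463   # pin a specific Speedtest server; omit to let the CLI auto-select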

Razoul05 commented 1 year ago

I started having the same issue on June 21 as well. My configuration was set to use a specific server_id, and removing that DID NOT change anything. It might be unrelated, but I also discovered that running speedtest-cli against my preferred server reports an error as well.

I am getting the same error in my docker log as stelgenhof.

Pulling the /metrics endpoint directly returns the following:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 275.0
python_gc_objects_collected_total{generation="1"} 322.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 76.0
python_gc_collections_total{generation="1"} 6.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="0",version="3.10.0"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 2.5546752e+07
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.6965632e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.68782551234e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.81
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP speedtest_server_id Speedtest server ID used to test
# TYPE speedtest_server_id gauge
speedtest_server_id 0.0
# HELP speedtest_jitter_latency_milliseconds Speedtest current Jitter in ms
# TYPE speedtest_jitter_latency_milliseconds gauge
speedtest_jitter_latency_milliseconds 0.0
# HELP speedtest_ping_latency_milliseconds Speedtest current Ping in ms
# TYPE speedtest_ping_latency_milliseconds gauge
speedtest_ping_latency_milliseconds 0.0
# HELP speedtest_download_bits_per_second Speedtest current Download Speed in bit/s
# TYPE speedtest_download_bits_per_second gauge
speedtest_download_bits_per_second 0.0
# HELP speedtest_upload_bits_per_second Speedtest current Upload speed in bits/s
# TYPE speedtest_upload_bits_per_second gauge
speedtest_upload_bits_per_second 0.0
# HELP speedtest_up Speedtest status whether the scrape worked
# TYPE speedtest_up gauge
speedtest_up 0.0
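
Note that speedtest_up is 0, so the exporter itself responds but the test never succeeds. For anyone monitoring this, a minimal Prometheus alerting rule along these lines would have flagged the outage (a sketch; the alert name and timings are my own choices):

groups:
  - name: speedtest
    rules:
      - alert: SpeedtestFailing
        expr: speedtest_up == 0   # exporter reachable, but the last test failed
        for: 1h                   # allow a few test intervals before firing
        labels:
          severity: warning
        annotations:
          summary: "speedtest-exporter has not completed a successful test for 1h"
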
houmark commented 1 year ago

I'm using this container through the internet-pi / internet-monitoring project, and I also saw speed tests start failing on exactly June 21; none have completed since. I have restarted the container, the server, etc., without success. I don't have a preferred server set, so it should be running pretty much out of the box.

Prior to this, it had been running without issues for ~2 years.

joshua-nord commented 1 year ago

I was seeing the same issue, also starting on June 21. I noticed that my Helm chart was using an older version of this Docker image. When I updated to the latest released version from this repo (ghcr.io/miguelndecarvalho/speedtest-exporter:v3.5.3), everything started working normally again.
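
For anyone else on the Helm path, the fix was just bumping the image tag in my values, something like this (chart schemas vary, so treat the keys as a sketch):

image:
  repository: ghcr.io/miguelndecarvalho/speedtest-exporter
  tag: v3.5.3   # was an older tag before; updating resolved the failures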

stelgenhof commented 1 year ago

Interesting that the issue started on June 21st. Same for me: I had it running for a long time without any changes to the configuration or image.

Razoul05 commented 1 year ago

Following @joshua-nord's suggestion, I updated my image and it seems to be working now as well.

For those to whom, like me, this is a new process: I followed this guide to build my Pi: https://github.com/geerlingguy/internet-pi. I then ran the following three commands from the folder containing my docker-compose.yml file (~/internet-monitoring):

docker-compose pull             # pulls the latest images
docker-compose up -d --no-deps  # restarts containers with newer images
docker system prune --all       # deletes unused images

I expect that if you have other containers installed, this would attempt to update them too, so use at your own risk.

houmark commented 1 year ago

> Following @joshua-nord's suggestion, I updated my image and it seems to be working now as well. [...]

Nice, thanks for this @Razoul05, very helpful. I ran these commands just now and it looks like my speed tests are running again. For anyone doing this, bear in mind that you most likely have a speed test interval configured (30 minutes in my case), so you may have to wait up to that long after recreating the containers before a new speed test shows up in the UI. You can confirm the image was updated by running docker image history miguelndecarvalho/speedtest-exporter and checking that it is the most recent release (about 9 months old at the time of this writing).
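
Concretely, the verification looks something like this (output will vary with your setup):

docker image history miguelndecarvalho/speedtest-exporter                      # layer dates should match the latest release
docker inspect --format '{{.Created}}' miguelndecarvalho/speedtest-exporter    # build timestamp of the local image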

stelgenhof commented 1 year ago

Given that the error is an SSL connect error, and that it stopped working for everybody on the same day, I highly suspect an expired SSL certificate.
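
One way to check that theory from the host is to inspect the certificate the server presents (assuming the CLI talks to www.speedtest.net; the exact endpoint it uses may differ):

echo | openssl s_client -connect www.speedtest.net:443 -servername www.speedtest.net 2>/dev/null | openssl x509 -noout -dates   # prints notBefore/notAfter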

stelgenhof commented 1 year ago

I can also confirm that pulling the latest image, as suggested by @Razoul05, solved the issue for me too. By the way, the docker system prune --all step is optional; it only deletes unused images and is not required for the fix.

MiguelNdeCarvalho commented 1 year ago

Hey everyone,

I hope you have it fixed by now. I'm really curious which version you were running before updating to the latest one. By the way, I have just released v3.5.4.

Thanks, MiguelNdeCarvalho

houmark commented 1 year ago

Version numbers seem hard to find in Docker and in these dependency setups (for me, at least), but my internet-monitoring setup was installed about two years ago. I'm assuming I was using the version released around that time, and since I never made any changes to the Docker setup, I guess that same version remained in use until I forced the update yesterday.

I may try updating again to get the latest version and will report back here if I do. If it breaks after updating (why would it?), I can pin the version in my docker-compose file to the last known good one.
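
If I do pin it, it would just be a tag on the image line in docker-compose.yml, along these lines (v3.5.4 used as an example of a known-good tag):

speedtest-exporter:
  image: miguelndecarvalho/speedtest-exporter:v3.5.4   # pin a release instead of floating on latest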

Also, @MiguelNdeCarvalho, thanks for this image; it's worked well for me for years, so I appreciate your work on it.

houmark commented 1 year ago

Just following up: I updated the image again to the latest release and can confirm that speed tests continue to work smoothly.

stelgenhof commented 1 year ago

I can also confirm that after pulling the latest image, the tests continue to work. Although we don't know the exact root cause of the error, updating to the latest version of this image resolves the issue.