Closed by zoechi 1 week ago
It took 15 hours from starting the `restic stats` command until the exporter started serving on the configured address and port.
I looked a bit at the code and it looks like a pure Python `subprocess.run` issue, but I have no experience with Python.
Even when it runs for such a long time, it seems to complete successfully in the end; otherwise an exception would be thrown.
Sep 13 22:07:35 smarthome prometheus-restic-exporter-start[1022]: 2024-09-13 22:07:35 INFO Starting Restic Prometheus Exporter
Sep 13 22:07:35 smarthome prometheus-restic-exporter-start[1022]: 2024-09-13 22:07:35 INFO It could take a while if the repository is remote
Sep 14 13:07:27 smarthome prometheus-restic-exporter-start[1022]: 2024-09-14 13:07:27 INFO Serving at http://0.0.0.0:9753
Sep 14 13:07:27 smarthome prometheus-restic-exporter-start[1022]: 2024-09-14 13:07:27 INFO Refreshing stats every 3600 seconds
Perhaps some more log output about what it does could help, in case it does more than I think (multiple calls). From what I observed, it's always the same process and not one call per snapshot or whatever. The hash is always the same when I check the process list.
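To illustrate the kind of extra log output meant here, below is a minimal sketch of a wrapper around `subprocess.run` that logs the command, its duration, and its exit code. This is not the exporter's actual code; the function name `run_logged`, the logger name, and the timeout handling are assumptions made for illustration.

```python
import logging
import subprocess
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("restic-debug")  # hypothetical logger name


def run_logged(cmd, timeout=None):
    """Run `cmd`, logging when it starts, how long it took, and its exit code."""
    log.info("Running: %s", " ".join(cmd))
    started = time.monotonic()
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Make a stuck command visible instead of waiting silently for hours.
        log.error("Timed out after %.1fs: %s", time.monotonic() - started, " ".join(cmd))
        raise
    elapsed = time.monotonic() - started
    log.info("Finished in %.1fs with exit code %d", elapsed, result.returncode)
    return result, elapsed
```

Something like `run_logged(["restic", "stats", "--json"], timeout=600)` would then show in the journal whether a single call is slow or whether several calls are made, and the timeout would turn a silent hang into an explicit error.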
I now think that this is a systemd issue, since systemd limits resources for services. I'm going to close this but will update it when I figure out how to fix it.
I found https://github.com/restic/restic/issues/4591
I ensured the `RESTIC_CACHE_DIR` env var is set to the same directory as when restic backups are executed. When I run the command manually, the cache directory's changed timestamp is updated; when it's run from the exporter in the systemd service, it is not updated.
So it might be the missing cache dir, but it's not clear yet why.
The cause was that the cache directory was read only.
This was caused by NixOS setting `ProtectSystem=strict`, which is set for all Prometheus exporters' systemd services.
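For anyone hitting the same symptom, a quick way to confirm it is to test whether the cache directory is writable from inside the service's environment. The following is a minimal sketch, not part of the exporter; the helper name `cache_dir_is_writable` is hypothetical, and it assumes `RESTIC_CACHE_DIR` is set as described above.

```python
import os
import tempfile


def cache_dir_is_writable(path):
    """Return True if a file can actually be created inside `path`."""
    try:
        # Creating a real temporary file also catches read-only mounts,
        # not only missing permission bits.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    cache_dir = os.environ.get("RESTIC_CACHE_DIR")
    if cache_dir is None:
        print("RESTIC_CACHE_DIR is not set")
    else:
        print(f"{cache_dir} writable: {cache_dir_is_writable(cache_dir)}")
```

The check has to run under the same systemd unit (or with the same sandboxing options) as the exporter, because a directory that is writable in an interactive shell can still be read-only inside the service.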
I back up to Hetzner using the `sftp.command` (#31). So for restic-exporter to work, I mount the storage to a directory.
shows this with no further output
shows
When I execute the command manually, after about 15 seconds I get
Any idea why the command might be stuck when run from the exporter and why the exporter never starts listening on the configured port?
Any suggestions about how to debug?