In this PR I've introduced a built-in Prometheus exporter and two exported metrics that will help us with alerting:
aws_cloud_unmap_up: alert if the app is not running
aws_cloud_unmap_last_reconcile_success_timestamp_seconds: alert if the app is running, but reconciliation is failing (for any reason)
A note about tests: the Prometheus client for Python provides start_http_server to start the HTTP server, but no way to stop it. So in tests I'm not enabling the exporter via --prometheus-enabled, because a second execution of main() would fail to bind the Prometheus exporter port (already bound by the first execution of main()). Instead, I'm always setting the internal Prometheus metrics, even when Prometheus is not enabled, so that I can assert on them. All in all, the --prometheus-enabled flag just turns on the HTTP server of the built-in exporter.
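A minimal sketch of the pattern described above, not the actual code in this PR. The metric names and the --prometheus-enabled flag come from the description. The --prometheus-port flag, its default, the help strings, the dedicated registry, and reconcile_once() are assumptions made for illustration.

```python
import argparse

from prometheus_client import CollectorRegistry, Gauge, start_http_server

# Assumption: a dedicated registry keeps this sketch isolated from the
# library's global default registry.
registry = CollectorRegistry()

# Metric names are from the PR description; help strings are illustrative.
up = Gauge("aws_cloud_unmap_up", "1 while the app is running", registry=registry)
last_success = Gauge(
    "aws_cloud_unmap_last_reconcile_success_timestamp_seconds",
    "Unix timestamp of the last successful reconcile",
    registry=registry,
)


def reconcile_once():
    """Hypothetical placeholder for the real reconcile logic."""


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--prometheus-enabled", action="store_true")
    # Hypothetical flag: the PR description does not name a port option.
    parser.add_argument("--prometheus-port", type=int, default=8000)
    args = parser.parse_args(argv)

    # Only the HTTP server depends on the flag.
    if args.prometheus_enabled:
        start_http_server(args.prometheus_port, registry=registry)

    # The metrics are always updated, so tests can assert on them
    # without ever binding a port.
    up.set(1)
    reconcile_once()
    last_success.set_to_current_time()
```

With this shape, a test can call main([]) as many times as it needs and read the values back with registry.get_sample_value("aws_cloud_unmap_up"), since no port is bound unless the flag is passed.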