teralytics / prometheus-ecs-discovery

A Prometheus discoverer that scrapes Amazon ECS and generates a file SD configuration file.
Apache License 2.0
260 stars, 157 forks

Service exits when it fails to call the ECS service #73

Open Miquido-Devops opened 3 years ago

Miquido-Devops commented 3 years ago

I've encountered a problem:

2021/07/12 06:43:18 RequestError: send request failed
caused by: Post https://ecs.eu-west-1.amazonaws.com/: dial tcp 52.95.116.181:443: connect: connection refused

This caused the service to exit. I believe the RequestError could be caught and handled so that the service does not exit when AWS ECS is temporarily unavailable.

Rudd-O commented 2 years ago

This should be doable with perpetual retries, but the timestamp of the last successful connection should be exported as a metric so that users can query it and alert when the discovery data is stale.