SuperQ / smokeping_prober

Prometheus style smokeping
Apache License 2.0
554 stars, 73 forks

Dies if name resolution fails - Request (Tied to issue 68) #133

Open meyerder opened 8 months ago

meyerder commented 8 months ago

I read the issue "Dies if name resolution fails", where the expected behavior is to fail fast.

I understand the reasoning behind failing fast and not resolving the names again on a regular basis. The only problem is that if the address is cached (which appears to happen on initial load) and you have a redundant fully qualified domain name, any change or movement will never be tested.

For instance, www.domain.com sits behind a load balancer with two VIPs:

VIP East = 10.10.10.10 (Primary)
VIP West = 11.11.11.11 (Secondary)

When you start the test, www.domain.com resolves to 10.10.10.10. An hour later East has an issue and it fails over to West. Without re-resolving www.domain.com on a somewhat regular basis, you will never know that all of your end-user traffic is actually going to 11.11.11.11.

I know the solution is to use IP addresses only and test BOTH, but this could be a reason to implement https://github.com/SuperQ/smokeping_prober/issues/68

Ideally the metrics would show an error or something similar, versus the program exiting with an error code.
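
To make the request concrete, here is a minimal sketch of periodic re-resolution. It is not taken from the smokeping_prober code: `target`, `refresh`, `staticLookup`, `errNoSuchHost`, and the 5-second timeout are hypothetical names and values chosen for illustration. The fake resolver stands in for real DNS so the East-to-West failover can be replayed deterministically. In real use, `net.DefaultResolver.LookupHost` has the same signature as `lookupFunc` and could be passed instead.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"strings"
	"time"
)

// lookupFunc has the same signature as (*net.Resolver).LookupHost,
// so the real resolver or a fake one can be plugged in.
type lookupFunc func(ctx context.Context, host string) ([]string, error)

// errNoSuchHost is a stand-in error for this sketch only.
var errNoSuchHost = errors.New("no such host")

// target (hypothetical) remembers the result of the last lookup for one host.
type target struct {
	host   string
	addrs  string // sorted, comma-joined addresses from the last successful lookup
	failed bool   // true if the most recent lookup failed
}

// refresh re-resolves the host and reports whether the address set changed.
// A lookup failure is recorded on the target instead of ending the program,
// and the last known addresses are kept.
func (t *target) refresh(lookup lookupFunc) bool {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	addrs, err := lookup(ctx, t.host)
	if err != nil || len(addrs) == 0 {
		t.failed = true
		return false
	}
	t.failed = false

	sorted := append([]string(nil), addrs...)
	sort.Strings(sorted) // order-insensitive comparison
	joined := strings.Join(sorted, ",")
	changed := joined != t.addrs
	t.addrs = joined
	return changed
}

// staticLookup builds a fake resolver that always returns the given result.
func staticLookup(addrs []string, err error) lookupFunc {
	return func(context.Context, string) ([]string, error) { return addrs, err }
}

func main() {
	tg := &target{host: "www.domain.com"}
	steps := []lookupFunc{
		staticLookup([]string{"10.10.10.10"}, nil), // East is primary
		staticLookup([]string{"11.11.11.11"}, nil), // failover to West
		staticLookup(nil, errNoSuchHost),           // DNS outage
	}
	for _, lookup := range steps {
		changed := tg.refresh(lookup)
		fmt.Printf("addrs=%q changed=%t failed=%t\n", tg.addrs, changed, tg.failed)
	}
}
```

In a prober, `refresh` would presumably run on a timer, and `failed` and `changed` would feed a metric rather than a print statement. Note that the first successful lookup also reports a change, because the target starts with no addresses.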

baryluk commented 8 months ago

I would like an option to not die (even on startup), if the resolution fails.

Sometimes one has a number of hosts to monitor on the smokeping command line, and it is very undesirable for it to fail and restart if one of the hosts is gone from DNS forever, or if some DNS resolution fails temporarily. Let it run, and mark the probes as failed.
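
As a rough sketch of the startup behavior described above (again, not the prober's actual code), the loop below resolves every host and sorts them into resolved and unresolved, instead of exiting on the first failure. `resolveAll`, `fakeLookup`, `lookupTimeout`, and the example hostnames are all hypothetical. The fake resolver keeps the example deterministic, and `net.DefaultResolver.LookupHost` could be passed in its place.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// lookupFunc has the same signature as (*net.Resolver).LookupHost.
type lookupFunc func(ctx context.Context, host string) ([]string, error)

// lookupTimeout is an arbitrary per-host limit chosen for this sketch.
const lookupTimeout = 2 * time.Second

// resolveAll looks up every host. Hosts that fail to resolve are collected
// with their error instead of aborting, so the caller can keep running and
// mark their probes as failed (or retry them later).
func resolveAll(hosts []string, lookup lookupFunc) (map[string][]string, map[string]error) {
	resolved := make(map[string][]string)
	unresolved := make(map[string]error)
	for _, h := range hosts {
		ctx, cancel := context.WithTimeout(context.Background(), lookupTimeout)
		addrs, err := lookup(ctx, h)
		cancel()
		if err == nil && len(addrs) == 0 {
			err = fmt.Errorf("lookup %s: no addresses returned", h)
		}
		if err != nil {
			unresolved[h] = err
			continue
		}
		resolved[h] = addrs
	}
	return resolved, unresolved
}

// fakeLookup builds a resolver backed by a fixed table; any host missing
// from the table fails, imitating a name that is gone from DNS.
func fakeLookup(table map[string][]string) lookupFunc {
	return func(_ context.Context, host string) ([]string, error) {
		if addrs, ok := table[host]; ok {
			return addrs, nil
		}
		return nil, fmt.Errorf("lookup %s: no such host", host)
	}
}

func main() {
	hosts := []string{"a.example", "gone.invalid"}
	lookup := fakeLookup(map[string][]string{"a.example": {"192.0.2.1"}})

	resolved, unresolved := resolveAll(hosts, lookup)
	for _, h := range hosts {
		if err, bad := unresolved[h]; bad {
			fmt.Printf("%s: marked failed (%v)\n", h, err)
			continue
		}
		fmt.Printf("%s: probing %v\n", h, resolved[h])
	}
}
```

The point of returning the unresolved hosts rather than dropping them is that they can still be exported as failing targets and re-resolved later.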