Open brndnmtthws opened 6 years ago
@csbell had asked me to file an issue about being able to configure health check timeouts. You beat me to it, Brenden.
I can see 2 aspects to this:
Both are valuable and we should fix both, but I'm curious which bucket you think you fall into. It would also be great if you could share some more context on why the current defaults don't work for you, and the scenarios in which they break. Are the values too long or too short? Thanks!
In my case it's the first issue. The defaults that GCP provides are better IMO (a 5 second interval with 2 up and 2 down). I don't have anything special other than I think 600 seconds (10 minutes) is far too long for detecting failures (which is how long 10 failures take to detect with a 60s interval). With the GCP default it'll only take about 10s.
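For anyone following the arithmetic, here is a rough sketch of how those detection times fall out. The function name is made up for illustration, and it assumes detection time is simply the check interval multiplied by the number of consecutive failures required; it ignores the per-check timeout and where in the interval the failure actually happens.

```python
def detection_time_seconds(interval_s: int, unhealthy_threshold: int) -> int:
    """Approximate time to mark a backend unhealthy.

    Hypothetical helper: assumes the backend is marked down after
    `unhealthy_threshold` consecutive failed checks, one per interval.
    """
    return interval_s * unhealthy_threshold


# Values quoted in this thread: 60s interval with 10 failures,
# versus a 5s interval with 2 failures.
current_default = detection_time_seconds(interval_s=60, unhealthy_threshold=10)
gcp_default = detection_time_seconds(interval_s=5, unhealthy_threshold=2)

print(f"current: {current_default}s, GCP default: {gcp_default}s")
```

So under these assumptions the current settings take 60 times longer to notice a dead backend.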
+1 for configuring health checks. Ideally it would just work based on the underlying service/pod’s health checks. IIRC this is how the normal gce ingress works as well as the federation ingress (though I could be wrong)
+1, 600 seconds is a long time. +1 to honouring the underlying service/pod’s health checks.
Is it possible to modify the healthcheck probe to use TCP vs. HTTP request?
The default healthcheck settings aren't optimal, and I have to edit the source code to change them, or change them manually through the API. It'd be nice to be able to configure the health checks so I can use more reasonable values (5s interval, with 2 checks to change state).