kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0
3.58k stars 1.32k forks source link

Add option to disable NodeStartupTimeout on MachineHealthCheck #4468

Closed JoelSpeed closed 3 years ago

JoelSpeed commented 3 years ago

User Story

As a user I would like to be able to specify a healthcheck that does not have a node startup timeout for so that I can handle complex healthcheck logic separately.

As a cluster fleet administrator, I would like to create MachineHealthChecks that are suitable for all clusters, regardless of platform. Since I cannot know what a reasonable NodeStartupTimeout is, I would like to disable this feature and allow my cluster admins to opt in by creating their own MHCs where appropriate.

Detailed Description

If a MachineHealthCheck can not cater for your health check needs, you may want to handle this health check somewhere else. For security reasons, you don't want to give this alternate health checker permission to delete Machine resources, but can give it permission to add a condition to a Node (eg. by using the kubelet credentials for the node it is operating on).

In this scenario, it is not suitable for a NodeStartupTimeout to apply. The MachineHealthCheck should react only to the signal from the custom health check component itself. Currently there is no way to disable the NodeStartupTimeout.

I think there are valid use cases for where you may want to just use the condition checks of MachineHealthCheck without the NodeStartupTimeout.

Anything else you would like to add:

I think based on the existing field, a sensible approach to setting a disabled value is to have an explicit option of -1 to disable the feature. We could use zero, however this is the zero value of the go type and so would be indistinguishable from an unset value, and hence would be defaulted.

/kind feature

vincepri commented 3 years ago

NodeStartupTimeout seems to be a metav1.Duration, we could distinguish between nil and 0s, although that makes defaulting a bit complicated. Looking at the defaulting code we have today, it seems that if the value is nil we default it to 10 minutes.

The current minimum we require is 30s, although we could allow folks to set it to 0s instead?

/area api

vincepri commented 3 years ago

/area health

JoelSpeed commented 3 years ago

I was having a play before with this and couldn't get it to work with 0, but just had another play and got it working. Though because a metav1.Duration is a string, you have to put it in quotes, so it looks like nodeStartupTimeout: "0".

I'm happy with either, though, I am wondering if the -1 value is a more obvious disable string, wondering if 0 could be interpreted in a different way?

sbueringer commented 3 years ago

~Semantically "0" could be a bit problematic. I think it could imply that startup timeout is actually "0"~

Although I guess there is precedent. E.g. in go http client if you don't set a timeout, it's 0 and 0 means indefinite, which is - I guess - what we want.

But I think it's still important that we shouldn't implicitly set an indefinite timeout and this should be chosen intentionally. It could be a bit strange for users to set the timeout to the Go default value of duration and then get a different result then when it's unset.

So I guess I'm pro-"let's use -1".

vincepri commented 3 years ago

Most of the Kubernetes APIs I've seen have 0 to default to unlimited / disabled behaviors. I wouldn't want to necessarily introduce yet another value between nil or 0

sbueringer commented 3 years ago

Most of the Kubernetes APIs I've seen have 0 to default to unlimited / disabled behaviors. I wouldn't want to necessarily introduce yet another value between nil or 0

Do they also have different behaviour regarding explicitly set to zero value vs unset?

EDIT: Ah sorry missed that we have a duration pointer...

vincepri commented 3 years ago

Do they also have different behaviour regarding explicitly set to zero value vs unset?

No, right now if it's nil, we default it to 10m and the minimum is 30s. We can allow folks to set it to 0s and disable it that way.

enxebre commented 3 years ago

using 0s sounds reasonable to me.

In this scenario, it is not suitable for a NodeStartupTimeout to apply. The MachineHealthCheck should react only to the signal from the custom health check component itself. Currently there is no way to disable the NodeStartupTimeout.

@JoelSpeed what's the relation you seeing between having a custom health check and having an infinite NodeStartupTimeout? A NodeStartupTimeout would be reasonable regardless of what/how deletion was triggered right?

JoelSpeed commented 3 years ago

@enxebre in particular, the scenario we are looking at is spot instances. For security reasons we use MHC as a proxy to get deletion and draining. So we want an MHC that will react to our spot termination condition, but not react to startup timeout. It's acceptable for the system as a whole to react to these system initiated events, but users may not want the startup timeout to be in place in this scenario.

By allowing it to be disabled, we can allow for condition based health checks that can be applied to all platforms without fear of the node startup timeout being an unfavorable value for some of the platforms (eg AWS vs Baremetal)

JoelSpeed commented 3 years ago

@michaelgugino just raised a question on the PR which warrants further discussion. (https://github.com/kubernetes-sigs/cluster-api/pull/4471#discussion_r612508811)

Should we consider instead of making the option default to 10m, default to off with the reasoning that a user should have to opt in to the node startup timeout behaviour. I can see merit in this approach but am interested to know what others think. (CC @n1r1 @beekhof)

vincepri commented 3 years ago

Should we consider instead of making the option default to 10m, default to off with the reasoning that a user should have to opt in to the node startup timeout behaviour. I can see merit in this approach but am interested to know what others think. (CC @n1r1 @beekhof)

Let's please consider this in another issue. This is a breaking behavioral change, and we might not be able to make it for v0.4.0 at this point.

n1r1 commented 3 years ago

Should we consider instead of making the option default to 10m, default to off with the reasoning that a user should have to opt in to the node startup timeout behaviour. I can see merit in this approach but am interested to know what others think.

+1 for default to off. It makes sense.

vincepri commented 3 years ago

/milestone v0.4