kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0
8k stars 3.94k forks source link

cluster-api-autoscaler and enforce-node-group-min-size flag #6549

Closed MaxFedotov closed 6 months ago

MaxFedotov commented 7 months ago

Which component are you using?: cluster-autoscaler

What version of the component are you using?:

Component version: 1.27.4

What did you expect to happen?: As was discussed in https://github.com/kubernetes/autoscaler/issues/5267#issuecomment-1861440996 user can specify an enforce-node-group-min-size for cluster autoscaler to enforce min nodes for the nodegroup.

But this solution won't work for clusterapi, if user specify minNodes = maxNodes, because of this check: https://github.com/kubernetes/autoscaler/blob/a3c8978e119f56a90a6f8406e76685c21cfecd51/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_nodegroup.go#L357-L360

so if, for example, user adds the following annotations to his machineDeployment:

cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "2"
cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "2"

min size won't be enforced

@elmiko, pinging you in this issue because seems to be related to our previous discussion.

elmiko commented 7 months ago

@MaxFedotov and i talked on k8s slack, we think the solution might be to upgrade the current check for creating a node group to ensure that the limits are valid.

code

    // Ensure the node group would have the capacity to scale
    if scalableResource.MaxSize()-scalableResource.MinSize() < 1 {
        return nil, nil
    }

that check occurs in newNodeGroupFromScalableResource, which means that a node group could have capacity but still not be included in the core scaling. given the comment on the code, perhaps this check should be changed to only check that the limits are valid, something like :

    // Ensure the node group has valid scale limits
        // allow MinSize = 0
        // allow MaxSize = MinSize
        // don't allow MaxSize < MinSize
    if scalableResource.MaxSize() - scalableResource.MinSize() < 0 {
                // log the node group being skipped
        return nil, nil
    }
MaxFedotov commented 7 months ago

I'm ready to implement a PR with the fix. What do we need to get the triage accepted mark? :)

elmiko commented 7 months ago

@MaxFedotov i was talking with some other people about this and they were giving me some pushback around whether we should support minSize = maxSize situations.

when we were talking about this as an api, it is possible that the user could remove the min/max annotations and then set the node group to the size they want. my colleague posited that if the intention is to stop scaling on the group for some amount of time, then the proper action is to remove the annotation completely.

i'm curious what you think and in specific about your use case?

MaxFedotov commented 7 months ago

@elmiko, thanks for your answer!

when we were talking about this as an api, it is possible that the user could remove the min/max annotations and then set the node group to the size they want. my colleague posited that if the intention is to stop scaling on the group for some amount of time, then the proper action is to remove the annotation completely. i'm curious what you think and in specific about your use case?

From my point of view, it is much better to have a single system controlling nodegroup scaling and sizing (cluster-autoscaler), than doing it in two different places. It is especially important if you have some custom tooling built on top of cluster-api to manage clusters (and this is our case, we have our own API and tools on top of the cluster-api, which allows us to simplify cluster provisioning for different cloud-providers), because otherwise you lost single responsibility principle and you have to build a lot of if in tooling code.

Also, in this case, enforce-node-group-min-size from cluster-autoscaler won't be working according to its desired and documented behavior for cluster-api provider. I've also checked the code of other major cloud providers in cluster-autoscaler repo and they don't have this kind of logic. So if for example user migrates from classic AWS to cluster-api, he will need to read through the code to understand, that there are some corner cases with custom logic which won't work. As python dzen stated, explicit is better than implicit :)

it is possible that the user could remove the min/max annotations and then set the node group to the size they want.

From my lazy-user perspective, that will require me to do 3 different actions (remove annotations; scale-up; add annotations back if I will need to have scaling in future) instead of simple annotation update :)

elmiko commented 7 months ago

thanks @MaxFedotov , i think we definitely need to do /something/ given the way that --enforece-node-group-min-size works.

if you would indulge me a little longer, i would like to do some similar research that you have regarding looking at the other providers. my gut is telling me that the proper thing for us to do is to allow max=min, but i want to be fully sure before changing the expectation.

MaxFedotov commented 7 months ago

Thanks @elmiko :) will be waiting for results

elmiko commented 7 months ago

hey @MaxFedotov , still looking at this, also still think we should probably go with your suggestion.

MaxFedotov commented 7 months ago

@elmiko got it, thanks! Let me know your final decision and if we will proceed I will prepare a PR.

elmiko commented 6 months ago

hey @MaxFedotov , been studying the code more. i've put an item on the agenda for monday's sig call just to gather opinions from others. i still haven't found anything that makes we think we shouldn't do this, but you know due diligence and all lol.

MaxFedotov commented 6 months ago

Thanks @elmiko! Hope everything goes well and this change will be supported by everybody.

elmiko commented 6 months ago

talked about this with @gjtempleton at the sig meeting today, curious to see if this works as expected on aws (with the min size flag).

@MaxFedotov i think it makes sense to prepare the patch for this, i think we want to capture the logic where "minSize = maxSize when maxSize > 0", we should process that node group.

MaxFedotov commented 6 months ago

@elmiko Great! The only question I have - why not use the previous proposal? I think it will cover all cases, or am I missing something?

    // Ensure the node group has valid scale limits
        // allow MinSize = 0
        // allow MaxSize = MinSize
        // don't allow MaxSize < MinSize
    if scalableResource.MaxSize() - scalableResource.MinSize() < 0 {
                // log the node group being skipped
        return nil, nil
    }
elmiko commented 6 months ago

i think that if maxSize == 0 then we should not process the node group. i think we just need to update the example like this:

    // Ensure the node group has valid scale limits
        // allow MinSize = 0
        // allow MaxSize = MinSize
        // don't allow MaxSize < MinSize
        // don't allow MaxSize = MinSize = 0
    if scalableResource.MaxSize() - scalableResource.MinSize() < 0 || scalableResource.MaxSize() == 0 {
                // log the node group being skipped
        return nil, nil
    }

edit: oops, messed up the logic on the first try XD

MaxFedotov commented 6 months ago

Thanks, created a PR :)