himanshu-kun opened 2 years ago
I have thought of some possible solutions:

1. We could enhance the logic, introduced in this PR, of using an existing node in the worker group to also include extended resources. However, this does not cover the case where no node exists in any zone of the worker group.
2. We could enhance the nodeTemplate-passing-through-shoot-YAML feature (currently enabled for AWS and Azure) to also pass extended resources. But this approach comes with certain drawbacks:
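Solution (1) boils down to copying every non-standard capacity entry of an existing node into the template. A minimal sketch of that filtering step, using plain string maps instead of `corev1.ResourceList`/`resource.Quantity` to keep it self-contained (function and resource names here are illustrative, not from the actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// standardResources are the capacity entries the autoscaler already copies
// from an existing node when building a scale-from-zero template.
var standardResources = map[string]bool{
	"cpu":               true,
	"memory":            true,
	"pods":              true,
	"ephemeral-storage": true,
	"nvidia.com/gpu":    true,
}

// extendedResources returns every capacity entry that is not a standard
// resource, i.e. the extended resources that solution (1) would have to
// copy into the node template as well.
func extendedResources(capacity map[string]string) map[string]string {
	out := map[string]string{}
	for name, qty := range capacity {
		if standardResources[name] || strings.HasPrefix(name, "hugepages-") {
			continue
		}
		out[name] = qty
	}
	return out
}

func main() {
	capacity := map[string]string{
		"cpu":                "4",
		"memory":             "16Gi",
		"pods":               "110",
		"example.com/dongle": "2", // hypothetical extended resource
	}
	fmt.Println(extendedResources(capacity)) // map[example.com/dongle:2]
}
```

This still has the limitation noted above: it only works while at least one node exists somewhere in the worker group.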
cc @unmarshall @ashwani2k
Upstream issue worth noting: https://github.com/kubernetes/autoscaler/issues/1869
Specify the node template in the provider config section of the worker. From there, the corresponding extension picks it up and populates the worker config, which contains the NodeTemplate; this is used to generate the machine class. The CA code at the moment does not consider ephemeral storage in the scale-from-zero case.
Inside `GetMachineDeploymentNodeTemplate`:
```go
if len(filteredNodes) > 0 {
	klog.V(1).Infof("Nodes already existing in the worker pool %s", workerPool)
	baseNode := filteredNodes[0]
	klog.V(1).Infof("Worker pool node used to form template is %s and its capacity is cpu: %s, memory:%s", baseNode.Name, baseNode.Status.Capacity.Cpu().String(), baseNode.Status.Capacity.Memory().String())
	instance = instanceType{
		VCPU:             baseNode.Status.Capacity[apiv1.ResourceCPU],
		Memory:           baseNode.Status.Capacity[apiv1.ResourceMemory],
		GPU:              baseNode.Status.Capacity[gpu.ResourceNvidiaGPU],
		EphemeralStorage: baseNode.Status.Capacity[apiv1.ResourceEphemeralStorage],
		PodCount:         baseNode.Status.Capacity[apiv1.ResourcePods],
	}
} else {
	klog.V(1).Infof("Generating node template only using nodeTemplate from MachineClass %s: template resources-> cpu: %s,memory: %s", machineClass.Name, nodeTemplateAttributes.Capacity.Cpu().String(), nodeTemplateAttributes.Capacity.Memory().String())
	instance = instanceType{
		VCPU:   nodeTemplateAttributes.Capacity[apiv1.ResourceCPU],
		Memory: nodeTemplateAttributes.Capacity[apiv1.ResourceMemory],
		GPU:    nodeTemplateAttributes.Capacity["gpu"],
		// The number of pods per node depends on the CNI used and the kubelet maxPods config; the default is often 110.
		PodCount: resource.MustParse("110"),
	}
}
```
We need to fix this part so that the `else` branch also considers ephemeral storage. We also need to fix the validation of the NodeTemplate in the Gardener provider extension so that ephemeral storage can be specified explicitly without CPU, GPU, or memory.
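The fix to the `else` branch could look roughly like the following sketch. It uses simplified string-typed stand-ins for the real autoscaler types (`resource.Quantity`, `corev1.ResourceList`) so the example is self-contained; the function name `buildFromNodeTemplate` is illustrative, not from the actual code:

```go
package main

import "fmt"

// instanceType mirrors the fields the autoscaler fills in when
// generating a scale-from-zero node template.
type instanceType struct {
	VCPU             string
	Memory           string
	GPU              string
	EphemeralStorage string
	PodCount         string
}

// buildFromNodeTemplate corresponds to the else branch: every resource,
// including ephemeral storage, is taken from the MachineClass
// nodeTemplate capacity instead of an existing node.
func buildFromNodeTemplate(capacity map[string]string) instanceType {
	return instanceType{
		VCPU:             capacity["cpu"],
		Memory:           capacity["memory"],
		GPU:              capacity["gpu"],
		EphemeralStorage: capacity["ephemeral-storage"], // the currently missing field
		PodCount:         "110",                         // default maxPods; depends on CNI and kubelet config
	}
}

func main() {
	it := buildFromNodeTemplate(map[string]string{
		"cpu":               "2",
		"memory":            "8Gi",
		"ephemeral-storage": "50Gi",
	})
	fmt.Println(it.EphemeralStorage) // 50Gi
}
```

With this in place, a nodeTemplate that specifies only ephemeral storage would still produce a usable template, which is why the extension-side validation needs to be relaxed as well.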
What would you like to be added: There should be a mechanism in our autoscaler for the user to specify any extended resources their nodes have, so that the autoscaler becomes aware of them and can take them into account when scaling out a node group from zero.
Why is this needed: It has been noticed that the autoscaler currently cannot scale a node group from zero if a pod requests an extended resource. This happens because the nodeTemplate the autoscaler creates does not have the extended resources specified. It is, however, able to scale from one, because there the autoscaler can form the nodeTemplate from an existing node. The AWS and Azure implementations of the autoscaler already provide ways to specify such resources.