Open jonathan-mayer opened 1 month ago
A possible way would be to automatically exclude the workloads managed by ScaledObjects from being scaled by the Downscaler.
My suggestion to implement the approach you said would be something like:
// GetWorkloads gets all workloads of the specified resources for the specified namespaces
func (c client) GetWorkloads(namespaces []string, resourceTypes []string, ctx context.Context) ([]scalable.Workload, error) {
var results []scalable.Workload
if namespaces == nil {
namespaces = append(namespaces, "")
}
for _, namespace := range namespaces {
for _, resourceType := range resourceTypes {
slog.Debug("getting workloads from resource type", "resourceType", resourceType)
getWorkloads, ok := scalable.GetWorkloads[strings.ToLower(resourceType)]
if !ok {
return nil, errResourceNotSupported
}
workloads, err := getWorkloads(namespace, c.clientsets, ctx)
if err != nil {
return nil, fmt.Errorf("failed to get workloads: %w", err)
}
results = append(results, workloads...)
}
}
return results, nil
}
This approach requires the ScaledObjects to be processed before Deployments/StatefulSets
Things to consider: clarity of code (right now is perfect), performance (should still be 0(N), but we will iterate many times inside lists that could be pretty big, especially the Deployment list)
I'm not sure if putting all this in the GetWorkloads function is right. I will have a look on how to implement it next week as I am still out of office for the rest of this week.
Issue
Workloads managed by ScaledObjects are currently also scaled by the Downscaler. This results in the ScaledObject and the downscaler fighting to downscale those workloads to (possibly) different Replicas.
Further details
Bug was found here: https://github.com/caas-team/py-kube-downscaler/issues/96 Bug was fixed in py-kube-downscaler here: https://github.com/caas-team/py-kube-downscaler/pull/100
Proposal
Implement some way to avoid this. A possible way would be to automatically exclude the workloads managed by ScaledObjects from being scaled by the Downscaler.
Who can address the issue
Other links/references