Open prabhu-mns opened 9 months ago
Hello! Thank you for filing an issue.
The maintainers will triage your issue shortly.
In the meantime, please take a look at the troubleshooting guide for bug reports.
If this is a feature request, please review our contribution guidelines.
Hi, Any feedback on this would be appreciated as this is impacting our Production workloads.
Checks
Controller Version
0.26.0
Helm Chart Version
0.21.1
CertManager Version
No response
Deployment Method
Helm
cert-manager installation
Yes
Checks
Resource Definitions
To Reproduce
Describe the bug
Pod count goes beyond the desired / max value defined in HRA. Because of this, each runner deployment pods has > 100 pods approx. which tends to autoscale more no. of nodes just because the node reaches the max pod limit though enough CPU, and memory resources are available, and when max node count is reached, pods are not able to scheduled and it goes to pending state which impacts the new workflow runs
Describe the expected behavior
Pod count stays within the desired, max value defined in the HRA
Whole Controller Logs
Whole Runner Pod Logs
Additional Context
No response