verdel opened this issue 6 months ago · Status: Open
Hello! Thank you for filing an issue.
The maintainers will triage your issue shortly.
In the meantime, please take a look at the troubleshooting guide for bug reports.
If this is a feature request, please review our contribution guidelines.
Hey @verdel,
You are right: we receive an empty batch when no activity is needed, so the metric becomes incorrect once the cluster goes idle. Ideally, to report the correct value, the change should be made on the API side. However, we can optimistically set this metric to the desired count when the cluster becomes idle. Let me discuss it with the team, and I'll get back to you with more information :relaxed:
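As a rough sketch of that optimistic-reset idea (not the actual ghalistener code; the metricsExporter type, the publishStatistics method, the minRunners parameter, and the gha_registered_runners name are assumptions for illustration), the gauges could be reset to the scale set's minimum count whenever an empty batch arrives:

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/testutil"
)

// statistics mirrors the fields mentioned in this issue
// (statistics.totalIdleRunners and statistics.totalRegisteredRunners).
type statistics struct {
	TotalIdleRunners       int
	TotalRegisteredRunners int
}

// metricsExporter is a hypothetical stand-in for the listener's exporter;
// the structure and method names are assumptions, not ARC's actual code.
type metricsExporter struct {
	idleRunners       prometheus.Gauge
	registeredRunners prometheus.Gauge
}

func newMetricsExporter() *metricsExporter {
	return &metricsExporter{
		idleRunners: prometheus.NewGauge(prometheus.GaugeOpts{
			Name: "gha_idle_runners", // metric name referenced in the comments
			Help: "Number of idle runners (illustrative).",
		}),
		registeredRunners: prometheus.NewGauge(prometheus.GaugeOpts{
			Name: "gha_registered_runners", // assumed name, for illustration only
			Help: "Number of registered runners (illustrative).",
		}),
	}
}

// publishStatistics applies a received batch to the gauges. When the batch is
// empty (the idle-cluster case described above), it optimistically resets the
// gauges to the scale set's minimum runner count instead of leaving the last
// non-zero values in place.
func (m *metricsExporter) publishStatistics(stats *statistics, minRunners int) {
	if stats == nil {
		m.idleRunners.Set(float64(minRunners))
		m.registeredRunners.Set(float64(minRunners))
		return
	}
	m.idleRunners.Set(float64(stats.TotalIdleRunners))
	m.registeredRunners.Set(float64(stats.TotalRegisteredRunners))
}

func main() {
	m := newMetricsExporter()
	m.publishStatistics(&statistics{TotalIdleRunners: 3, TotalRegisteredRunners: 3}, 0)
	m.publishStatistics(nil, 0) // cluster goes idle: reset instead of going stale
	fmt.Println(testutil.ToFloat64(m.idleRunners)) // prints 0
}
```

The trade-off is that the value is only optimistic: if runner registration lags behind the desired count, the gauge may briefly over-report until the next statistics message corrects it.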
Hello team, we would also be interested in this fix. Do you have an update on this by any chance?
Not sure if this is exactly related, but we have an issue with the gha_idle_runners metric: it stays at zero even when the UI shows 20+ idle runners. It would be nice to have this metric working so we can set up some alerts.
I'm observing the same thing. The value of gha_idle_runners seems to always be 0 even with many idle runners.
We're having trouble monitoring our runners because of this, and it has been open for a while.
Checks
Controller Version
0.9.2
Deployment Method
Helm
To Reproduce
Describe the bug
We have a GitHub Action that runs once a day. A special type of runner is allocated specifically for it. During the execution of the GitHub Action, we receive the latest batch of messages about task execution. In this message, the statistics.totalIdleRunners and statistics.totalRegisteredRunners fields contain non-zero values. These values are published by the controller as Prometheus metrics. After this last message, the metric values do not change until the next runner execution the following day.
Is it possible to fix this behavior, or does it require changes on the GitHub side?
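To illustrate the update path being described, here is a minimal, self-contained sketch of a listener-style polling loop (the getNextMessage and setGauges helpers are hypothetical, not the real ghalistener API): the gauges are only written when a batch that carries statistics arrives, so after the day's last message the previously exported values persist unchanged.

```go
package main

import (
	"context"
	"log"
	"time"
)

// messageStatistics mirrors the statistics fields mentioned above.
type messageStatistics struct {
	TotalIdleRunners       int
	TotalRegisteredRunners int
}

// message is a simplified stand-in for a batch received from the Actions service.
type message struct {
	Statistics *messageStatistics
}

// getNextMessage is a hypothetical stand-in for the listener's long-poll call;
// returning (nil, nil) models an idle period with no new batch.
func getNextMessage(ctx context.Context) (*message, error) {
	return nil, nil
}

// setGauges is a hypothetical stand-in for updating the Prometheus gauges.
func setGauges(s *messageStatistics) {
	log.Printf("idle=%d registered=%d", s.TotalIdleRunners, s.TotalRegisteredRunners)
}

// run shows why the metrics go stale: the gauges are updated only when a
// non-empty batch arrives, so once the daily workflow finishes and messages
// stop, the last published values persist until the next run.
func run(ctx context.Context) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}

		msg, err := getNextMessage(ctx)
		if err != nil {
			return err
		}
		if msg == nil || msg.Statistics == nil {
			// Idle: nothing touches the gauges here, so the previously
			// exported non-zero values stay visible in Prometheus.
			time.Sleep(100 * time.Millisecond)
			continue
		}
		setGauges(msg.Statistics)
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	_ = run(ctx)
}
```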
Describe the expected behavior
The values of the Prometheus metrics exposed by the ghalistener should reflect the actual state of the runners.
Additional Context
Controller Logs
Runner Pod Logs