Open Niha-2612 opened 3 months ago
Hi @danielm0hr, @jan--f,
I would really appreciate it if you could take a look at this.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Proposal
We have a Z-specific operator that collects IFL-related metrics using hyptop on the hypervisor (z/VM or KVM). It helps facilitate the system's cost management.
For KVM: A service runs on the hypervisor, with network configuration that connects it to the OpenShift cluster running on top, and exposes the hyptop data. The hyptop data is then monitored by the default Prometheus.
For z/VM: The service monitor runs on every node of the cluster and exposes the hyptop data to Prometheus.
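To make the "exposing the hyptop data" step concrete, here is a minimal sketch of how a service might render per-system IFL usage in the Prometheus text exposition format and serve it on /metrics. It is not the operator's actual code: the `system` label, the function names `render_ifl_usage` and `make_handler`, and the shape of the input (a mapping of system name to a numeric usage value, already parsed from hyptop) are all assumptions for illustration. Only the metric name ifl_usage comes from this proposal.

```python
from http.server import BaseHTTPRequestHandler


def render_ifl_usage(samples):
    """Render {system_name: usage} as Prometheus text exposition format.

    `samples` is assumed to be already parsed from hyptop output; the
    parsing step and the unit of the value are not shown or assumed here.
    """
    lines = [
        "# HELP ifl_usage IFL usage per system as reported by hyptop.",
        "# TYPE ifl_usage gauge",
    ]
    for system, value in sorted(samples.items()):
        # Label values must escape backslash, double quote, and newline.
        escaped = (
            system.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        # The "system" label name is a hypothetical choice.
        lines.append(f'ifl_usage{{system="{escaped}"}} {float(value)}')
    return "\n".join(lines) + "\n"


def make_handler(read_samples):
    """Build an HTTP handler that serves the metrics on /metrics.

    `read_samples` is a caller-supplied function returning the mapping
    described above (hypothetical; it would wrap the hyptop collection).
    """

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = render_ifl_usage(read_samples()).encode("utf-8")
            self.send_response(200)
            self.send_header(
                "Content-Type", "text/plain; version=0.0.4; charset=utf-8"
            )
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return MetricsHandler
```

The handler could be started with the standard library's `HTTPServer(("", port), make_handler(fn)).serve_forever()`, where the port and `fn` are whatever the deployment chooses. A gauge is used here because usage can go up and down between scrapes; if the underlying hyptop field is a cumulative CPU time, a counter would likely be the better fit.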
Example metric: ifl_usage (screenshot of the metric values not preserved)
We would like to know whether these metrics can be integrated with the existing Prometheus-based cluster monitoring stack that is deployed on OpenShift. Any input from the maintainers would be appreciated!