Closed: ibuziuk closed this issue 5 years ago
cc: @fche
@ibuziuk just to clarify, is the only deliverable still outstanding in this PR the outgoing relay of some metrics to zabbix?
@fche correct, the only thing that is missing is exposing the metrics to zabbix
Not sure if this issue is relevant anymore, since we've deployed a Prometheus instance specific to rhche
Closing
The rhche-host service on dsaas / dsaas-stg exposes port 8087 for obtaining metrics in Prometheus format:
rhche-host ClusterIP 172.30.149.180 <none> 8080/TCP,8087/TCP 52d
Currently it is possible to obtain the ClassLoader / JVM / Tomcat metrics from the osd monitor via the service name and port combination, e.g.:
curl rhche-host:8087
These metrics need to be consumed and visualized by the osd monitor and exposed to zabbix. Currently the most important metrics are the following:
The number of live threads is currently by far the most important metric, since it would allow us to investigate the P1 issue - https://github.com/openshiftio/openshift.io/issues/4626
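For anyone who wants to pull the live thread count out of the endpoint by hand, here is a rough sketch of parsing the Prometheus text format in Python. It is only an illustration: the metric name `jvm_threads_current` is an assumption (it is what the Prometheus Java client usually exports, but the actual name on rhche-host may differ), the sample values are made up, and `fetch_metrics` / `parse_prometheus_text` are hypothetical helper names, not part of any existing tooling.

```python
import urllib.request


def fetch_metrics(url="http://rhche-host:8087"):
    """Fetch the raw metrics text; only resolvable from inside the cluster."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_prometheus_text(text):
    """Parse Prometheus text exposition format into {sample_name: float}.

    The key includes the label set, e.g. 'jvm_memory_bytes_used{area="heap"}'.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split at the closing brace.
            idx = line.rindex("}")
            name, rest = line[: idx + 1], line[idx + 1 :].split()
        else:
            parts = line.split()
            name, rest = parts[0], parts[1:]
        if not rest:
            continue  # malformed line without a value
        # rest[0] is the value; an optional timestamp may follow and is ignored.
        samples[name] = float(rest[0])
    return samples


# Made-up sample payload, NOT real output from rhche-host.
SAMPLE = """\
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 42.0
jvm_memory_bytes_used{area="heap"} 1.5E8
"""

if __name__ == "__main__":
    metrics = parse_prometheus_text(SAMPLE)
    # Assumed metric name; adjust to whatever the endpoint actually exposes.
    print("live threads:", metrics.get("jvm_threads_current"))
```

Inside the cluster, `parse_prometheus_text(fetch_metrics())` would do the same against the real endpoint, assuming it serves the metrics at the root path as the `curl` example suggests.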