gluster / gluster-prometheus

Gluster monitoring using Prometheus
GNU Lesser General Public License v2.1

Status metrics for gluster daemon #85

Open cloudbehl opened 5 years ago

cloudbehl commented 5 years ago

  1. gluster_prometheus_up
  2. gluster_server_up
  3. gluster_gd2_up
  4. gluster_csi_up
  5. gluster_block_csi_up
  6. gluster_block_up
  7. gluster_operator_up
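
As a rough illustration of what the metrics listed above could look like when scraped, here is a minimal sketch that renders a set of component states as Prometheus text-format samples (1 for up, 0 for down). The function name `renderUpMetrics` and the example states are hypothetical; a real exporter would more likely use a Prometheus client library and fill the values from actual health checks.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderUpMetrics (hypothetical helper) formats component states as
// Prometheus text-format samples: one "<metric_name> <value>" line per
// component, with 1 meaning up and 0 meaning down.
func renderUpMetrics(states map[string]bool) string {
	names := make([]string, 0, len(states))
	for name := range states {
		names = append(names, name)
	}
	sort.Strings(names) // stable output regardless of map iteration order

	var b strings.Builder
	for _, name := range names {
		value := 0
		if states[name] {
			value = 1
		}
		fmt.Fprintf(&b, "%s %d\n", name, value)
	}
	return b.String()
}

func main() {
	// Made-up states for illustration only.
	fmt.Print(renderUpMetrics(map[string]bool{
		"gluster_server_up": true,
		"gluster_gd2_up":    true,
		"gluster_csi_up":    false,
	}))
}
```

Sorting the names keeps the scrape output deterministic, which makes it easier to diff and test.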

shtripat commented 5 years ago

@cloudbehl I feel the one named gluster_prometheus_up should rather be called gluster_exporter_up.

cloudbehl commented 5 years ago

> @cloudbehl I feel the one named gluster_prometheus_up should rather be called gluster_exporter_up.

Ack!

shtripat commented 5 years ago

IIUC, we can do the following to mark the running status of the services.

To mark these services down, we can write recording rules in Prometheus that check whether the status of each service has been updated within a recent window (say, the last 30 minutes). If a status has not been updated within that window, mark the service as down.

@JohnStrunk @aravindavk @Madhu-1 does this make sense, and is it good to go ahead this way?

Madhu-1 commented 5 years ago

> gluster_csi - @Madhu-1 is there some kind of REST call with which we can tell whether the CSI driver is up? If not, do we have to run the ps command to get the running status?

The CSI driver does not provide REST calls, but it does provide RPC calls; you can check the status by sending a Probe request. Personally, though, I would rather not do it this way. Instead, you can make a Kubernetes API call and check the pod status (but this does not guarantee that the CSI driver is healthy either :( ).

aravindavk commented 5 years ago

> gluster_gd2 - Make a REST call to http://{IP}:24007/v1/hello and, if it returns 200, set gluster_gd2_up = 1

The ps metrics plugin already reports glusterd2's status. Can the up state be derived from that? https://github.com/gluster/gluster-prometheus/blob/master/gluster-exporter/metric_ps.go#L18

Alternatively, glusterd2 has a ping API (GET http://{IP}:24007/ping).

> gluster_csi - @Madhu-1 is there some kind of REST call with which we can tell whether the CSI driver is up? If not, do we have to run the ps command to get the running status?

The CSI driver process will not run in the same pod, so the ps command can't be used. Also, gluster-exporter need not export CSI metrics.

> gluster_block_csi - @Madhu-1 is there some kind of REST call with which we can tell whether the CSI driver is up? If not, do we have to run the ps command to get the running status?

Same as above: the ps command is not useful.

> gluster_block - @Madhu-1 is there some kind of REST call with which we can tell whether the CSI driver is up? If not, do we have to run the ps command to get the running status?

Again, the ps command is not useful. But the https://github.com/gluster/gluster-block-restapi project can expose a REST API to provide health details.