grafana / jsonnet-libs

Grafana Labs' Jsonnet libraries

Add GCP Compute Engine dashboard #1336

Closed anaivanov closed 1 month ago

anaivanov commented 1 month ago

Add GCP Compute Engine dashboard

yduartep commented 1 month ago

I don't know if there is something wrong with the Instances table, but the values are not being rendered:

Screenshot 2024-09-23 at 18 52 01
yduartep commented 1 month ago

On this panel, you have defined unit=short, but on the override cell you have defined unit=/s.

Screenshot 2024-09-23 at 18 56 25
yduartep commented 1 month ago

In the second query of this panel, you are using the wrong query. It should use stackdriver_gce_instance_compute_googleapis_com_instance_network_received_bytes_count. That is why Received is not shown in the legend.

Screenshot 2024-09-23 at 18 58 58
yduartep commented 1 month ago

If I use all the filters on this query, the graph is shown correctly. Right now it says No data.

Screenshot 2024-09-23 at 19 02 06

Same with the Network sent panel.

Screenshot 2024-09-23 at 19 03 02
yduartep commented 1 month ago

Why not show Network Received and Network sent on the same row, and CPU Usage time and CPU utilization on the same row? The same goes for Count on Disk read/write and operations read/write. I think people tend to read horizontally, not vertically.

Screenshot 2024-09-23 at 19 03 41
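
As an illustration of the side-by-side layout suggested here, the following is a minimal Python sketch that computes Grafana gridPos values (h, w, x, y on the 24-column dashboard grid) for two panels per row. The function name layout_in_rows, the panel height of 8, and the exact panel titles are assumptions made for the example; this is not how the PR or the library lays out panels.

```python
GRID_WIDTH = 24  # Grafana dashboards are laid out on a 24-column grid


def layout_in_rows(titles, per_row=2, height=8):
    """Return panel stubs with gridPos set so that `per_row` panels share a row.

    Hypothetical helper for illustration only.
    """
    width = GRID_WIDTH // per_row
    panels = []
    for index, title in enumerate(titles):
        row, col = divmod(index, per_row)
        panels.append({
            "title": title,
            "gridPos": {"h": height, "w": width, "x": col * width, "y": row * height},
        })
    return panels


# Related panels end up next to each other instead of stacked vertically.
panels = layout_in_rows([
    "Network received", "Network sent",
    "CPU usage time", "CPU utilization",
])
```

With two panels per row, each pair of related panels shares the same y value and differs only in x.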
yduartep commented 1 month ago

Hi Ana, I am not sure that using the stackdriver_gce_instance_compute_googleapis_com_instance_cpu_utilization metric is the correct way to measure the number of instances. I am afraid that if, for any reason, that metric is not sent to Grafana, you will have 0 instances. I wonder if it is better to use stackdriver_gce_instance_compute_googleapis_com_instance_uptime. Obviously the same can happen, but I would check with somebody from the backend team which metric is better to use. You could do something like:

count(sum by (instance_name) (stackdriver_gce_instance_compute_googleapis_com_instance_uptime{job!="",job=~"$job",project_id=~"$project_id"}))
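
To spell out what a query of this shape computes, here is a minimal Python sketch that mimics it on made-up samples: the inner sum by (instance_name) collapses all matching series to one value per instance, and the outer count returns how many instances remain. The sample data, the "series" label, and the function name count_instances are hypothetical; only the job and project_id filters come from the query above.

```python
import re
from collections import defaultdict

# Hypothetical samples standing in for the uptime series: (labels, value).
# The "series" label is invented so that one VM can have two distinct series.
samples = [
    ({"instance_name": "vm-a", "project_id": "p1", "job": "gcp", "series": "1"}, 60.0),
    ({"instance_name": "vm-a", "project_id": "p1", "job": "gcp", "series": "2"}, 60.0),
    ({"instance_name": "vm-b", "project_id": "p1", "job": "gcp", "series": "1"}, 60.0),
    ({"instance_name": "vm-c", "project_id": "p2", "job": "gcp", "series": "1"}, 60.0),
]


def count_instances(samples, job_pattern, project_pattern):
    """Mimic count(sum by (instance_name) (metric{job!="", job=~..., project_id=~...}))."""
    per_instance = defaultdict(float)
    for labels, value in samples:
        # Prometheus regex matchers are fully anchored, hence re.fullmatch.
        if (labels["job"] != ""
                and re.fullmatch(job_pattern, labels["job"])
                and re.fullmatch(project_pattern, labels["project_id"])):
            # Inner step: one summed value per instance_name.
            per_instance[labels["instance_name"]] += value
    # Outer step: the number of groups; the summed values are ignored.
    return len(per_instance)
```

Because the count is taken over groups rather than raw series, an instance that reports several series is still counted once.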
yduartep commented 1 month ago

I am not sure the System problem count panel has the correct query. If I break down the number of problems by instance in a table at the last point in time, the sum of the errors across all of them is more than 200. I think you are displaying the number of instances that have errors, not the number of errors. What exactly do you want to measure here?

Screenshot 2024-09-24 at 13 59 39
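
The difference between the two readings can be sketched in a few lines of Python. The per-instance numbers below are invented for illustration and are not taken from the dashboard; the function names are hypothetical as well. One function counts instances that report at least one problem, the other adds up the problems themselves.

```python
# Hypothetical per-instance problem counts at a single point in time.
problems_by_instance = {"vm-a": 120, "vm-b": 0, "vm-c": 95}


def instances_with_problems(counts):
    """Number of instances whose problem count is above zero."""
    return sum(1 for value in counts.values() if value > 0)


def total_problems(counts):
    """Total number of problems across all instances."""
    return sum(counts.values())
```

On this sample data the first function gives a small number (instances affected) while the second gives a total above 200, which is the kind of gap described in the comment above.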
anaivanov commented 1 month ago

Thank you all for the feedback! I learned a lot from you while learning this library!