vesoft-inc / nebula-dashboard

Nebula Graph Service Monitor Tool
Apache License 2.0

List out the raw metrics and PromQL being used in our dashboard #274

Closed wenhaocs closed 1 year ago

wenhaocs commented 1 year ago

Our user is asking whether we can provide all the PromQL we use for node-level metrics, along with the corresponding raw metrics from node_exporter. They need the raw metrics because they will whitelist them, so they do not have to send everything to Grafana.

xigongdaEricyang commented 1 year ago

| Metric name | PromQL | Description |
| --- | --- | --- |
| `cpu_utilization` | `100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle",nebula_cluster="1"}[1m])) * 100)` | The percentage of used CPU. |
| `cpu_idle` | `avg by (instance) (irate(node_cpu_seconds_total{mode="idle",nebula_cluster="1"}[1m])) * 100` | The percentage of idle CPU. |
| `cpu_io_wait_used` | `avg by (instance) (irate(node_cpu_seconds_total{mode="iowait",nebula_cluster="1"}[1m])) * 100` | The percentage of CPU waiting for IO operations. |
| `cpu_user_used` | `avg by (instance) (irate(node_cpu_seconds_total{mode="user",nebula_cluster="1"}[1m])) * 100` | The percentage of CPU used by users. |
| `cpu_system_used` | `avg by (instance) (irate(node_cpu_seconds_total{mode="system",nebula_cluster="1"}[1m])) * 100` | The percentage of CPU used by the system. |
| `memory_used_utilization` | `((node_memory_MemTotal_bytes{nebula_cluster="1"} - node_memory_MemAvailable_bytes{nebula_cluster="1"}) / node_memory_MemTotal_bytes{nebula_cluster="1"} )* 100` | The percentage of used memory. |
| `memory_avaliable_utilization` | `(node_memory_MemAvailable_bytes{nebula_cluster="1"} / node_memory_MemTotal_bytes{nebula_cluster="1"} )* 100` | The percentage of available memory. |
| `memory_cached_utilization` | `(node_memory_Buffers_bytes{nebula_cluster="1"} + node_memory_Cached_bytes{nebula_cluster="1"}) / node_memory_MemTotal_bytes{nebula_cluster="1"} * 100` | The percentage of memory used by cache and buffers. |
| `memory_swap_utilization` | `(node_memory_SwapTotal_bytes{nebula_cluster="1"} - node_memory_SwapFree_bytes{nebula_cluster="1"}) / node_memory_MemTotal_bytes{nebula_cluster="1"} * 100` | The percentage of used swap. |
| `memory_total` | `node_memory_MemTotal_bytes{nebula_cluster="1"}` | The total memory. |
| `memory_used` | `node_memory_MemTotal_bytes{nebula_cluster="1"} - node_memory_MemAvailable_bytes{nebula_cluster="1"}` | The memory space used. |
| `memory_avaliable` | `node_memory_MemAvailable_bytes{nebula_cluster="1"}` | The memory space available. |
| `memory_cached` | `node_memory_Buffers_bytes{nebula_cluster="1"} + node_memory_Cached_bytes{nebula_cluster="1"}` | The memory space used by cache and buffers. |
| `memory_swap_used` | `node_memory_SwapTotal_bytes{nebula_cluster="1"} - node_memory_SwapFree_bytes{nebula_cluster="1"}` | The swap space used. |
| `memory_swap_total` | `node_memory_SwapTotal_bytes{nebula_cluster="1"}` | The total swap space. |
| `load_1` | `node_load1{nebula_cluster="1"}` | The average load of the system in the last 1 minute. |
| `load_5` | `node_load5{nebula_cluster="1"}` | The average load of the system in the last 5 minutes. |
| `load_15` | `node_load15{nebula_cluster="1"}` | The average load of the system in the last 15 minutes. |
| `disk_used` | `sum(node_filesystem_size_bytes{nebula_cluster="1"} - node_filesystem_free_bytes{nebula_cluster="1"}) by (device,instance)` | The disk space used. |
| `disk_free` | `sum(node_filesystem_avail_bytes{nebula_cluster="1"}) by (device,instance)` | The disk space available. |
| `disk_readbytes` | `irate(node_disk_read_bytes_total{nebula_cluster="1"}[1m])` | The number of bytes read from the disk per second. |
| `disk_writebytes` | `irate(node_disk_written_bytes_total{nebula_cluster="1"}[1m])` | The number of bytes written to the disk per second. |
| `disk_readiops` | `irate(node_disk_reads_completed_total{nebula_cluster="1"}[1m])` | The number of read operations the disk completes per second. |
| `disk_writeiops` | `irate(node_disk_writes_completed_total{nebula_cluster="1"}[1m])` | The number of write operations the disk completes per second. |
| `inode_utilization` | `(1- (node_filesystem_files_free{nebula_cluster="1"}) / (node_filesystem_files{mountpoint="/",fstype!="rootfs",nebula_cluster="1"})) * 100` | The percentage of used inodes. |
| `disk_size` | `node_filesystem_size_bytes{nebula_cluster="1"}` | The disk size. |
| `root_fs_used_percentage` | `100 - ((node_filesystem_avail_bytes{fstype!="rootfs",nebula_cluster="1"} * 100) / node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs",nebula_cluster="1"})` | The percentage of the root filesystem used. |
| `network_in_rate` | `ceil(sum by(instance)(irate(node_network_receive_bytes_total{device=~"(eth\|en)[a-z0-9]*",nebula_cluster="1"}[1m])))` | The number of bytes received over the network per second. |
| `network_out_rate` | `ceil(sum by(instance)(irate(node_network_transmit_bytes_total{device=~"(eth\|en)[a-z0-9]*",nebula_cluster="1"}[1m])))` | The number of bytes sent over the network per second. |
| `network_in_errs` | `ceil(sum by(instance)(irate(node_network_receive_errs_total{device=~"(eth\|en)[a-z0-9]*",nebula_cluster="1"}[1m])))` | The number of network receive errors per second. |
| `network_out_errs` | `ceil(sum by(instance)(irate(node_network_transmit_errs_total{device=~"(eth\|en)[a-z0-9]*",nebula_cluster="1"}[1m])))` | The number of network transmit errors per second. |
| `network_in_packets` | `ceil(sum by(instance)(irate(node_network_receive_packets_total{device=~"(eth\|en)[a-z0-9]*",nebula_cluster="1"}[1m])))` | The number of network packets received per second. |
| `network_out_packets` | `ceil(sum by(instance)(irate(node_network_transmit_packets_total{device=~"(eth\|en)[a-z0-9]*",nebula_cluster="1"}[1m])))` | The number of network packets sent per second. |
| `open_file_desc` | `node_filefd_allocated{nebula_cluster="1"}` | The number of open file descriptors. |
| `context_switch_rate` | `irate(node_context_switches_total{nebula_cluster="1"}[30s])` | The number of context switches per second. |
| `disk_used_percentage` | `(node_filesystem_size_bytes{nebula_cluster="1"}-node_filesystem_free_bytes{nebula_cluster="1"}) *100/(node_filesystem_avail_bytes{nebula_cluster="1"} +(node_filesystem_size_bytes{nebula_cluster="1"}-node_filesystem_free_bytes{nebula_cluster="1"}))` | The percentage of disk used. |
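
For the whitelist itself, the raw node_exporter metric names can be pulled out of the PromQL mechanically. The snippet below is only a rough sketch and not something the dashboard ships. It assumes every raw series name starts with `node_`. The `PROMQL` dictionary holds just a few expressions copied from the list, and `raw_metrics` and `keep_regex` are hypothetical helper names.

```python
import re

# A few PromQL expressions copied from the list above (not the full set).
PROMQL = {
    "cpu_utilization": '100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle",nebula_cluster="1"}[1m])) * 100)',
    "memory_used": 'node_memory_MemTotal_bytes{nebula_cluster="1"} - node_memory_MemAvailable_bytes{nebula_cluster="1"}',
    "load_1": 'node_load1{nebula_cluster="1"}',
    "disk_readbytes": 'irate(node_disk_read_bytes_total{nebula_cluster="1"}[1m])',
    "network_in_rate": 'ceil(sum by(instance)(irate(node_network_receive_bytes_total{device=~"(eth|en)[a-z0-9]*",nebula_cluster="1"}[1m])))',
}

# Assumption: every raw series comes from node_exporter and starts with "node_".
RAW_METRIC = re.compile(r"\bnode_[A-Za-z0-9_]+")


def raw_metrics(expressions):
    """Return the sorted, de-duplicated raw metric names used by the expressions."""
    names = set()
    for expr in expressions:
        names.update(RAW_METRIC.findall(expr))
    return sorted(names)


def keep_regex(names):
    """Join metric names into one anchored alternation for matching on the metric name."""
    return "^(" + "|".join(re.escape(n) for n in names) + ")$"


names = raw_metrics(PROMQL.values())
print("\n".join(names))
print(keep_regex(names))
```

The resulting pattern could then be used wherever the whitelist is enforced, for example in a keep rule on `__name__` in the scrape configuration. Where exactly it goes depends on the user's setup.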