metanull-operator / eth2-grafana

Grafana dashboards for Eth2.

CPU / Ping charts go back to "No Data" on refresh #3

Closed: fireheadman closed this issue 3 years ago

fireheadman commented 3 years ago

Liking the dashboard! I'm on Ubuntu 20 as well. I followed the guide to a "T" but am seeing some odd behavior:

I'm not sure why, but when the CPU and Ping charts refresh, they go back to "No Data". I'm also unsure whether CPU Temp will register at all, since I am on a VM.

Also, I am not seeing anything in Local Validators.

image

image

UPDATE (for this post only): I was able to resolve this by editing prometheus.yml, changing all the "127.0.0.1" references to "localhost", and restarting Prometheus. The CPU and Ping charts now persist across refresh cycles.
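For anyone who wants to script that change, here is a minimal Python sketch of the substitution. The sample config is a made-up fragment: the `ping` job name and port 9115 are assumptions for illustration, not taken from the guide. Only the "127.0.0.1" to "localhost" swap comes from this thread. Prometheus still needs a restart afterwards.

```python
import re

# Hypothetical prometheus.yml fragment. The "ping" job and port 9115 are
# assumptions for illustration; adjust to whatever your own file contains.
SAMPLE_CONFIG = """\
scrape_configs:
  - job_name: node_exporter
    static_configs:
      - targets: ['127.0.0.1:9100']
  - job_name: ping
    static_configs:
      - targets: ['127.0.0.1:9115']
"""


def use_localhost(config_text: str) -> str:
    """Replace each standalone 127.0.0.1 with localhost.

    The word boundaries keep longer addresses such as 127.0.0.10 untouched.
    """
    return re.sub(r"\b127\.0\.0\.1\b", "localhost", config_text)


if __name__ == "__main__":
    print(use_localhost(SAMPLE_CONFIG))
```

To apply it to a real file, read the file's text, pass it through `use_localhost`, and write the result back after keeping a backup copy.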

Also, my Hourly Return just started to appear, but nothing else is populating. I was expecting Hourly Earnings to show by now. image

fireheadman commented 3 years ago

Here is another issue I am seeing: tiny print on Participation Rate / Network Liveness and a huge red message in Avg. Balance.

image

fireheadman commented 3 years ago

This might be a timing issue after all. I noticed that the time range selector at the top right was set to "Last 2 days". I changed it to "Last 15 minutes" and the graphs look much better!

NOTE: Still looking into CPU Temp; however, I may just remove this graph to make room for integrating the alerting metrics from the EthStakers Grafana dashboard.

UPDATE: From what I have seen, FreeNAS does not pass CPU temperature to VMs, so I cannot monitor this metric. image

image

fireheadman commented 3 years ago

This can be closed out. I was also able to create the alerting metrics from the EthStakers dashboard.

I will keep looking into why the Disk Usage metric is not reporting properly. It looks like node_exporter needs to be tuned to pick up only the "/" root filesystem.

image

fireheadman commented 3 years ago

Found the cause of the filesystem monitoring not reporting. This may help others; it is just a matter of tuning the query to your system (as you mentioned). For the query, use:

(sum(node_filesystem_size_bytes{device="/dev/mapper/ubuntu--vg-ubuntu--lv",job="node_exporter"})-sum(node_filesystem_avail_bytes{device="/dev/mapper/ubuntu--vg-ubuntu--lv",job="node_exporter"}))/sum(node_filesystem_size_bytes{device="/dev/mapper/ubuntu--vg-ubuntu--lv",job="node_exporter"})*100
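Since the device name differs from system to system, a small helper can build the same expression for any device. This is only a sketch: `disk_usage_query` is a hypothetical name, and the default `job` label matches the query above but may differ in your prometheus.yml.

```python
def disk_usage_query(device: str, job: str = "node_exporter") -> str:
    """Build the used-space percentage PromQL expression for one device.

    Mirrors the query in this thread: (size - avail) / size * 100.
    """
    # Doubled braces produce literal { and } inside an f-string.
    selector = f'{{device="{device}",job="{job}"}}'
    size = f"sum(node_filesystem_size_bytes{selector})"
    avail = f"sum(node_filesystem_avail_bytes{selector})"
    return f"({size}-{avail})/{size}*100"


if __name__ == "__main__":
    # Use the device shown in the "Filesystem" column of `df -hP /`.
    print(disk_usage_query("/dev/mapper/ubuntu--vg-ubuntu--lv"))
```

Paste the printed expression into the panel's query field.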

Then add an alert image image

image

fireheadman@eth01:~$ df -hP /
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-ubuntu--lv  491G  329G  141G  71% /
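As a rough sanity check, the same (size - avail) / size * 100 arithmetic can be run on the rounded figures from the df output above. This is only an approximation: df reports rounded gigabytes, and its Use% column is computed and rounded a little differently, so small differences from the dashboard value are to be expected.

```python
def used_percent(size: float, avail: float) -> float:
    """Used space as a percentage, matching the dashboard query's formula."""
    return (size - avail) / size * 100


if __name__ == "__main__":
    # Rounded values from the df output above: Size 491G, Avail 141G.
    size_g, avail_g = 491, 141
    print(f"{used_percent(size_g, avail_g):.1f}%")  # prints 71.3%
```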

End results: image