defichain-api / masternode-health-server

Server part to send data to the masternode-health-api
MIT License

Add "load_max" #24

Closed DerFuchs closed 3 years ago

DerFuchs commented 3 years ago

In addition to the "load_avg" value, a "load_max" value should be implemented. That would let the user calculate the relative load of the whole system.

sandrich commented 3 years ago

load_average is already showing the relative system load. How do you define max when it comes to load?

DerFuchs commented 3 years ago

Load can be > 1. Unix load numbers add up across CPU cores. If you're running 8 cores, the max load is 8.00. If 3 of 8 cores are running at full load, the corresponding load number will be somewhere around 3.00.

What the load average in 'top' shows you is the average load over the last minute, the last 5 minutes and the last 15 minutes (https://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html).

I don't know what psutil delivers but I assume that it's something similar.

sandrich commented 3 years ago

That is not quite correct. Load depends on much more than the number of cores, but as a rule of thumb, a load average of 8 on 8 cores might mean the system is loaded. That does not at all mean it is at max load, though. As an example, I run servers at work that sometimes have a load average of 4 per core, which is just fine for a short period of time. Also, 1 core is not always 1 core, depending on whether you are talking about hyperthreading or not.

psutil returns the load average and we report the 5-minute load.
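For reference, a minimal sketch of picking the 5-minute value out of a load-average triple. It uses `os.getloadavg()` from the Python standard library, which returns the same (1, 5, 15 minute) tuple shape as `psutil.getloadavg()`. The helper name `five_minute_load` is hypothetical and not taken from this repository.

```python
import os


def five_minute_load(loadavg=None):
    """Return the 5-minute load average from a (1m, 5m, 15m) triple.

    If no triple is passed in, read it from the operating system.
    """
    if loadavg is None:
        # Unix only; raises OSError if the load average is unobtainable.
        loadavg = os.getloadavg()
    _one_min, five_min, _fifteen_min = loadavg
    return five_min
```

Passing the triple in explicitly keeps the helper testable on systems where the load average is not available.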

I personally find load-based alerts difficult, as high load does not mean the service is affected. It is something different if you measure whether the service runs as it should or not. Another option is to compare the load to previous loads, and if it changes by some factor, then maybe you could flag it.
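The "compare to previous loads" idea could look roughly like the sketch below. The function name `is_unusual_load`, the use of a plain mean as the baseline, and the default factor of 2.0 are all assumptions made for illustration; nothing in this thread fixes those choices.

```python
from statistics import mean


def is_unusual_load(current, history, factor=2.0):
    """Flag the current load if it exceeds the mean of previous loads by `factor`.

    `history` is a sequence of earlier load samples (hypothetical input);
    with no usable baseline, nothing is flagged.
    """
    if not history:
        return False
    baseline = mean(history)
    if baseline <= 0:
        return False
    return current > baseline * factor
```

For example, with previous loads of 2.0, 3.0 and 4.0 the baseline is 3.0, so a current load of 9.0 would be flagged and 5.0 would not.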

DerFuchs commented 3 years ago

Yep, thanks for the clarification. What I am aiming for is a visualized load using a bar:

image

A very high load might not be caused by the DeFiChain service, but it's an indicator of some not-so-healthy conditions. IDK if we need that "uncommon high load" flag. What I want to show on this bar is an indicator of how busy the whole system is. Therefore my idea was to put it in perspective by comparing it to its max load.

sandrich commented 3 years ago

I see. Looks pretty nice. Then I suggest I also upload the number of cores the system has? Maybe load x cores x 1.5 could be the max.
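A rough sketch of how such a bar could be computed from the reported load and the core count. It reads the suggestion above as "max load = cores x 1.5", which is an interpretation and not something confirmed in this thread; the names `load_bar`, `headroom` and `width` are hypothetical.

```python
def load_bar(load_avg, num_cores, headroom=1.5, width=20):
    """Return (bar, ratio) for a load value relative to an assumed max load.

    Assumption: max load = num_cores * headroom. The ratio is clamped to
    the range 0..1 so an overloaded system shows a full bar.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    max_load = num_cores * headroom
    ratio = min(max(load_avg / max_load, 0.0), 1.0)
    filled = round(ratio * width)
    bar = "[" + "#" * filled + "-" * (width - filled) + "]"
    return bar, ratio
```

With 8 cores the assumed max is 12.0, so a load of 6.0 fills half of the bar.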

DerFuchs commented 3 years ago

Good idea! The amount of cores is perfectly fine 😎👍

sandrich commented 3 years ago

@adrian-schnell let me know what the endpoint is so I can send you the number of cores.

adrian-schnell commented 3 years ago

@sandrich send it to the server-stats endpoint as num_cores: int

as soon as it's sent, I'll require it in the validation.
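A minimal sketch of what adding the field on the sending side might look like. Only the `num_cores: int` field and the "load_avg" name come from this thread; the function name `build_server_stats_payload` is hypothetical, the rest of the real payload is not shown here, and no endpoint URL is assumed.

```python
import json
import os


def build_server_stats_payload(load_avg, num_cores=None):
    """Build a dict with the load value and the core count as an int.

    If num_cores is not given, fall back to os.cpu_count(), which counts
    logical CPUs and may return None on some platforms.
    """
    if num_cores is None:
        num_cores = os.cpu_count() or 1
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(num_cores, bool) or not isinstance(num_cores, int) or num_cores < 1:
        raise ValueError("num_cores must be a positive int")
    return {"load_avg": float(load_avg), "num_cores": num_cores}


# Example: serialize the payload as JSON before sending it.
body = json.dumps(build_server_stats_payload(1.5, 8))
```

Note that `os.cpu_count()` reports logical cores, so on a hyperthreaded machine it can be higher than the number of physical cores.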

sandrich commented 3 years ago

How do you like the alternative display in verbose? All the data is reported individually, of course.

Screenshot 2021-08-31 at 21 04 37