Open alin-at-dfinity opened 2 years ago
Oh, almost forgot: #430 also switched process_start_time_seconds
from a Gauge
to an IntGauge
. While all other changes in that PR were reasonable (file descriptors, memory usage), process_start_time_seconds
is also not an integer. Linux (at least) provides the process start time as a floating point value and truncating it to an integer discards useful information (e.g. if I want to know the order in which two processes on the same machine got started; or how long it took to start one vs. the other).
This is more of a nice to have as compared to process_cpu_seconds_total
above, but still useful.
430 switched
process_cpu_seconds_total
from an (f64
)Counter
to anIntCounter
, for performance reasons.Issue is that processes that don't use a lot of CPU will now have 1 second CPU usage spikes every so often instead of a more or less flat usage with spikes occurring when they actually do, not when 1 second of CPU usage has accumulated. I.e. this has turned the metric from a smooth growing floating point value into a step function
E.g. a process using 1% CPU on average, with a 10% spike every minute will now look as if it is using 0% CPU most of the time and spike every 2 minutes out of 3.
Considering that most processes export hundreds if not thousands of metrics, I really don't think the marginal performance improvement is worth this regression. Please keep using Counter for cpu_total.
Happy to make the changes myself, if they're likely to be accepted.