Open lemoer opened 7 years ago
Thank you! I would like to "solve" this issue with this information:
I haven't looked at all the references in depth yet, but it looks good.
Have you noticed the difference between cat /proc/stat | grep processes
(which is the number of forks since the system booted) and the number of currently running processes?
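To illustrate the difference: the `processes` line in /proc/stat is a monotonically increasing fork counter, so a per-second fork rate has to be derived from two samples. A minimal sketch (the function names are my own, not from any existing plugin):

```python
import re
import time

def read_forks_total(stat_text):
    """Extract the cumulative fork count from /proc/stat content.
    Note: this counts forks since boot, NOT currently running processes."""
    m = re.search(r"^processes\s+(\d+)\s*$", stat_text, re.MULTILINE)
    return int(m.group(1))

def fork_rate(interval=1.0):
    """Forks per second, derived from two samples of the counter."""
    with open("/proc/stat") as f:
        before = read_forks_total(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = read_forks_total(f.read())
    return (after - before) / interval
```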
Topic design:
/cpu/idle
/user (=user + nice)
/system
/iowait
/irq
/softirq
/steal (= steal + guest + guest_nice)
/mem/used
/free
/buffers
/cached
/swapused
/swapfree
/processes/forkrate
/ctxtrate
/tasks
/threads
/kthreads
/zombies
(Gonna attach some comments later)
more work for you:
Some metrics are already implemented:
I think those can be analysed using bpfcountd:
Those are missing and easy to implement:
And those are not that easy to implement (i.e. I don't have a solution that is not a quick'n'dirty one):
Very nice. I think the cpu usage of fastd is very important, but we should give up trying to count the successful dhcp replies. I think it's really too complicated.
Hmm, if we just publish the stats of each cpu core separately, we should get a quite accurate idea of how much cpu is consumed by fastd. Don't you think so?
I'm not sure, because the fastd instance seems to hop a lot on multi cpu systems.
I see. Gonna find something better to meter cpu usage.
Maybe this excerpt from man 5 proc helps?
(14) utime %lu
Amount of time that this process has been
scheduled in user mode, measured in clock
ticks (divide by sysconf(_SC_CLK_TCK)). This
includes guest time, guest_time (time spent
running a virtual CPU, see below), so that
applications that are not aware of the guest
time field do not lose that time from their
calculations.
(15) stime %lu
Amount of time that this process has been
scheduled in kernel mode, measured in clock
ticks (divide by sysconf(_SC_CLK_TCK)).
(16) cutime %ld
Amount of time that this process's waited-for
children have been scheduled in user mode,
measured in clock ticks (divide by
sysconf(_SC_CLK_TCK)). (See also times(2).)
This includes guest time, cguest_time (time
spent running a virtual CPU, see below).
(17) cstime %ld
Amount of time that this process's waited-for
children have been scheduled in kernel mode,
measured in clock ticks (divide by
sysconf(_SC_CLK_TCK)).
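Putting the excerpt into practice: fields 14 and 15 of /proc/[pid]/stat give a per-process cumulative CPU time, independent of which core the process hops to. A sketch, assuming the usual /proc/[pid]/stat layout (the comm field can contain spaces, hence the split after the closing parenthesis):

```python
import os

def parse_stat_times(data, hz):
    """Parse utime (field 14) and stime (field 15) from /proc/[pid]/stat
    content and return them in seconds, per the man 5 proc excerpt above."""
    # comm (field 2) may contain spaces/parens, so split after the last ')'
    fields = data[data.rindex(")") + 2:].split()
    # fields[0] is field 3 ('state'), so utime (14) is index 11, stime (15) is 12
    return int(fields[11]) / hz, int(fields[12]) / hz

def process_cpu_seconds(pid):
    """(utime, stime) of a live PID, in seconds."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat_times(f.read(), hz)
```

Sampling these values periodically and publishing the delta avoids the per-core hopping problem entirely.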
Yes sure. I'm familiar with that ;)
But I don't know how to map this information onto MQTT topics. I would integrate this feature into the processes plugin. The trivial mapping would be:
org/example/processes/fastd/cpu
org/example/processes/fastd/mem
But what to do if multiple fastd instances are running? Add the PID?
org/example/processes/fastd/1234/cpu
org/example/processes/fastd/1234/mem
But this would be complicated to put in a graph, wouldn't it?
You got an idea how to solve this?
I also think that the PID is not the right way.
Another idea:
Maybe we could solve this by doing it for systemd services instead of processes. But since there is no 1:1 mapping between processes and services, I would say that we should aggregate the cpu and mem statistics per service. This would be nice for services which fork children.
This reveals the service name:
cat /proc/$PID/cgroup | grep name=systemd
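A sketch of extracting the service name programmatically from that command's output, assuming cgroup v1 with the named systemd hierarchy (on cgroup v2 systems the line format differs):

```python
def systemd_unit(cgroup_text):
    """Extract the systemd unit name from /proc/[pid]/cgroup content, e.g.
    '1:name=systemd:/system.slice/fastd.service' -> 'fastd.service'.
    Returns None if the process is not inside a unit's cgroup."""
    for line in cgroup_text.splitlines():
        if "name=systemd" in line:
            path = line.split(":", 2)[2]
            unit = path.rstrip("/").rsplit("/", 1)[-1]
            return unit or None
    return None
```

This would make topics like org/example/services/fastd.service/cpu possible, and all forked children of a service land in the same cgroup, so the aggregation comes for free.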
Maybe other interesting things:
cat /proc/stat | grep processes