Get CPU usage per process

inglor commented 7 years ago

How can I get the CPU usage per process? I want to get the current CPU usage of top 5 CPU processes. Through the API I got the 5 top CPU processes but I can't see a way to get the CPU usage of each. How does top calculate it?

OSProcess[] processes = new SystemInfo().getOperatingSystem().getProcesses(5, OperatingSystem.ProcessSort.CPU);
for (OSProcess process : processes) {
    ...
}

dbwiddis commented 7 years ago

You'll have to calculate it from the "ticks" available much as you would count the overall CPU usage of the processor (although there are convenience methods there!)

The OSProcess object provides you active time on the process in getKernelTime() and getUserTime(). You can add these together and divide by getUpTime() to get CPU average since the process started.

An example of this calculation is in the OSProcess Comparator for CPU usage.

If you want a "recent" CPU usage you'll have to record those times at a time interval, then grab them a second time and do the same calculation on the time difference.
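
A minimal sketch of the two calculations described above, using plain numbers in place of OSHI calls. The class and method names (`CpuAverages`, `sinceStart`, `recent`) are hypothetical and not part of OSHI; the inputs are assumed to be the millisecond values you would read from getKernelTime(), getUserTime() and getUpTime().

```java
/** Illustrative helpers only; these names are hypothetical, not OSHI API. */
final class CpuAverages {

    /** Average CPU use since the process started, as a percentage of one CPU. */
    static double sinceStart(long kernelMs, long userMs, long upTimeMs) {
        if (upTimeMs <= 0) {
            return 0d; // no elapsed time yet, avoid dividing by zero
        }
        return 100d * (kernelMs + userMs) / upTimeMs;
    }

    /** "Recent" CPU use between two samples, as a percentage of one CPU. */
    static double recent(long prevCpuMs, long currCpuMs, long elapsedMs) {
        if (elapsedMs <= 0) {
            return 0d;
        }
        return 100d * (currCpuMs - prevCpuMs) / elapsedMs;
    }

    public static void main(String[] args) {
        // 2 s kernel + 8 s user over 100 s of uptime
        System.out.println(sinceStart(2_000, 8_000, 100_000)); // prints 10.0
        // 250 ms of CPU time consumed during a 1000 ms interval
        System.out.println(recent(10_000, 10_250, 1_000)); // prints 25.0
    }
}
```

Both results are relative to a single CPU, so a multi-threaded process can legitimately come out above 100.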

ludgart commented 7 years ago

Hey dbwiddis, do you have an example of the recent CPU calculation? Thanks!

dbwiddis commented 7 years ago

There is an example of overall system recent CPU in the SystemInfoTest file. So you would do something similar for each process. Beyond that I don't have an example.

Kaldolol commented 7 years ago

Example to retrieve the CPU % each second

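import oshi.SystemInfo;
import oshi.hardware.CentralProcessor;
import oshi.software.os.OSProcess;
import oshi.software.os.OperatingSystem;

int pid = 123;
SystemInfo si = new SystemInfo();
OperatingSystem os = si.getOperatingSystem();
CentralProcessor processor = si.getHardware().getProcessor();
int cpuNumber = processor.getLogicalProcessorCount();
long previousTime = -1; // -1 means there is no previous sample yet
double cpu = 0d;
boolean processExists = true;
while (processExists) {
    // Fetch a fresh snapshot of the process on every iteration
    OSProcess process = os.getProcess(pid);
    if (process != null) {
        // CPU: cumulative kernel + user time in milliseconds
        long currentTime = process.getKernelTime() + process.getUserTime();

        if (previousTime != -1) {
            // If we have both a previous and a current time
            // we can calculate the CPU usage
            long timeDifference = currentTime - previousTime;
            cpu = (100d * (timeDifference / ((double) 1000))) / cpuNumber;
        }

        previousTime = currentTime;

        Thread.sleep(1000);
    } else {
        processExists = false;
    }
}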

sleepypikachu commented 7 years ago

@Kaldolol this is actually not quite right. This will give you the percentage of time that the process was utilising the CPU, but to get the CPU % as commonly understood (e.g. in the Task Manager on Windows or top on *nix) you'd need to multiply this by the total utilisation on the system.

And of course you'd get more accurate timings if you measured the time when you took the currentTime measurements; Thread.sleep doesn't guarantee exactly 1s of sleep.
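
One way to act on the timing point, sketched with hypothetical names (`ElapsedTimeSampler`, `percent`, `sample`) and a `LongSupplier` standing in for whatever returns kernel + user time in milliseconds: take a `System.nanoTime()` reading next to each CPU-time reading and divide by the measured interval rather than by a fixed 1000.

```java
import java.util.function.LongSupplier;

/** Illustrative only; names are hypothetical, not OSHI API. */
final class ElapsedTimeSampler {

    /** CPU use as a percentage of one CPU, based on the measured interval. */
    static double percent(long prevCpuMs, long currCpuMs, long prevNanos, long currNanos) {
        double elapsedMs = (currNanos - prevNanos) / 1_000_000d; // ns -> ms
        if (elapsedMs <= 0) {
            return 0d;
        }
        return 100d * (currCpuMs - prevCpuMs) / elapsedMs;
    }

    /** Takes two samples around a sleep and uses the real elapsed time between them. */
    static double sample(LongSupplier cpuTimeMs, long sleepMs) throws InterruptedException {
        long prevNanos = System.nanoTime();
        long prevCpu = cpuTimeMs.getAsLong();
        Thread.sleep(sleepMs); // may sleep longer than requested
        long currNanos = System.nanoTime();
        long currCpu = cpuTimeMs.getAsLong();
        return percent(prevCpu, currCpu, prevNanos, currNanos);
    }
}
```

For example, 500 ms of CPU time over a measured 1020 ms comes out at roughly 49%, where dividing by an assumed 1000 ms would have reported 50%.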

Kaldolol commented 7 years ago

@sleepypikachu Are you sure about that? I get pretty accurate values with the aforementioned code. Could you please provide an example that I could compare with mine?

You're absolutely right about that, it was already told to me by @dbwiddis

sleepypikachu commented 7 years ago

I've been thinking about it some more while composing my reply and now I see that I'm mistaken.

My assumption was that if you took the aggregate load and took the fraction you were awake over, then that must be what you're responsible for, but that's obviously false: if you're running on a single core half the time in a two-core system and another process is running flat out on the second core, you're responsible for 25%, not 37.5%.

Can user_time + kernel_time be greater than real time if you're running on >1 core flat out? If so then your answer is correct.
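
The two-core example above can be worked through numerically. This is only a sketch of the arithmetic; the class and method names are made up for illustration.

```java
/** Worked arithmetic for the two-core example; names are hypothetical. */
final class TwoCoreExample {

    /** Share of total system capacity, in percent, for the given CPU seconds. */
    static double shareOfSystem(double cpuSeconds, double elapsedSeconds, int cores) {
        return 100d * cpuSeconds / (elapsedSeconds * cores);
    }

    /** The mistaken figure: fraction of time awake multiplied by total system load. */
    static double mistakenShare(double cpuSeconds, double elapsedSeconds, double systemLoadPercent) {
        return (cpuSeconds / elapsedSeconds) * systemLoadPercent;
    }

    public static void main(String[] args) {
        double elapsed = 1.0;   // one second of wall-clock time
        int cores = 2;
        double processA = 0.5;  // on one core, half of the time
        double processB = 1.0;  // flat out on the other core

        double systemLoad = shareOfSystem(processA + processB, elapsed, cores);
        System.out.println(systemLoad);                                  // prints 75.0
        System.out.println(shareOfSystem(processA, elapsed, cores));     // prints 25.0
        System.out.println(mistakenShare(processA, elapsed, systemLoad)); // prints 37.5
    }
}
```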

sohailravian commented 6 years ago

@Kaldolol @dbwiddis process.getKernelTime() + process.getUserTime() always gives the same value on Windows, so the difference between the previous and current time is always 0.

dbwiddis commented 6 years ago

Are you getting a new copy of the process object?

In the current version of OSHI we have not yet implemented "update" or "refresh" methods on most objects; you would need to fetch a new copy of the process list.

sohailravian commented 6 years ago

@dbwiddis

SystemInfo systemInfo = new SystemInfo();
OSProcess process = systemInfo.getOperatingSystem().getProcess(processId);

I am getting the specific process by its process ID. I am running a scheduler every 5 seconds.

dbwiddis commented 6 years ago

Which operating system are you using? Can you please file this as a new issue so we can track it? This issue is closed so other than email alerts I'm not seeing your notes.

sohailravian commented 6 years ago

@dbwiddis ok

sohailravian commented 5 years ago

@dbwiddis just one quick question: if I am running two threads, do I need two instances of the SystemInfo object, or is one enough? I can see some threads blocking if I run OSHI in a multi-threaded environment.

dbwiddis commented 5 years ago

There's no value to creating a second systeminfo object. Threads that block will probably do so due to static classes or outside resources. Now if you had an entirely separate instance of the JVM running something might work differently. No guarantees of thread safety tho.

val235 commented 5 years ago

fyi for anyone reading this thread,

there seems to be a difference about how this time sampling is reported between windows/linux

on windows the time is reported across all cores while on linux the number seems to be closer to a per-core value

I had to change this calculation to make the division conditional on the os name

cpu = (100d * (timeDifference / ((double) 1000))) / cpuNumber;

--> cpu = (100d * (timeDifference / ((double) 1000))) / (os.getFamily().equalsIgnoreCase("windows")?cpuNumber:1 )

That's obviously not generic enough to handle all os variations but it was enough for my purposes

dbwiddis commented 5 years ago

A better conditional than text matching os.getFamily() is to use SystemInfo.getCurrentPlatformEnum().

That said, can you post a new issue about this tick mismatch? I'd like to get to the bottom of it and be consistent across OS's if that's possible to do (although if Linux and Windows fundamentally report ticks differently that might be a challenge.)

dbwiddis commented 5 years ago

I looked into this and can't see anything unusual in the code. For Windows, the value is pulled from the WTS_PROCESS_INFO_EX structure:

UserTime: The amount of time, in milliseconds, the process has been running in user mode.
KernelTime: The amount of time, in milliseconds, the process has been running in kernel mode.

For Linux the value is pulled from /proc/pid/stat:

(14) utime %lu: Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
(15) stime %lu: Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).

Neither one references the number of processors (should be number of physical processors) at all. Are you using the latest version (there was a bug recently fixed calculating USER_HZ which now directly pulls from the above referenced SC_CLK_TCK value)?
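
To make the Linux side concrete, here is a rough sketch of pulling fields 14 and 15 out of a /proc/pid/stat line and converting clock ticks to milliseconds. This is not OSHI's own parsing code; `ProcStat` and its methods are hypothetical names, the sample line is invented, and the ticks-per-second value (100 in the usage below) is an assumption that should really come from sysconf(_SC_CLK_TCK).

```java
/** Illustrative parser for a /proc/[pid]/stat line; names are hypothetical. */
final class ProcStat {

    /** Returns {utime, stime} in clock ticks (fields 14 and 15). */
    static long[] userAndKernelTicks(String statLine) {
        // Field 2 (comm) is wrapped in parentheses and may contain spaces,
        // so split only the part after the last ')'.
        int close = statLine.lastIndexOf(')');
        if (close < 0) {
            throw new IllegalArgumentException("not a stat line: " + statLine);
        }
        String[] f = statLine.substring(close + 2).split(" ");
        // f[0] is field 3 (state), so field 14 is f[11] and field 15 is f[12].
        return new long[] {Long.parseLong(f[11]), Long.parseLong(f[12])};
    }

    /** Converts clock ticks to milliseconds for a given ticks-per-second rate. */
    static long ticksToMillis(long ticks, long ticksPerSecond) {
        if (ticksPerSecond <= 0) {
            throw new IllegalArgumentException("ticksPerSecond must be positive");
        }
        return ticks * 1000L / ticksPerSecond;
    }

    public static void main(String[] args) {
        String line = "1234 (my proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 50 0 0 20 0 1 0 12345 1000000 200";
        long[] t = userAndKernelTicks(line);
        // Assumes 100 ticks per second, a common but not universal value
        System.out.println(ticksToMillis(t[0] + t[1], 100)); // prints 3000
    }
}
```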

val235 commented 5 years ago

using the mavenized 'com.github.oshi:oshi-core:3.12.1'

so far i've tried 3 systems

windows (windows 10 64b)
linux (virtual hosted on aws - GNU/Linux - CentOS)
mac (macOS - 10.14.2)

on windows i need to divide by the # of cpus, but not on the other two systems

and I confirmed that the values match the system reported ones fairly closely, by watching 'Task Manager', 'top', etc..

SystemInfo si = new SystemInfo();
CentralProcessor processor = si.getHardware().getProcessor();
int cpuNumber = processor.getLogicalProcessorCount();
OperatingSystem os = si.getOperatingSystem();

OSProcess process = os.getProcess(123);
long previousTime = process.getKernelTime() + process.getUserTime();
Thread.sleep(1000);

process = os.getProcess(123);
long currentTime = process.getKernelTime() + process.getUserTime();

long timeDifference = currentTime - previousTime;
double cpu = (100d * (timeDifference / ((double) 1000)))  / (os.getFamily().equalsIgnoreCase("windows")?cpuNumber:1 );

^is that the right # of processors to use for the divisor

dbwiddis commented 5 years ago

Aha, herein lies the issue:

I confirmed that the values match the system reported ones fairly closely, by watching 'Task Manager', 'top', etc..

The Windows Task Manager reports the overall utilization of all cores, so its returned result never exceeds 100%. Try running a single-threaded program at full CPU and you'll see it max out at 25% (for a 4-core system) or 50% (for a 2-core system).

In contrast, Linux systems and the top command can actually exceed 100% CPU. From man top:

%CPU -- CPU Usage : The percentage of your CPU that is being used by the process. By default, top displays this as a percentage of a single CPU. On multi-core systems, you can have percentages that are greater than 100%. For example, if 3 cores are at 60% use, top will show a CPU use of 180%. See here for more information. You can toggle this behavior by hitting Shift-I while top is running to show the overall percentage of available CPUs in use.

is that the right # of processors to use for the divisor

"It depends." Generally, no: you should use the number of cores (physical processors) to represent the true system load. If you look at per-processor load for my macbook (4 physical processors, 8 logical processors) at full load you'll see something like:

24% 1% 23% 2% 25% 0% 20% 5%.

However, this can have a more complex interpretation if your system is running on a hypervisor where total physical processors of the virtualized systems may exceed the actual physical number of processors.

So, in summary:

- OSHI is faithfully reporting the amount of CPU time (ticks) consumed by each process.
- In the case of multi-threaded processes, it is possible for process CPU time consumed to exceed elapsed time.
- The "denominator" of your CPU usage calculation, if you want overall system usage, should include the 'total CPU time available' for all operating systems, multiplying the elapsed time by the number of physical CPUs. However, single-threaded programs may max out at a smaller percentage, and it may be more useful from a monitoring perspective to detect an endless loop, etc., if a process is consuming close to (or over) 100% of a single processor, in which case you don't multiply.
- To match Windows Task Manager, or 'htop' output or 'top' with "thread mode" or "Solaris mode" you can divide by the total number of cores (physical processor count) to get a better representation of "total system load" which should never exceed 100%. This is what you've done for Windows -- however, the same physical interpretation applies to any system, it's just that top doesn't do the division by default (but htop does).
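
The two conventions in the summary above differ only in the denominator. A small sketch, with hypothetical names and with the choice of processor count left to the caller:

```java
/** Two ways to express the same CPU-time delta; names are hypothetical. */
final class CpuPercentConventions {

    /** Percentage of a single CPU, as top shows by default; can exceed 100. */
    static double ofSingleCpu(long cpuDeltaMs, long elapsedMs) {
        return 100d * cpuDeltaMs / elapsedMs;
    }

    /** Percentage of total capacity, as Windows Task Manager shows; capped near 100. */
    static double ofWholeSystem(long cpuDeltaMs, long elapsedMs, int processorCount) {
        return ofSingleCpu(cpuDeltaMs, elapsedMs) / processorCount;
    }

    public static void main(String[] args) {
        // Single-threaded process at full CPU for one second on a 4-core system
        System.out.println(ofSingleCpu(1_000, 1_000));      // prints 100.0
        System.out.println(ofWholeSystem(1_000, 1_000, 4)); // prints 25.0
        // Multi-threaded process that consumed 1800 ms of CPU time in one second
        System.out.println(ofSingleCpu(1_800, 1_000));      // prints 180.0
        System.out.println(ofWholeSystem(1_800, 1_000, 4)); // prints 45.0
    }
}
```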

val235 commented 5 years ago

ok perhaps the problem here is more with how to interpret the numbers

if I don't use the 'denominator' on windows its often hard for me to interpret the results

on my machine 4 physical / 8 logical cores the numbers often spike above 800%, and that's the part that is hard for me to reconcile, on unix that number was always bound by the number of cores ie it never spiked above 400% on a 4 core system

dbwiddis commented 5 years ago

on my machine 4 physical / 8 logical cores the numbers often spike above 800%

This is very odd. Can you post the individual values you're seeing in your test case (before dividing by the elapsed time/# of processors) and run it for several iterations (30 or 60 seconds)?

Also can you confirm this is a real, physical machine and not a hypervisor-based VM?

Some thoughts on possible inaccuracies (none of which should be that order of magnitude):

- You should divide by actual elapsed time in milliseconds, not 1000. The "sleep" goes for 1000 ms but you've probably got another ~15 to ~20 ms of processing time to fetch the ticks.
- There's an issue of tick resolution. Tick counters are in 100-ns units but increment in large chunks based on the clock speed. On my current system my 100-ns counters increment by multiples of 156,250 (that's 15.625 milliseconds). Shouldn't impact a 1000-ms average too much but if you're measuring at a 30-ms resolution it could be huge.
- I think I may see a possible bug where I'm treating the WTS_PROCESS_INFO_EX Kernel and User times as 100-ns units when the documentation says they're milliseconds -- however, that error would take it in the opposite direction, and I'm pretty sure I compared my actual results here and the docs are wrong.

dbwiddis commented 5 years ago

Testing on my system, I can confirm the WTS_PROCESS_INFO_EX are 100-ns counters (docs are wrong) but both counters increment in 156,250 100-ns chunks (system dependent value, yours is probably different). After the math conversion that makes a 15ms or 16ms resolution for each of the two counters. It's therefore possible for a tiny sliver of CPU time to increment the sum by 32 ms.

This should be a max 3% error for 1000-ms polling interval though.

dbwiddis commented 5 years ago

More testing. That 16ms resolution is per tick type per process per logical processor. So for an 8-processor system each reading can increment in steps as large as 256 ms. It would be possible for a 256-ms actual usage to report between 0 and 512 ms per process. If you added up the sum of multiple processes the error could be bigger.

Larger tick intervals should mitigate/smooth this.
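
The step sizes quoted above, and why a longer interval helps, can be put into numbers. This is a rough upper-bound sketch built only from the figures in this thread (156,250 units of 100 ns, two counters, 8 logical processors); the names are hypothetical and the step size is system dependent.

```java
/** Back-of-the-envelope tick resolution arithmetic; names are hypothetical. */
final class TickResolution {

    /** Converts a counter expressed in 100-ns units to milliseconds. */
    static double hundredNanosToMillis(long units) {
        return units / 10_000d;
    }

    /** Largest single jump of a summed reading if every contributing counter steps at once. */
    static double worstCaseStepMs(double stepMs, int counters, int logicalProcessors) {
        return stepMs * counters * logicalProcessors;
    }

    /** A step expressed as a percentage of one CPU over the polling interval. */
    static double stepAsPercent(double stepMs, double pollingIntervalMs) {
        return 100d * stepMs / pollingIntervalMs;
    }

    public static void main(String[] args) {
        System.out.println(hundredNanosToMillis(156_250)); // prints 15.625
        double worst = worstCaseStepMs(16, 2, 8);          // user + kernel, 8 logical processors
        System.out.println(worst);                         // prints 256.0
        System.out.println(stepAsPercent(worst, 1_000));   // about 25.6 at a 1 s interval
        System.out.println(stepAsPercent(worst, 10_000));  // about 2.56 at a 10 s interval
    }
}
```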

val235 commented 5 years ago

Sorry, had to switch gears to something else for a while but had some time this weekend to try out some suggestions. Just providing more info here; this works well enough for my purposes.

So I decided to keep track of the actual nano-time as opposed to assuming that the thread really slept for 1000 ms.

Here is my code that samples the top 20 processes by CPU and sums them up to get the total approximate system CPU load

import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

import oshi.SystemInfo;
import oshi.software.os.OSProcess;
import oshi.software.os.OperatingSystem;
import oshi.software.os.OperatingSystem.ProcessSort;

SystemInfo si = new SystemInfo();
OperatingSystem os = si.getOperatingSystem();

// First sample: pid -> {kernel + user time (ms), timestamp (ns)}
Map<Integer, Object[]> prevCpuTime = Arrays.stream(os.getProcesses(20, ProcessSort.CPU))
        .collect(Collectors.toMap(
                p -> p.getProcessID(),
                p -> new Object[] {p.getKernelTime() + p.getUserTime(), System.nanoTime()},
                (a, b) -> a));

try {
    Thread.sleep(1000);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // keep the interrupt flag instead of swallowing it
}

// Second sample: pid -> {process snapshot, timestamp (ns)}
Map<Integer, Object[]> currentProcesses = Arrays.stream(os.getProcesses(20, ProcessSort.CPU))
        .filter(p -> !"Idle".equals(p.getName())) // just removing the Windows 'Idle' process
        .collect(Collectors.toMap(
                p -> p.getProcessID(),
                p -> new Object[] {p, System.nanoTime()},
                (a, b) -> a));

Map<OSProcess, Double> pct = currentProcesses.entrySet().stream()
        .filter(e -> prevCpuTime.containsKey(e.getKey())) // only processes present in both samples
        .collect(Collectors.toMap(
                e -> (OSProcess) e.getValue()[0],
                e -> {
                    OSProcess newProcess = (OSProcess) e.getValue()[0];

                    long newNano = (Long) e.getValue()[1];
                    long prevNano = (Long) prevCpuTime.get(e.getKey())[1];

                    long previousTime = (Long) prevCpuTime.get(e.getKey())[0];
                    long currentTime = newProcess.getKernelTime() + newProcess.getUserTime();

                    double elapsed = (newNano - prevNano) / 1000000d; // ns -> ms
                    long timeDifference = currentTime - previousTime;
                    double cpu = 100d * timeDifference / elapsed;

                    return cpu;
                },
                (a, b) -> a));

System.out.println("TOTAL : " + pct.values().stream().mapToDouble(p -> p).sum());

pct.entrySet().stream()
        .sorted(Map.Entry.comparingByValue())
        .forEach(e -> System.out.println(
                e.getKey().getName() + "[" + e.getKey().getProcessID() + "] : " + e.getValue() + "%"));

Under heavy load it still spikes higher than 800%: regularly 900% or so, but sometimes up to 1500% or more.

Heavy load here means running a 2-node Solr, Java + Tomcat, Eclipse, etc., all on one machine, so the CPU is pretty pegged.

This is fine for my needs, as I'm really only interested in a rough inspection, i.e. light load, medium load or heavy load, but I'm still curious about what I might be doing wrong to get those spikes.

FYI: in my case it's the combination of Java processes, 2 Solr + 1 Tomcat instance, that is responsible for the bulk of that spike.
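
The same per-process bookkeeping can be sketched without Object[] arrays by keeping one typed sample per PID. This is only an illustration with hypothetical names (`ProcessCpuTracker`, `update`); the caller is assumed to pass in kernel + user time in milliseconds and a `System.nanoTime()` reading taken at the same moment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

/** Remembers the last sample per PID and reports usage between samples; names are hypothetical. */
final class ProcessCpuTracker {

    private static final class Sample {
        final long cpuMs;
        final long nanos;

        Sample(long cpuMs, long nanos) {
            this.cpuMs = cpuMs;
            this.nanos = nanos;
        }
    }

    private final Map<Integer, Sample> previous = new HashMap<>();

    /**
     * Records a new sample and returns the CPU use since the previous one,
     * as a percentage of one CPU. Empty on the first sample for a PID.
     */
    OptionalDouble update(int pid, long cpuMs, long nowNanos) {
        Sample prev = previous.put(pid, new Sample(cpuMs, nowNanos));
        if (prev == null || nowNanos <= prev.nanos) {
            return OptionalDouble.empty();
        }
        double elapsedMs = (nowNanos - prev.nanos) / 1_000_000d; // ns -> ms
        return OptionalDouble.of(100d * (cpuMs - prev.cpuMs) / elapsedMs);
    }

    public static void main(String[] args) {
        ProcessCpuTracker tracker = new ProcessCpuTracker();
        System.out.println(tracker.update(123, 1_000, 0L).isPresent());             // prints false
        System.out.println(tracker.update(123, 1_500, 1_000_000_000L).getAsDouble()); // prints 50.0
    }
}
```

Because each PID keeps its own timestamp, the interval used for the division is the one actually measured for that process.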

val235 commented 5 years ago

Here is an example of total 2500% spike, you can see the 2 solr and tomcat processes, as well as a pair of some sort of windows system things

dbwiddis commented 5 years ago

These snapshot examples don't tell the full story. I've pointed out that Windows timer resolution is about 15.6 ms (1000/64) by default. For elapsed time OSHI sums up User and Kernel time, which themselves sum from ticks from each processor. It's entirely possible that these values are summed up from individual threads under the process (likely for Tomcat and Solr which maintain multiple threads.) It may be that a 1000-ms polling interval is simply too short, and if you average the ticks over a longer period of time you'll get more sane results. I suspect these large spikes are also followed by too-low reports, under a relatively constant load. Can you show me a case where more than 40,000 ms of Kernel+User time elapse in a 10 second period?

The bottom line here is:

- I believe OSHI is faithfully reporting the elapsed time tick timers (converted to milliseconds) for each process, as reported by the OS.
  - If there is a bug, this needs to be shown to be in error which requires more than a snapshot of one-second change.
- For a multithreaded process, these counters can (and do) exceed 100% for processes
  - The factor may be up to the number of logical processors since the OS can't tell that hyperthreading is occurring, and may be "reporting double" if two threads are on the same physical processor
  - Counters increment in 15 or 16 ms chunks so a longer period should average out these step increases.
  - There are separate counters for User and Kernel time which could also contribute to the "step increase" issue.
  - There may internally be separate accounting for individual program threads, so programs which maintain many threads may also contribute to this "step increase" problem.
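
The "more than 40,000 ms in a 10 second period" question above amounts to a capacity check: CPU time consumed in an interval should not exceed the interval multiplied by the number of processors. A small sketch with hypothetical names, leaving the choice of physical or logical processor count to the caller:

```java
/** Sanity check of a CPU-time delta against available capacity; names are hypothetical. */
final class CpuTimeBound {

    /** Maximum CPU time (ms) that can be consumed in the interval. */
    static long maxCpuMs(long elapsedMs, int processors) {
        return elapsedMs * processors;
    }

    /** True if the reported delta is larger than the machine could have delivered. */
    static boolean exceedsCapacity(long cpuDeltaMs, long elapsedMs, int processors) {
        return cpuDeltaMs > maxCpuMs(elapsedMs, processors);
    }

    public static void main(String[] args) {
        System.out.println(maxCpuMs(10_000, 4));               // prints 40000
        // A 2500% reading over one second on 8 logical processors
        System.out.println(exceedsCapacity(25_000, 1_000, 8)); // prints true
        // 40,000 ms over ten seconds on 4 processors is exactly at the limit
        System.out.println(exceedsCapacity(40_000, 10_000, 4)); // prints false
    }
}
```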

val235 commented 5 years ago

Tried as you suggested with a 10-second sampling interval (Thread.sleep(10000)) and the numbers definitely smoothed out. The heaviest single process hovered around 400%, with a few spikes slightly above that, but probably within the error bars, and the sum total hovered around 800% with a few spikes as high as ~880%, but again probably within the error bars given it's a sum.

So yes, the spikes are likely some kind of function of a too-short sample period combined with hyperthreading.

dbwiddis commented 5 years ago

Cool... so I think we're settled on OSHI being correct. The only question is (still) what to divide the sum of times by to get the same % as Task Manager. Initially I said it should be PhysicalProcessors because that's really all that's available to process on. But the OS doesn't know what's physical and what's logical and I think it might count elapsed time on two threads on a single hyperthreaded physical processor, depending on when it gets those CPU slices. So Logical Processors is probably the best.

ShangWangD commented 5 years ago

You'll have to calculate it from the "ticks" available much as you would count the overall CPU usage of the processor (although there are convenience methods there!) The OSProcess object provides you active time on the process in getKernelTime() and getUserTime(). You can add these together and divide by getUpTime() to get CPU average since the process started. An example of this calculation is in the OSProcess Comparator for CPU usage. If you want a "recent" CPU usage you'll have to record those times at a time interval, then grab them a second time and do the same calculation on the time difference.

I followed what you said, but the CPU percentage I got does not match the data from Resource Manager. For the same process, the usage I calculate is 10%, but the usage in Resource Manager is 45%.

ShangWangD commented 5 years ago

Cool... so I think we're settled on OSHI being correct. The only question is (still) what to divide the sum of times by to get the same % as Task Manager. Initially I said it should be PhysicalProcessors because that's really all that's available to process on. But the OS doesn't know what's physical and what's logical and I think it might count elapsed time on two threads on a single hyperthreaded physical processor, depending on when it gets those CPU slices. So Logical Processors is probably the best.

Firstly, I can't find SystemInfo

oshi / oshi

Get CPU usage per process #359