hishamhm / htop

htop is an interactive text-mode process viewer for Unix systems. It aims to be a better 'top'.
GNU General Public License v2.0
5.84k stars 581 forks source link

CPU% vs threads #642

Open OlafvdSpek opened 7 years ago

OlafvdSpek commented 7 years ago

The CPU% for the main thread is 217%. So far as good, but the CPU% for the other threads doesn't seem to add up to 217%, it's not even 117%. How is this calculated? And how would I isolate the CPU% of the main thread?

I've seen confusion over thread CPU% before, maybe it's a good one for the FAQ?

 1650 mysql      20   0 6998M  332M 18140 S 217.  0.5 21:19.63 ├─ /usr/sbin/mysqld
20212 mysql      20   0 6998M  332M 18140 R  2.6  0.5  0:00.05 │  ├─ /usr/sbin/mysqld 
20204 mysql      20   0 6998M  332M 18140 S  2.6  0.5  0:00.04 │  ├─ /usr/sbin/mysqld 
20194 mysql      20   0 6998M  332M 18140 S  3.3  0.5  0:00.05 │  ├─ /usr/sbin/mysqld 
20193 mysql      20   0 6998M  332M 18140 S  3.3  0.5  0:00.06 │  ├─ /usr/sbin/mysqld 
20162 mysql      20   0 6998M  332M 18140 S  2.6  0.5  0:00.06 │  ├─ /usr/sbin/mysqld 
20147 mysql      20   0 6998M  332M 18140 S  2.6  0.5  0:00.06 │  ├─ /usr/sbin/mysqld 
20115 mysql      20   0 6998M  332M 18140 S  2.6  0.5  0:00.07 │  ├─ /usr/sbin/mysqld 
 2050 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:03.87 │  ├─ /usr/sbin/mysqld 
 2045 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.49 │  ├─ /usr/sbin/mysqld 
 2021 mysql      20   0 6998M  332M 18140 S  0.0  0.5  1:29.37 │  ├─ /usr/sbin/mysqld 
 1983 mysql      20   0 6998M  332M 18140 S  1.3  0.5  0:39.00 │  ├─ /usr/sbin/mysqld 
 1937 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  ├─ /usr/sbin/mysqld 
 1936 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  ├─ /usr/sbin/mysqld 
 1930 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  ├─ /usr/sbin/mysqld 
 1929 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.03 │  ├─ /usr/sbin/mysqld 
 1928 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.01 │  ├─ /usr/sbin/mysqld 
 1927 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.08 │  ├─ /usr/sbin/mysqld 
 1919 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  ├─ /usr/sbin/mysqld 
 1918 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  ├─ /usr/sbin/mysqld 
 1917 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  ├─ /usr/sbin/mysqld 
 1916 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.03 │  ├─ /usr/sbin/mysqld 
 1915 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.30 │  ├─ /usr/sbin/mysqld 
 1914 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.03 │  ├─ /usr/sbin/mysqld 
 1913 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.56 │  ├─ /usr/sbin/mysqld 
 1912 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.55 │  ├─ /usr/sbin/mysqld 
 1835 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.23 │  ├─ /usr/sbin/mysqld 
 1834 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.23 │  ├─ /usr/sbin/mysqld 
 1833 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.27 │  ├─ /usr/sbin/mysqld 
 1832 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.25 │  ├─ /usr/sbin/mysqld 
 1831 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.26 │  ├─ /usr/sbin/mysqld 
 1830 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.29 │  ├─ /usr/sbin/mysqld 
 1829 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.26 │  ├─ /usr/sbin/mysqld 
 1828 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.30 │  ├─ /usr/sbin/mysqld 
 1827 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.29 │  ├─ /usr/sbin/mysqld 
 1826 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.29 │  ├─ /usr/sbin/mysqld 
 1825 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.26 │  ├─ /usr/sbin/mysqld 
 1805 mysql      20   0 6998M  332M 18140 S  0.0  0.5  0:00.00 │  └─ /usr/sbin/mysqld 
Jesin commented 7 years ago

Are you sure that's an exhaustive list of all threads in a single process? (It may help to press t for tree view.)

OlafvdSpek commented 7 years ago

│ ├─ is the tree ;)

OlafvdSpek commented 7 years ago
 1650 mysql      20   0 10.5G  429M 19088 S 506.  0.7  4h08:47 ├─ /usr/sbin/mysqld
17724 mysql      20   0 10.5G  429M 19088 S  1.2  0.7 17:25.20 │  ├─ /usr/sbin/mysqld 
10525 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  2:42.63 │  ├─ /usr/sbin/mysqld 
 2045 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:20.79 │  ├─ /usr/sbin/mysqld 
 2021 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  1h13:52 │  ├─ /usr/sbin/mysqld 
 1937 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.00 │  ├─ /usr/sbin/mysqld 
 1936 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.00 │  ├─ /usr/sbin/mysqld 
 1930 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.00 │  ├─ /usr/sbin/mysqld 
 1929 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:02.22 │  ├─ /usr/sbin/mysqld 
 1928 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:01.03 │  ├─ /usr/sbin/mysqld 
 1927 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.08 │  ├─ /usr/sbin/mysqld 
 1919 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.22 │  ├─ /usr/sbin/mysqld 
 1918 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.24 │  ├─ /usr/sbin/mysqld 
 1917 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.38 │  ├─ /usr/sbin/mysqld 
 1916 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:01.93 │  ├─ /usr/sbin/mysqld 
 1915 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:17.21 │  ├─ /usr/sbin/mysqld 
 1914 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:01.87 │  ├─ /usr/sbin/mysqld 
 1913 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:17.64 │  ├─ /usr/sbin/mysqld 
 1912 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:16.13 │  ├─ /usr/sbin/mysqld 
 1835 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:14.35 │  ├─ /usr/sbin/mysqld 
 1834 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:13.17 │  ├─ /usr/sbin/mysqld 
 1833 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:14.18 │  ├─ /usr/sbin/mysqld 
 1832 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:13.28 │  ├─ /usr/sbin/mysqld 
 1831 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:13.19 │  ├─ /usr/sbin/mysqld 
 1830 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:13.87 │  ├─ /usr/sbin/mysqld 
 1829 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:13.64 │  ├─ /usr/sbin/mysqld 
 1828 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:14.73 │  ├─ /usr/sbin/mysqld 
 1827 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:15.33 │  ├─ /usr/sbin/mysqld 
 1826 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:17.21 │  ├─ /usr/sbin/mysqld 
 1825 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:14.85 │  ├─ /usr/sbin/mysqld 
 1805 mysql      20   0 10.5G  429M 19088 S  0.0  0.7  0:00.00 │  └─ /usr/sbin/mysqld 
 1634 nobody     20   0 1443M 21800  5816 S  0.0  0.0  1:38.48 ├─ /opt/digitalocean/bin/do-agent -log_syslog
hishamhm commented 7 years ago

@OlafvdSpek the root of the tree is the main thread, and that can consume CPU too.

See this example program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <pthread.h>

void* child(void* data) {
   int x;
   for(;;) {
      for (int i = 0; i < 1000000; i++) {
         x += rand();
      }
      usleep(10000);
   }
}

int main(int argc, char** argv) {
   srand(time(NULL));
   pthread_t ts[10];
   for (int i = 0; i < 10; i++) {
      pthread_create(&ts[i], NULL, child, NULL);
   }
   if (argc > 1 && strcmp(argv[1], "parentbusy") == 0) {
      child(NULL);
   }
   pthread_join(ts[0], NULL);
   return 0;
}

Compile with gcc -o parentchildren parentchildren.c -lpthread

When I run it with ./parentchildren, I get the main process at ~400%, and each of the 10 child threads at ~40% (400/10).

When I run it with ./parentchildren parentbusy, then the parent is busy too, so I get the main process listed at ~400%, but each of the 10 child threads at ~36% (400/11).

OlafvdSpek commented 7 years ago

the root of the tree is the main thread, and that can consume CPU too.

Sure, but each thread can't consume more than 100%.. so 506% can't be CPU usage of a single thread.

I get the main process at ~400%

You've got a quad core.. or dual core + HT. So the process uses 400%, but that's not the same as the main thread.

So:

  1. There's no way to see the CPU usage of the main thread, as only process usage is displayed there.
  2. In my example above the numbers don't add up to the total.
hishamhm commented 7 years ago

There's no way to see the CPU usage of the main thread, as only process usage is displayed there.

True. I'm afraid /proc doesn't give me this data, so if I were to account the CPU use for the main thread I'd have to do (process - SUM(child_threads))... but that wouldn't be fully correct, which leads us to the real issue, as you correctly point:

In my example above the numbers don't add up to the total.

One possibility is that mysqld is creating many short-lived threads which use CPU but not long enough to display in one screen refresh. I notice from your screen captures is that main process 1650 remains the same, but while many of the child threads are the same (1805-2050), there's always a few extra threads with high PID numbers and very short TIME counts.

My guess is that the mysqld main thread is respawning threads like crazy (threads are spawning and dying before htop has a chance to see them). The kernel still accounts them for the main process.

OlafvdSpek commented 7 years ago

Good point on the thread spawning!

Have you contacted /proc maintainers and requested the proper data to be provided?

Jesin commented 7 years ago

Sure, but each thread can't consume more than 100%.

On my system, I often see CPU-intensive single-threaded programs running at 128% CPU usage. It may be because of the "Intel Turbo Boost" feature of my i5-7200U, but whatever the reason, your statement that each thread is capped at 100% seems to be false..

hishamhm commented 7 years ago

Have you contacted /proc maintainers and requested the proper data to be provided?

The data in /proc is not wrong. The thing is that htop monitors /proc periodically (every second, by default). No matter how fast I check /proc, there will always be a time interval that is too fast for htop to catch. And if I check too fast, htop itself will start consuming a lot of resources of the machine.

OlafvdSpek commented 7 years ago

The data could be kept around X secs after the thread is gone, but yeah.. probably not.

What about being able to differentiate between total process CPU% and main thread CPU%?

OlafvdSpek commented 7 years ago

When I run it with ./parentchildren, I get the main process at ~400%,

I get SIGSEGV, probably because you access argv[1].. ;) Not a problem though.

hishamhm commented 7 years ago

What about being able to differentiate between total process CPU% and main thread CPU%?

That could be done, but it would take some tweaking to the codebase. I'd probably have to display it like this in order to keep both process and thread data available: (ascii mockup ahead)

1234   |--- parent
1234   |    |--- (main thread)
1235   |    |--- child1
1236   |    |--- child2

I wonder if that would be confusing.

I get SIGSEGV, probably because you access argv[1].. ;)

Ah, I tweaked adding the "parentchild" option after I tested it without that feature. Fixed for posterity :)

OlafvdSpek commented 7 years ago

You could also use two CPU% columns, first one for process CPU% (empty for non-main-threads) and second one for thread CPU%. It's a trade off between horizontal and vertical space.