htop-dev / htop

htop - an interactive process viewer
https://htop.dev/
GNU General Public License v2.0

htop too slow on large server #1535

Open NikolaBorisov opened 2 weeks ago

NikolaBorisov commented 2 weeks ago

When I run htop on a server with around 200 cores and 2 TB of memory (H100x8 server), it is extremely slow. It shows a black screen for about 60 s before displaying anything. I tried version 3.0.3 as well as the latest master built from source. I also tried disabling the patch from issue #1484, and it didn't help.

The server has 383K threads, which is a lot ;)

Is there anything I can do to debug the slowness?
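One cheap first check is to confirm how many processes and threads the kernel exposes, since htop has to visit an entry for each of them. The following is only a sketch, assuming a Linux procfs mounted at /proc; the helper names `count_procs` and `count_threads` are made up for illustration and are not part of htop.

```shell
#!/bin/sh
# Count processes and threads by walking a procfs tree (Linux).
# The optional argument is the procfs root; it defaults to /proc.

count_procs() {
    root="${1:-/proc}"
    n=0
    # Every directory directly under the root whose name starts with a
    # digit is one process (its PID).
    for p in "$root"/[0-9]*; do
        if [ -d "$p" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

count_threads() {
    root="${1:-/proc}"
    n=0
    # Every entry under <pid>/task is one thread of that process.
    for t in "$root"/[0-9]*/task/[0-9]*; do
        if [ -d "$t" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

echo "processes: $(count_procs)"
echo "threads:   $(count_threads)"
```

A thread count far above the process count suggests that hiding userland threads will remove most of the per-refresh work.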

BenBE commented 2 weeks ago

Can you press Shift+H (Toggle userland thread display) or Shift+K (Toggle kernel thread display) and check if this helps?

Disabling the "library size" column mentioned in #1484 may also help (basically, anything that avoids reading the maps file). Disabling the check for outdated binaries should help too.

Finally, with that many processes in place, I'm not quite sure how the tree sorting performs. You can toggle this by pressing T if necessary.

If these steps don't help, it would be nice to take a look at a flame graph captured with perf, or at a profile from some other tool like callgrind.
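Capturing such a profile could look roughly like the sketch below. It assumes `perf` and `pgrep` are installed and that htop is already running in another terminal; `profile_htop` is a hypothetical helper name, and the FlameGraph scripts mentioned in the comments are a separate download that is assumed to be on PATH.

```shell
#!/bin/sh
# Hypothetical helper: sample a running htop for a fixed duration.
# Usage: profile_htop [seconds] [output-file]
profile_htop() {
    duration="${1:-30}"
    out="${2:-perf.data}"
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found in PATH" >&2
        return 1
    fi
    # pgrep -o picks the oldest matching process; -x matches the name exactly.
    pid=$(pgrep -o -x htop) || { echo "no running htop found" >&2; return 1; }
    # -g records call stacks, which a flame graph needs.
    perf record -g -p "$pid" -o "$out" -- sleep "$duration"
}

# Example (run in a second terminal while htop is open):
#   profile_htop 30 perf.data
#   perf report -i perf.data --stdio | head -n 50
# With the FlameGraph scripts on PATH (an assumption):
#   perf script -i perf.data | stackcollapse-perf.pl | flamegraph.pl > htop.svg
```

If the stacks come out truncated, recording with `--call-graph dwarf` instead of `-g` may help, at the cost of a larger perf.data file.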

NikolaBorisov commented 2 weeks ago

Shift+H helped the most. The refresh time became close to 6 sec. It is still very laggy and not nice to use, especially because you have to wait a long time for it to start. I could not find the "library size" column. I disabled the outdated binaries feature. I was not using tree sort by default, and enabling it did not change much. htop now uses 50+% of one CPU core, but it is still quite laggy. I captured a perf profile, but it looks like it has no debug symbols.

perf.data.zip

Anything else I can try? How can I start in the Shift+H mode?

BenBE commented 2 weeks ago

Without debug symbols the perf data is kinda hard to work with. Could you try to compile htop from source?

$ ./autogen.sh
$ ./configure CFLAGS="-Og -g"
$ make

You can then start ./htop from the build directory. More details on the build can be found in the readme.

But your reported success with Shift+H hints that the sheer number of threads (and processes) simply takes a long time to read. Once reading userland threads is disabled, you can just close htop with q and the choice is saved to the settings. The long delay at startup should usually be shorter then, too.
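The saved setting can also be checked or set by hand. The sketch below assumes the settings live in `${XDG_CONFIG_HOME:-$HOME/.config}/htop/htoprc` and that the relevant key is `hide_userland_threads`; `set_htoprc_option` is a hypothetical helper, not something htop ships.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in an htoprc-style settings file.
# Usage: set_htoprc_option FILE KEY VALUE
set_htoprc_option() {
    file="$1"; key="$2"; value="$3"
    if [ ! -f "$file" ]; then
        echo "$file not found; start and quit htop once to create it" >&2
        return 1
    fi
    if grep -q "^${key}=" "$file"; then
        # Replace the existing line; go through a temp file instead of
        # relying on the non-portable sed -i.
        sed "s/^${key}=.*/${key}=${value}/" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        # Key not present yet: append it.
        echo "${key}=${value}" >> "$file"
    fi
}

# Example (with htop closed, since htop may rewrite the file on exit):
#   set_htoprc_option "${XDG_CONFIG_HOME:-$HOME/.config}/htop/htoprc" \
#       hide_userland_threads 1
```

Editing the file while htop is running is likely to be lost, because htop writes its own view of the settings back when it quits.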

That initial blank screen comes from htop needing to refresh the process list twice.