htop-dev / htop

htop - an interactive process viewer
https://htop.dev/
GNU General Public License v2.0

Consider adding option to hide irrelevant CPU and RAM stats inside container #1538

Open jjyyxx opened 2 months ago

jjyyxx commented 2 months ago

Inside containers hosted on a server with many cores and a lot of RAM, where the actual core and RAM limits are constrained via cgroup settings (cpuset and memory), htop's CPU and RAM meters are quite unhelpful: they do not reflect the container's actual situation, and they can even be annoying when the terminal is small, leaving no space for the process list. In my case, the server has 2x Intel(R) Xeon(R) Platinum 8480+ (224 threads) and 2TiB memory, but the container is limited to 16 threads and 32GiB memory.

These two constraints can be queried via the cgroup filesystem.

Ideally, htop could provide a container-aware option so that only the utilization of the cores within the cpuset, and the actual memory limit (and usage), are displayed.

fasterit commented 2 months ago

What containerization solution are you using?

jjyyxx commented 2 months ago

The containers are allocated through the web UI of a proprietary cluster management system, but the underlying containers are most likely managed by Docker with the NVIDIA Container Toolkit. Cgroup v1 is used, and I can check /sys/fs/cgroup/cpuset/cpuset.cpus (showing 208-223) and /sys/fs/cgroup/memory/memory.limit_in_bytes (showing 34359738368).
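For illustration, here is a minimal Python sketch of how those two values could be interpreted. The helper names `parse_cpu_list` and `bytes_to_gib` are hypothetical and are not part of htop; the input strings are the values quoted above.

```python
def parse_cpu_list(text: str) -> list[int]:
    """Expand a kernel CPU list such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


# Values as reported from cpuset.cpus and memory.limit_in_bytes above.
cpus = parse_cpu_list("208-223\n")
limit_gib = bytes_to_gib(int("34359738368"))
print(len(cpus), "CPUs,", limit_gib, "GiB")
```

With these inputs the cpuset expands to the 16 CPUs numbered 208 through 223, and the limit works out to 32 GiB, matching the container limits described in the first comment.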

fasterit commented 2 months ago

The cgroup stuff still seems to be quite messy (cf. https://github.com/kubernetes/kubernetes/issues/119669). What does cat /proc/self/cgroup say in your case?

jjyyxx commented 2 months ago

It shows

12:devices:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
11:memory:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
10:rdma:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
9:pids:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
8:hugetlb:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
7:perf_event:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
6:cpuset:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
5:blkio:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
4:freezer:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
3:net_cls,net_prio:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
2:cpu,cpuacct:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
1:name=systemd:/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
0::/system.slice/docker-c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope
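Each line of this output has the form hierarchy-ID:controller-list:cgroup-path. As a rough sketch (the function name `parse_proc_cgroup` is hypothetical, not an htop API), the output could be turned into a controller-to-path mapping like this:

```python
def parse_proc_cgroup(text: str) -> dict[str, str]:
    """Map each controller name in /proc/<pid>/cgroup output to its cgroup path.

    The cgroup v2 entry ("0::<path>") has an empty controller list and is
    stored under the empty-string key.
    """
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path itself may contain ':'.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            mapping[controller] = path
    return mapping


# Three of the lines quoted above, used as sample input.
SCOPE = ("/system.slice/docker-"
         "c18d38b354eab53da0111aa259a7b247b29f261eb6cfef946f7653ba18453271.scope")
SAMPLE = f"6:cpuset:{SCOPE}\n3:net_cls,net_prio:{SCOPE}\n0::{SCOPE}\n"

for controller, path in parse_proc_cgroup(SAMPLE).items():
    print(repr(controller), "->", path)
```

Co-mounted controllers such as net_cls,net_prio are split so that each name can be looked up on its own.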

Or, if universal support is difficult at the moment, is it practical to add two customizable meters:

  1. Show a subset of CPU cores as configured, or better, read from a file;
  2. Show memory usage and limit, read periodically from one or two user-configured files?
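To make the second idea concrete, here is a minimal Python sketch of what such a file-backed memory meter might compute on each refresh. It is not how htop implements meters; the names `read_int_file` and `memory_meter` are hypothetical, and the sketch assumes each configured file holds a single integer byte count, as memory.limit_in_bytes does under cgroup v1.

```python
from pathlib import Path


def read_int_file(path: str) -> int:
    """Read a file that contains a single integer (assumed to be a byte count)."""
    return int(Path(path).read_text().strip())


def memory_meter(usage_path: str, limit_path: str) -> tuple[int, int, float]:
    """Return (usage, limit, percent used) from two user-configured files."""
    usage = read_int_file(usage_path)
    limit = read_int_file(limit_path)
    # Guard against a zero or negative limit to avoid dividing by zero.
    percent = 100.0 * usage / limit if limit > 0 else 0.0
    return usage, limit, percent


# Example call for a cgroup v1 layout like the one in this thread
# (paths are user configuration, not something the sketch discovers):
# memory_meter("/sys/fs/cgroup/memory/memory.usage_in_bytes",
#              "/sys/fs/cgroup/memory/memory.limit_in_bytes")
```

A meter built this way would call `memory_meter` once per update interval and draw the bar from the returned percentage. The first idea could work the same way, with the CPU subset read from a file such as cpuset.cpus.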