networkservicemesh / cmd-nsmgr

Network Service Manager
Apache License 2.0

What should be the real memory consumed by nsmgr process? #679

Open ljkiraly opened 4 months ago

ljkiraly commented 4 months ago

This is a side question that came up while investigating issue #675. It concerns the 'real' memory consumption of the nsmgr container and how it relates to the memory usage of its single process.
When I checked the memory usage of the nsmgr process, different tools showed different values.

I did not rely on the 'ps' command output alone, but also checked 'smaps' in the /proc directory (the same data the pmap command uses). Another source was Go's profiling tool (pprof), which shows much lower memory usage for the nsmgr process.

> ps -o pid -o rss -o size -p 2753077
   PID   RSS  SIZE
2753077 28672 58940

pmap and smaps showed 29524 kilobytes.

The pprof tool gives the following output:

> go tool pprof -top memprofile-15:20:53-4287106497 | head -4
File: nsmgr
Type: inuse_space
Time: May 7, 2024 at 5:20pm (CEST)
Showing nodes accounting for 2581.36kB, 100% of 2581.36kB total
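
One likely reason the pprof figure is so much smaller: an inuse_space heap profile estimates only sampled live heap allocations, whereas RSS also covers goroutine stacks, runtime metadata, idle heap spans not yet returned to the OS, and the mapped pages of the binary itself. The following is a minimal sketch, not code from nsmgr, that prints the Go runtime's own accounting next to VmRSS from /proc/self/status. It assumes Linux; the helper name `parseVmRSS` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value (in kB) from the text of
// /proc/<pid>/status. It returns false if no usable VmRSS line is found.
func parseVmRSS(status string) (uint64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected line shape: "VmRSS:     28672 kB"
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			kb, err := strconv.ParseUint(fields[1], 10, 64)
			if err != nil {
				return 0, false
			}
			return kb, true
		}
	}
	return 0, false
}

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// HeapAlloc is the figure roughly comparable to a heap profile's inuse_space.
	fmt.Printf("HeapAlloc:    %d kB\n", m.HeapAlloc/1024)
	// HeapIdle minus HeapReleased is idle heap the runtime still holds.
	fmt.Printf("HeapIdle:     %d kB\n", m.HeapIdle/1024)
	fmt.Printf("HeapReleased: %d kB\n", m.HeapReleased/1024)
	fmt.Printf("StackInuse:   %d kB\n", m.StackInuse/1024)
	// Sys is the total the runtime has obtained from the OS.
	fmt.Printf("Sys:          %d kB\n", m.Sys/1024)

	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("could not read /proc/self/status:", err)
		return
	}
	if kb, ok := parseVmRSS(string(data)); ok {
		fmt.Printf("VmRSS:        %d kB\n", kb)
	}
}
```

Running something like this inside the process being investigated would show how much of the gap between pprof and RSS is explained by runtime-held memory rather than live heap objects.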

The reported values differ: pmap (29524 kilobytes), 'ps' (28904 kilobytes) and pprof (2581.36kB). The only process running in this container is the nsmgr process. The gRPC probes were removed from this deployment. kubectl top reported 16 Mi, which is almost the same as what systemctl and the memory.current file show:

> systemctl status 2753077
● cri-containerd-fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.scope - libcontainer container fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66
     Loaded: loaded (/run/systemd/transient/cri-containerd-fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.scope; transient)
  Transient: yes
    Drop-In: /run/systemd/transient/cri-containerd-fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.scope.d
             └─50-DevicePolicy.conf, 50-DeviceAllow.conf, 50-MemoryMax.conf, 50-CPUWeight.conf, 50-CPUQuotaPeriodSec.conf, 50-CPUQuota.conf
     Active: active (running) since Mon 2024-05-06 16:04:23 UTC; 23h ago
         IO: 0B read, 2.5M written
      Tasks: 21 (limit: 11475)
     Memory: 15.4M (max: 200.0M available: 184.2M)
        CPU: 9min 14.275s
     CGroup: /kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod7dec5ba0_a449_4165_b315_cc5181a4f45b.slice/cri-containerd-fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.scope
             └─2753077 /bin/nsmgr

May 06 16:04:23 kind-worker4 systemd[1]: Started libcontainer container fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.

> cat /sys/fs/cgroup/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod7dec5ba0_a449_4165_b315_cc5181a4f45b.slice/cri-containerd-fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.scope/memory.current
16228352
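
To make the comparison with the other tools explicit, the byte count from memory.current can be converted to MiB. A small sketch (the helper name `bytesToMiB` is hypothetical):

```go
package main

import "fmt"

// bytesToMiB converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
func bytesToMiB(b uint64) float64 {
	return float64(b) / (1024 * 1024)
}

func main() {
	// 16228352 is the memory.current value shown above.
	fmt.Printf("%.2f MiB\n", bytesToMiB(16228352)) // prints "15.48 MiB"
}
```

About 15.48 MiB is consistent with the 15.4M shown by systemctl and the 16 Mi reported by kubectl top, so those three readings appear to come from the same cgroup counter.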

> cat /sys/fs/cgroup/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod7dec5ba0_a449_4165_b315_cc5181a4f45b.slice/cri-containerd-fec3f5061630291f4345bfffa05cd5dc08e5542b98713a45d1d6de2afedfda66.scope/cgroup.procs
2753077

Later, the RSS values shown by the ps and pmap commands were much lower than the memory consumption reported by the cgroup's memory.current and 'kubectl top'. What could cause the difference? What else, besides the nsmgr process, can count toward the container's memory usage (CRI, kubelet)?
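
One way to narrow this down is the memory.stat file in the same cgroup v2 directory as memory.current. It breaks the total into categories such as anonymous memory, page cache, and kernel-side allocations, none of which map one-to-one onto a process's RSS. A minimal sketch of reading it, assuming cgroup v2; the helper name `parseMemoryStat` and the selection of keys printed are illustrative choices, not something nsmgr provides:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMemoryStat parses the "key value" lines of a cgroup v2 memory.stat
// file into a map. Lines that do not match that shape are skipped.
func parseMemoryStat(content string) map[string]uint64 {
	stats := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue // not a "key value" line
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue // value is not an unsigned integer
		}
		stats[fields[0]] = v
	}
	return stats
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: memstat <path to memory.stat>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	stats := parseMemoryStat(string(data))
	// A few categories that are charged to the cgroup; keys absent on a
	// given kernel are printed as 0.
	for _, key := range []string{"anon", "file", "shmem", "kernel_stack", "pagetables", "sock", "slab"} {
		fmt.Printf("%-14s %d bytes\n", key, stats[key])
	}
}
```

Pointing it at the memory.stat file next to the memory.current path above would show whether page cache ("file") or kernel memory accounts for the part of the cgroup total that RSS does not explain.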

denis-tingaikin commented 4 months ago

I would not be surprised if this is related to https://github.com/edwarnicke/grpcfd/issues/25