lxc / lxcfs

FUSE filesystem for LXC
https://linuxcontainers.org/lxcfs

[problem] /proc/loadavg does not take effect #578

Closed kanghuzai closed 1 year ago

kanghuzai commented 1 year ago

System info

linux: 3.10.0-957.el7.x86_64
lxcfs: 5.0.2

Kubelet config

The node has cpuManager and NUMA enabled

# cat /var/lib/kubelet/cpu_manager_state 
{"policyName":"static","defaultCpuSet":"0-1,3-13,15-23","entries":{"f22d1adb-1dec-4539-bb8f-36ee4becacf9":{"member-admin-app":"2,14"}},"checksum":2269792505}

Pod config

spec:
      volumes:
        - name: proc-cpuinfo
          hostPath:
            path: /var/lib/lxcfs/proc/cpuinfo
            type: ''
        - name: proc-diskstats
          hostPath:
            path: /var/lib/lxcfs/proc/diskstats
            type: ''
        - name: proc-meminfo
          hostPath:
            path: /var/lib/lxcfs/proc/meminfo
            type: ''
        - name: proc-stat
          hostPath:
            path: /var/lib/lxcfs/proc/stat
            type: ''
        - name: proc-swaps
          hostPath:
            path: /var/lib/lxcfs/proc/swaps
            type: ''
        - name: proc-uptime
          hostPath:
            path: /var/lib/lxcfs/proc/uptime
            type: ''
        - name: proc-slabinfo
          hostPath:
            path: /var/lib/lxcfs/proc/slabinfo
            type: ''
        - name: cpu-online
          hostPath:
            path: /var/lib/lxcfs/sys/devices/system/cpu/online
            type: ''
        - name: proc-loadavg
          hostPath:
            path: /var/lib/lxcfs/proc/loadavg
            type: ''
      containers:
        - name: member-admin-app
          resources:
            limits:
              cpu: '2'
              memory: 2Gi
            requests:
              cpu: '2'
              memory: 2Gi
          volumeMounts:
            - name: proc-cpuinfo
              mountPath: /proc/cpuinfo
            - name: proc-diskstats
              mountPath: /proc/diskstats
            - name: proc-meminfo
              mountPath: /proc/meminfo
            - name: proc-stat
              mountPath: /proc/stat
            - name: proc-swaps
              mountPath: /proc/swaps
            - name: proc-uptime
              mountPath: /proc/uptime
            - name: proc-slabinfo
              mountPath: /proc/slabinfo
            - name: cpu-online
              mountPath: /sys/devices/system/cpu/online
            - name: proc-loadavg
              mountPath: /proc/loadavg

Pod exec result

top - 15:43:10 up  6:11,  0 users,  load average: 5.89, 5.39, 6.54
Tasks:   9 total,   1 running,   7 sleeping,   0 stopped,   1 zombie
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  1.8 us,  1.8 sy,  0.0 ni, 96.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  2097152 total,  1108932 free,   843836 used,   144384 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  1253316 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                         
    1 root      20   0    4400    404    280 S   0.0  0.0   0:00.51 tini                                                                                                            
    8 root      20   0 3206976 728884  24848 S   0.0 34.8   4:46.59 java                                                                                                            
   55 root      20   0       0      0      0 Z   0.0  0.0   0:00.00 sh                                                                                                              
  147 root      20   0 2545556 154868  20096 S   0.0  7.4   0:35.28 java                                                                                                            
  225 root      20   0   15568   2232   1504 S   0.0  0.1   0:00.08 sh                                                                                                              
  243 root      20   0   15568   2180   1456 S   0.0  0.1   0:00.08 sh                                                                                                              
  265 root      20   0   15568   2228   1504 S   0.0  0.1   0:00.02 sh                                                                                                              
  273 root      20   0   15568   2036   1384 S   0.0  0.1   0:00.02 sh                                                                                                              
  280 root      20   0   59656   2136   1508 R   0.0  0.1   0:00.00 top

Node exec result

top - 15:43:11 up 18:21,  2 users,  load average: 5.89, 5.39, 6.54
Tasks: 946 total,   3 running, 925 sleeping,   0 stopped,  18 zombie
%Cpu0  : 28.4 us,  3.8 sy,  0.0 ni, 66.8 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu1  : 25.4 us,  3.7 sy,  0.0 ni, 69.2 id,  1.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu2  :  3.4 us,  2.0 sy,  0.0 ni, 93.9 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu3  : 23.9 us,  2.7 sy,  0.0 ni, 73.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  : 27.4 us,  3.3 sy,  0.0 ni, 68.6 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu5  : 35.5 us,  6.4 sy,  0.0 ni, 55.5 id,  2.3 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu6  : 42.0 us,  2.7 sy,  0.0 ni, 55.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  : 28.1 us,  6.4 sy,  0.0 ni, 64.5 id,  0.3 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu8  : 33.4 us,  4.3 sy,  0.0 ni, 61.9 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu9  : 25.2 us,  4.0 sy,  0.0 ni, 67.5 id,  3.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu10 : 45.2 us,  2.3 sy,  0.0 ni, 51.5 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu11 : 12.8 us,  4.4 sy,  0.0 ni, 80.1 id,  2.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 : 19.7 us,  2.0 sy,  0.0 ni, 77.7 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu13 : 22.3 us,  5.3 sy,  0.0 ni, 72.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu14 :  2.7 us,  1.7 sy,  0.0 ni, 95.0 id,  0.3 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu15 : 42.9 us,  4.0 sy,  0.0 ni, 52.8 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu16 : 15.0 us,  2.0 sy,  0.0 ni, 82.1 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu17 :  4.7 us,  1.3 sy,  0.0 ni, 94.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 : 24.3 us,  1.7 sy,  0.0 ni, 73.3 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
%Cpu19 : 17.7 us,  3.7 sy,  0.0 ni, 77.7 id,  1.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  6.3 us,  2.3 sy,  0.0 ni, 91.0 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu21 : 10.6 us,  2.6 sy,  0.0 ni, 86.4 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu22 :  8.3 us,  2.3 sy,  0.0 ni, 89.0 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
%Cpu23 :  5.6 us,  1.7 sy,  0.0 ni, 89.0 id,  3.7 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13181204+total, 82838352 free, 21958700 used, 27014988 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 10855417+avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                         
45572 root      20   0   11.3g   1.1g  29696 S 154.5  0.9 492:39.29 java                                                                                                            
33816 root      20   0 1245936 160272  13984 R 150.5  0.1   0:08.43 node                                                                                                            
33331 root      20   0 1214856 397824  14572 R 125.7  0.3   0:20.72 npm                                                                                                             
10019 root      20   0   14.8g 759944  23368 S  22.1  0.6 573:56.15 dockerd                                                                                                         
12862 root      20   0   13.6g   1.0g  30876 S  22.1  0.8   6:08.11 java                                                                                                            
21308 root      20   0   13.9g   1.1g  27476 S  20.8  0.9  53:17.68 java                                                                                                            
23050 root      20   0 8209584 834544  26416 S  19.1  0.6   2:51.80 java                                                                                                            
45142 root      20   0 2431868 133352  37636 S  11.2  0.1  25:56.16 kubelet

The Docker example also does not give the expected result

docker run -it -m 256m --memory-swap 256m --cpuset-cpus "0,2" \
      -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
      -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
      -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
      -v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
      -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
      -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
      -v /var/lib/lxcfs/proc/slabinfo:/proc/slabinfo:rw \
      -v /var/lib/lxcfs/proc/loadavg:/proc/loadavg:rw \
      -v /var/lib/lxcfs/sys/devices/system/cpu:/sys/devices/system/cpu:rw \
      ubuntu:18.04 /bin/bash

mihalicyn commented 1 year ago

@kanghuzai have you passed the --enable-loadavg option to lxcfs?

kanghuzai commented 1 year ago

--enable-loadavg

I added it to ExecStart in the /usr/lib/systemd/system/lxcfs.service file, but lxcfs.service then failed to start

ExecStart=/usr/bin/lxcfs --enable-loadavg /var/lib/lxcfs

systemd[1]: Stopping FUSE filesystem for LXC...
Jan 17 09:42:50 k8s-node-01 lxcfs[37160]: Running destructor lxcfs_exit
Jan 17 09:42:50 k8s-node-01 systemd[1]: lxcfs.service: main process exited, code=exited, status=1/FAILURE
Jan 17 09:42:50 k8s-node-01 fusermount[45128]: /bin/fusermount: failed to unmount /var/lib/lxcfs: Invalid argument
Jan 17 09:42:50 k8s-node-01 systemd[1]: Stopped FUSE filesystem for LXC.
Jan 17 09:42:50 k8s-node-01 systemd[1]: Unit lxcfs.service entered failed state.
Jan 17 09:42:50 k8s-node-01 systemd[1]: lxcfs.service failed.
Jan 17 09:42:50 k8s-node-01 systemd[1]: Started FUSE filesystem for LXC.
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: Running constructor lxcfs_init to reload liblxcfs
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: mount namespace: 4
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: hierarchies:
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 0: fd:   5: name=systemd
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 1: fd:   6: freezer
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 2: fd:   7: devices
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 3: fd:   8: net_cls,net_prio
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 4: fd:   9: memory
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 5: fd:  10: pids
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 6: fd:  11: cpu,cpuacct
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 7: fd:  12: blkio
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 8: fd:  13: cpuset
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 9: fd:  14: perf_event
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: 10: fd:  15: hugetlb
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: Kernel supports swap accounting
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: api_extensions:
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - cgroups
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - sys_cpu_online
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_cpuinfo
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_diskstats
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_loadavg
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_meminfo
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_stat
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_swaps
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_uptime
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - proc_slabinfo
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - shared_pidns
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - cpuview_daemon
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - loadavg_daemon
Jan 17 09:42:50 k8s-node-01 lxcfs[45130]: - pidfds

Log of successful startup of lxcfs

Jan 17 09:55:28 k8s-node-01 systemd[1]: Started FUSE filesystem for LXC.
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: Running constructor lxcfs_init to reload liblxcfs
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: mount namespace: 4
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: hierarchies:
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 0: fd:   5: name=systemd
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 1: fd:   6: freezer
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 2: fd:   7: devices
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 3: fd:   8: net_cls,net_prio
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 4: fd:   9: memory
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 5: fd:  10: pids
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 6: fd:  11: cpu,cpuacct
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 7: fd:  12: blkio
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 8: fd:  13: cpuset
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 9: fd:  14: perf_event
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: 10: fd:  15: hugetlb
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: Kernel supports swap accounting
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: api_extensions:
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - cgroups
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - sys_cpu_online
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_cpuinfo
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_diskstats
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_loadavg
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_meminfo
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_stat
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_swaps
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_uptime
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - proc_slabinfo
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - shared_pidns
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - cpuview_daemon
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - loadavg_daemon
Jan 17 09:55:28 k8s-node-01 lxcfs[44795]: - pidfds

kanghuzai commented 1 year ago

@kanghuzai have you passed --enable-loadavg option to the lxcfs?

I tested this parameter with the latest version, 5.0.3, and it still doesn't work.

# lxcfs -h
Usage: lxcfs <directory>

lxcfs is a FUSE-based proc, sys and cgroup virtualizing filesystem

Options :
  -d, --debug          Run lxcfs with debugging enabled
  -f, --foreground     Run lxcfs in the foreground
  -n, --help           Print help
  -l, --enable-loadavg Enable loadavg virtualization
  -o                   Options to pass directly through fuse
  -p, --pidfile=FILE   Path to use for storing lxcfs pid
                       Default pidfile is /run/lxcfs.pid
  -u, --disable-swap   Disable swap virtualization
  -v, --version        Print lxcfs version
  --enable-cfs         Enable CPU virtualization via CPU shares
  --enable-pidfd       Use pidfd for process tracking

mihalicyn commented 1 year ago

Jan 17 09:42:50 k8s-node-01 fusermount[45128]: /bin/fusermount: failed to unmount /var/lib/lxcfs: Invalid argument

This error happened during unmounting. I don't understand: did you manage to run lxcfs with loadavg enabled successfully or not?
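Since the fusermount message concerns the mount state of /var/lib/lxcfs rather than the loadavg feature, it can help to look at the mount table before and after restarting the service. The following is only a sketch: `mount_count` is a hypothetical helper name, not part of lxcfs, and it assumes the usual /proc/mounts layout where the second field is the mount point.

```shell
# mount_count TARGET [MOUNTS_FILE]
# Prints how many mount-table entries have TARGET as their mount point.
# MOUNTS_FILE defaults to /proc/mounts; the optional second argument only
# exists so the function can be tried against a saved copy of the table.
mount_count() {
    target=$1
    table=${2:-/proc/mounts}
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v t="$target" '$2 == t { n++ } END { print n + 0 }' "$table"
}
```

Running `mount_count /var/lib/lxcfs` on the node prints 0 when nothing is mounted there, 1 for a single lxcfs mount, and a larger number if mounts have been stacked on the same path.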

kanghuzai commented 1 year ago

Jan 17 09:42:50 k8s-node-01 fusermount[45128]: /bin/fusermount: failed to unmount /var/lib/lxcfs: Invalid argument this error happened during unmounting. I don't understand, did you managed to run lxcfs with loadavg enabled successfully or not?

5.0.2 failed to start. 5.0.3 started successfully, but loadavg virtualization still did not take effect.

mihalicyn commented 1 year ago

The error you showed above can't be specific to 5.0.2 or 5.0.3. It's just about an existing mount on /var/lib/lxcfs. Have you passed the --enable-loadavg parameter? If yes, you can add debug prints at this line https://github.com/lxc/lxcfs/blob/cd2e3ac5c5ae4fde5b58380ac2f56e55c78e41cc/src/proc_loadavg.c#L188 to check whether we are just reading the host loadavg or not.
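A quicker check that needs no rebuild is to compare the load averages seen on the node with those seen inside the container, which is what the two top outputs earlier in this issue show (both report 5.89, 5.39, 6.54). This is a minimal sketch: `same_loadavg` is a hypothetical helper, and it only compares the first three fields of two /proc/loadavg lines.

```shell
# same_loadavg LINE_A LINE_B
# Prints "same" when the 1-, 5- and 15-minute averages (fields 1-3) of two
# /proc/loadavg lines are identical, otherwise "different".
same_loadavg() {
    a=$(printf '%s\n' "$1" | awk '{ print $1, $2, $3 }')
    b=$(printf '%s\n' "$2" | awk '{ print $1, $2, $3 }')
    if [ "$a" = "$b" ]; then
        echo same
    else
        echo different
    fi
}
```

For example, `same_loadavg "$(cat /proc/loadavg)" "$(docker exec CONTAINER cat /proc/loadavg)"`, where CONTAINER is a placeholder for the container name. If this prints "same" for a mostly idle container on a busy node, the container is most likely still seeing the host values. The two reads happen at slightly different times, so "different" alone does not prove that virtualization is working.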

kanghuzai commented 1 year ago

An error that you've showed above can't be related to 5.0.2 or 5.0.3. It's just about existing mount on /var/lib/lxcfs. Have you passed --enable-loadavg parameter? If yes, you can add debug prints to this line

https://github.com/lxc/lxcfs/blob/cd2e3ac5c5ae4fde5b58380ac2f56e55c78e41cc/src/proc_loadavg.c#L188

to check if we just read host loadavg or not.

It takes effect when I change the /proc/loadavg mount to the read-only attribute

After a successful installation, manually add the '--enable-loadavg' parameter and set the '/proc/loadavg' file to read-only

systemctl stop lxcfs.service
# Use | as the sed delimiter; a backslash is not a portable delimiter character
sed -i 's|/usr/bin/lxcfs|/usr/bin/lxcfs --enable-loadavg|g' /usr/lib/systemd/system/lxcfs.service
systemctl daemon-reload
systemctl start lxcfs.service && systemctl status lxcfs.service
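
After the restart it is worth confirming that the running lxcfs process actually received the option. The sketch below assumes the flag names shown in the `lxcfs -h` output above (`-l`, `--enable-loadavg`). `has_loadavg_flag` is a hypothetical helper and does not handle combined short options such as `-fl`.

```shell
# has_loadavg_flag ARGS...
# Succeeds (exit status 0) if any argument is --enable-loadavg or its short
# form -l, and fails (exit status 1) otherwise.
has_loadavg_flag() {
    for arg in "$@"; do
        case $arg in
            --enable-loadavg|-l) return 0 ;;
        esac
    done
    return 1
}

# Possible use on the node (assumes a single lxcfs process; the unquoted
# command substitution is intentional so the arguments are split on spaces):
#   has_loadavg_flag $(tr '\0' ' ' < "/proc/$(pidof lxcfs)/cmdline") && echo enabled
```

Checking the arguments as separate words rather than grepping the whole command line avoids false matches on paths that happen to contain "-l".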