google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

ImageFS usage is not reported correctly when using CRI with Devicemapper as a device driver #2883

Open vnandha opened 3 years ago

vnandha commented 3 years ago

Environment details

Kubernetes: 1.17.14
Containerd: 1.4.6
OS: RHEL 7.9

Devicemapper is set up with container-storage-setup, as described in https://github.com/containerd/containerd/blob/master/snapshots/devmapper/README.md

$ sudo lvs
  LV          VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool sys twi-aot--- <221.45g             27.71  5.39
  home        sys -wi-ao----   10.00g
  root        sys -wi-ao---- <232.38g
$ crictl imagefsinfo
{
  "status": {
    "timestamp": "1622581147654256357",
    "fsId": {
      "mountpoint": "/var/lib/containerd/io.containerd.snapshotter.v1.devmapper"
    },
    "usedBytes": {
      "value": "63163072512"
    },
    "inodesUsed": {
      "value": "0"
    }
  }
}

It appears that the kubelet/cadvisor takes the fsId mountpoint and looks up the filesystem mounted at that path to obtain the imageFS details. With devicemapper the lookup ends up resolving to the same mount as the node fs:

$ curl -s --cacert /etc/pki/tls/certs/ca.pem --cert /etc/pki/tls/certs/kube-admin.pem --key /etc/pki/tls/private/kube-admin-key.pem https://kubenode1:10250/stats/summary/ |grep '"fs":' -A 20
  "fs": {
   "time": "2021-06-01T20:51:30Z",
   "availableBytes": 227585343488,
   "capacityBytes": 249462521856,
   "usedBytes": 21877178368,
   "inodesFree": 121693064,
   "inodes": 121833472,
   "inodesUsed": 140408
  },
  "runtime": {
   "imageFs": {
    "time": "2021-06-01T20:51:27Z",
    "availableBytes": 227585343488,
    "capacityBytes": 249462521856,
    "usedBytes": 63162023936,
    "inodesFree": 121693064,
    "inodes": 121833472,
    "inodesUsed": 0
   }
  },
  "rlimit": {

Issue

As a result, the kubelet's eviction manager does not work for imagefs, and the imagefs fills up. We have to delete unused images manually with crictl rmi --prune

We run the same devicemapper setup with dockerd as the container runtime, and imageFS is reported correctly there. cadvisor seems to have code that detects devicemapper volume usage for dockerd. With CRI, that code path is not used.

We tested this with CRI-O as well and saw the same issue. For both the CRI-O and containerd runtimes, the reported imagefs info is the same as nodefs.

vnandha commented 3 years ago

If the environment is CRI, the imagefsinfo fsId.mountpoint is used. cadvisor currently detects the docker devicemapper info and adds it to its table of detected partitions; maybe something similar is needed for cri + devicemapper? (One hint is that fsId.mountpoint contains devmapper: /var/lib/containerd/io.containerd.snapshotter.v1.devmapper)

I just tested on a node with cri+overlayfs, and both nodefs and imagefs are reported without any issue. (I used a dedicated LVM partition and mounted it under /var/lib/containerd before starting containerd.) The kubelet summary API shows the expected values:

/var/lib/containerd $ lvs
  LV        VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home      sys -wi-ao----   10.00g
  overlayfs sys -wi-ao---- <222.38g
  root      sys -wi-ao---- <232.38g

/var/lib/containerd $ df -h .
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/sys-overlayfs  223G  1.7G  221G   1% /var/lib/containerd

/var/lib/containerd $ crictl imagefsinfo
{
  "status": {
    "timestamp": "1622594437236082468",
    "fsId": {
      "mountpoint": "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
    },
    "usedBytes": {
      "value": "1738137600"
    },
    "inodesUsed": {
      "value": "24849"
    }
  }
}

$ sudo curl -s --cacert /etc/pki/tls/certs/ca.pem --cert /etc/pki/tls/certs/kube-admin.pem --key /etc/pki/tls/private/kube-admin-key.pem https://kubenode20:10250/stats/summary/ |grep '"fs":' -A 20
  "fs": {
   "time": "2021-06-02T00:38:17Z",
   "availableBytes": 228123242496,
   "capacityBytes": 249462521856,
   "usedBytes": 21339279360,
   "inodesFree": 121704781,
   "inodes": 121833472,
   "inodesUsed": 128691
  },
  "runtime": {
   "imageFs": {
    "time": "2021-06-02T00:38:17Z",
    "availableBytes": 236875235328,
    "capacityBytes": 238660943872,
    "usedBytes": 1738137600,
    "inodesFree": 116565592,
    "inodes": 116590592,
    "inodesUsed": 24849
   }
  },
  "rlimit": {

With cri+devicemapper, there is no separate mount point like the one above, because the block device itself is used for image storage.
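To make the difference between the two nodes concrete, here is a minimal sketch of a mountpoint lookup by longest matching prefix. This is an assumption about the general shape of such a lookup, not cadvisor's actual code; `resolveMount` and the two mount lists are made up for illustration (only "/" and /var/lib/containerd are taken from the output above).

```go
package main

import (
	"fmt"
	"strings"
)

// resolveMount returns the longest mount point in mounts that contains path.
// Hypothetical helper: a simplified stand-in for a mount-table lookup.
func resolveMount(path string, mounts []string) string {
	best := ""
	for _, m := range mounts {
		contains := m == "/" || path == m || strings.HasPrefix(path, m+"/")
		if contains && len(m) > len(best) {
			best = m
		}
	}
	return best
}

func main() {
	// Assumed mount tables: the overlayfs node has a dedicated mount under
	// /var/lib/containerd, the devicemapper node does not.
	overlayNode := []string{"/", "/var/lib/containerd"}
	devmapperNode := []string{"/"}

	overlayPath := "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
	devmapperPath := "/var/lib/containerd/io.containerd.snapshotter.v1.devmapper"

	fmt.Println(resolveMount(overlayPath, overlayNode))     // /var/lib/containerd
	fmt.Println(resolveMount(devmapperPath, devmapperNode)) // /
}
```

With a lookup like this, the devmapper path can only ever resolve to the root filesystem, which would explain why imageFs capacity and available bytes match nodefs.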

vnandha commented 3 years ago

We added DEBUG statements to find out how this works in the dockerd+devicemapper environment. It turns out that cadvisor labels the image fs with docker-images:

I0208 22:25:24.245691    6403 cadvisor_stats_provider.go:230] [DEBUG] inside ImageFsStats for cadvisor
I0208 22:25:24.245694    6403 cadvisor_linux.go:161] [DEBUG] label from ImageFsInfo "docker-images"
I0208 22:25:24.245696    6403 manager.go:732] [DEBUG] LABEL from GetFsInfo docker-images
I0208 22:25:24.245698    6403 manager.go:733] [DEBUG] LABEL LENGTH from GetFsInfo %!s(int=13)
I0208 22:25:24.245702    6403 manager.go:737] [DEBUG] DEV for LABEL from GetFsInfo "sys-docker--pool"
I0208 22:25:24.245707    6403 manager.go:749] [DEBUG] mountpoint for fs.Device is ""
I0208 22:25:24.245709    6403 manager.go:754] [DEBUG] labels for fs.Device is []string{"docker-images"}
I0208 22:25:24.245718    6403 manager.go:774] [DEBUG] fsInfo is []v2.FsInfo{v2.FsInfo{Timestamp:time.Time{wall:0xc0008c750c778087, ext:185285290911, loc:(*time.Location)(0x7082040)}, Device:"sys-docker--pool", Mountpoint:"", Capacity:0xc68e000000, Available:0xba1f200000, Usage:0xc6ee00000, Labels:[]string{"docker-images"}, Inodes:(*uint64)(nil), InodesFree:(*uint64)(nil)}}
I0208 22:25:24.245724    6403 cadvisor_linux.go:174] [DEBUG] res from GetFsInfo is []v2.FsInfo{v2.FsInfo{Timestamp:time.Time{wall:0xc0008c750c778087, ext:185285290911, loc:(*time.Location)(0x7082040)}, Device:"sys-docker--pool", Mountpoint:"", Capacity:0xc68e000000, Available:0xba1f200000, Usage:0xc6ee00000, Labels:[]string{"docker-images"}, Inodes:(*uint64)(nil), InodesFree:(*uint64)(nil)}}
I0208 22:25:24.245729    6403 cadvisor_stats_provider.go:232] [DEBUG] imageFsInfo v2.FsInfo{Timestamp:time.Time{wall:0xc0008c750c778087, ext:185285290911, loc:(*time.Location)(0x7082040)}, Device:"sys-docker--pool", Mountpoint:"", Capacity:0xc68e000000, Available:0xba1f200000, Usage:0xc6ee00000, Labels:[]string{"docker-images"}, Inodes:(*uint64)(nil), InodesFree:(*uint64)(nil)}

This code path (lookup based on label) is not used with cri+devicemapper.
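The difference between the two lookups can be sketched as follows. The `fsEntry` type and both helpers are hypothetical and much simpler than cadvisor's real data structures. The thin-pool entry mirrors the debug log above (device sys-docker--pool, empty mountpoint, label docker-images); the root entry's device name is an assumption.

```go
package main

import "fmt"

// fsEntry is a hypothetical, cut-down filesystem record.
type fsEntry struct {
	device     string
	mountpoint string
	labels     []string
}

// findByLabel returns the first entry carrying the given label.
func findByLabel(entries []fsEntry, label string) (fsEntry, bool) {
	for _, e := range entries {
		for _, l := range e.labels {
			if l == label {
				return e, true
			}
		}
	}
	return fsEntry{}, false
}

// findByMountpoint returns the entry mounted exactly at mountpoint.
// Entries without a mountpoint (such as a thin pool) can never match.
func findByMountpoint(entries []fsEntry, mountpoint string) (fsEntry, bool) {
	for _, e := range entries {
		if e.mountpoint != "" && e.mountpoint == mountpoint {
			return e, true
		}
	}
	return fsEntry{}, false
}

func main() {
	entries := []fsEntry{
		{device: "sys-docker--pool", mountpoint: "", labels: []string{"docker-images"}},
		{device: "sys-root", mountpoint: "/"}, // assumed device name for the root fs
	}

	pool, ok := findByLabel(entries, "docker-images")
	fmt.Println(pool.device, ok) // sys-docker--pool true

	root, ok := findByMountpoint(entries, "/")
	fmt.Println(root.device, ok) // sys-root true
}
```

In this sketch the thin pool is reachable only through its label, so a lookup that starts from a mountpoint has no way to land on it.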