Open vnandha opened 3 years ago
In a CRI environment, the kubelet uses the fsId.mountpoint returned by imagefsinfo. cadvisor currently detects Docker devicemapper info and adds it to its detected partition table; something similar may be needed for CRI + devicemapper. (One hint: in that setup fsId.mountpoint contains the devicemapper path /var/lib/containerd/io.containerd.snapshotter.v1.devmapper.)
I just tested on a node with CRI + overlayfs, and both nodefs and imagefs are reported without any issue. (I used a dedicated partition/LVM volume, mounted under /var/lib/containerd before starting containerd.) The kubelet summary API shows the expected values:
/var/lib/containerd $ lvs
LV        VG  Attr       LSize    Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home      sys -wi-ao----   10.00g
overlayfs sys -wi-ao---- <222.38g
root      sys -wi-ao---- <232.38g
/var/lib/containerd $ df -h .
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/sys-overlayfs  223G  1.7G  221G   1% /var/lib/containerd
/var/lib/containerd $ crictl imagefsinfo
{
  "status": {
    "timestamp": "1622594437236082468",
    "fsId": {
      "mountpoint": "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
    },
    "usedBytes": {
      "value": "1738137600"
    },
    "inodesUsed": {
      "value": "24849"
    }
  }
}
$ sudo curl -s --cacert /etc/pki/tls/certs/ca.pem --cert /etc/pki/tls/certs/kube-admin.pem --key /etc/pki/tls/private/kube-admin-key.pem https://kubenode20:10250/stats/summary/ |grep '"fs":' -A 20
"fs": {
  "time": "2021-06-02T00:38:17Z",
  "availableBytes": 228123242496,
  "capacityBytes": 249462521856,
  "usedBytes": 21339279360,
  "inodesFree": 121704781,
  "inodes": 121833472,
  "inodesUsed": 128691
},
"runtime": {
  "imageFs": {
    "time": "2021-06-02T00:38:17Z",
    "availableBytes": 236875235328,
    "capacityBytes": 238660943872,
    "usedBytes": 1738137600,
    "inodesFree": 116565592,
    "inodes": 116590592,
    "inodesUsed": 24849
  }
},
"rlimit": {
With CRI + devicemapper, there is no separate mount point like the one above, because the block device itself is used for image storage.
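To illustrate why the missing mount point matters, here is a minimal sketch of a path-to-mount lookup. It is a simplified model, not the kubelet or cadvisor implementation (the real lookup may work differently, for example from device numbers rather than path prefixes). The function name resolve_mount and both mount tables are hypothetical; only the /var/lib/containerd entry is taken from the df output above, and the "/" and "/home" entries are assumed from the LV names.

```python
import os


def resolve_mount(path, mounts):
    """Return (mountpoint, device) for the longest mount point containing path."""
    path = os.path.normpath(path)
    best = None
    for mountpoint, device in mounts.items():
        mp = os.path.normpath(mountpoint)
        prefix = mp.rstrip("/") + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best[0]):
                best = (mp, device)
    return best


# cri+overlayfs node: /var/lib/containerd is a dedicated LV (from df above).
# The "/" and "/home" entries are assumptions based on the LV names.
overlay_node = {
    "/": "/dev/mapper/sys-root",
    "/home": "/dev/mapper/sys-home",
    "/var/lib/containerd": "/dev/mapper/sys-overlayfs",
}

# Hypothetical cri+devicemapper node: nothing is mounted under
# /var/lib/containerd because the thin pool is a raw block device.
devmapper_node = {
    "/": "/dev/mapper/sys-root",
    "/home": "/dev/mapper/sys-home",
}

overlay_result = resolve_mount(
    "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs", overlay_node
)
devmapper_result = resolve_mount(
    "/var/lib/containerd/io.containerd.snapshotter.v1.devmapper", devmapper_node
)

print(overlay_result)
print(devmapper_result)
```

Under these assumptions the overlayfs path resolves to the dedicated volume, while the devmapper path falls through to the root filesystem, which would match the symptom of imagefs being reported as nodefs.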
We added DEBUG statements to find out how this works in the dockerd + devicemapper environment. It turns out that cadvisor labels the image filesystem docker-images:
I0208 22:25:24.245691 6403 cadvisor_stats_provider.go:230] [DEBUG] inside ImageFsStats for cadvisor
I0208 22:25:24.245694 6403 cadvisor_linux.go:161] [DEBUG] label from ImageFsInfo "docker-images"
I0208 22:25:24.245696 6403 manager.go:732] [DEBUG] LABEL from GetFsInfo docker-images
I0208 22:25:24.245698 6403 manager.go:733] [DEBUG] LABEL LENGTH from GetFsInfo %!s(int=13)
I0208 22:25:24.245702 6403 manager.go:737] [DEBUG] DEV for LABEL from GetFsInfo "sys-docker--pool"
I0208 22:25:24.245707 6403 manager.go:749] [DEBUG] mountpoint for fs.Device is ""
I0208 22:25:24.245709 6403 manager.go:754] [DEBUG] labels for fs.Device is []string{"docker-images"}
I0208 22:25:24.245718 6403 manager.go:774] [DEBUG] fsInfo is []v2.FsInfo{v2.FsInfo{Timestamp:time.Time{wall:0xc0008c750c778087, ext:185285290911, loc:(*time.Location)(0x7082040)}, Device:"sys-docker--pool", Mountpoint:"", Capacity:0xc68e000000, Available:0xba1f200000, Usage:0xc6ee00000, Labels:[]string{"docker-images"}, Inodes:(*uint64)(nil), InodesFree:(*uint64)(nil)}}
I0208 22:25:24.245724 6403 cadvisor_linux.go:174] [DEBUG] res from GetFsInfo is []v2.FsInfo{v2.FsInfo{Timestamp:time.Time{wall:0xc0008c750c778087, ext:185285290911, loc:(*time.Location)(0x7082040)}, Device:"sys-docker--pool", Mountpoint:"", Capacity:0xc68e000000, Available:0xba1f200000, Usage:0xc6ee00000, Labels:[]string{"docker-images"}, Inodes:(*uint64)(nil), InodesFree:(*uint64)(nil)}}
I0208 22:25:24.245729 6403 cadvisor_stats_provider.go:232] [DEBUG] imageFsInfo v2.FsInfo{Timestamp:time.Time{wall:0xc0008c750c778087, ext:185285290911, loc:(*time.Location)(0x7082040)}, Device:"sys-docker--pool", Mountpoint:"", Capacity:0xc68e000000, Available:0xba1f200000, Usage:0xc6ee00000, Labels:[]string{"docker-images"}, Inodes:(*uint64)(nil), InodesFree:(*uint64)(nil)}
This code path (lookup based on label) is not used with CRI + devicemapper.
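The difference between the two lookups can be sketched as follows. This is an illustrative model only, not cadvisor code: the FsInfo class, fs_by_label and fs_by_path are hypothetical names, and the root filesystem entry uses made-up sizes. The thin pool entry reuses the device name, label and hex values from the debug log above.

```python
from dataclasses import dataclass, field


@dataclass
class FsInfo:
    device: str
    mountpoint: str
    capacity: int
    available: int
    usage: int
    labels: list = field(default_factory=list)


def fs_by_label(filesystems, label):
    """Find a filesystem by label; works even when it has no mount point."""
    for fs in filesystems:
        if label in fs.labels:
            return fs
    return None


def fs_by_path(filesystems, path):
    """Find the mounted filesystem with the longest mount point containing path."""
    candidates = [
        fs for fs in filesystems
        if fs.mountpoint
        and (path == fs.mountpoint
             or path.startswith(fs.mountpoint.rstrip("/") + "/"))
    ]
    return max(candidates, key=lambda fs: len(fs.mountpoint), default=None)


GIB = 2 ** 30

# Thin pool values copied from the debug log; it has no mount point.
pool = FsInfo("sys-docker--pool", "", 0xC68E000000, 0xBA1F200000,
              0xC6EE00000, ["docker-images"])
# Hypothetical root filesystem entry (sizes are illustrative assumptions).
root = FsInfo("/dev/mapper/sys-root", "/", 232 * GIB, 200 * GIB,
              32 * GIB, ["root"])
table = [root, pool]

by_label = fs_by_label(table, "docker-images")
by_path = fs_by_path(
    table, "/var/lib/containerd/io.containerd.snapshotter.v1.devmapper"
)

print(by_label.device, by_label.capacity / GIB)
print(by_path.device)
```

In this model the label lookup reaches the thin pool, while a path lookup can only ever return a mounted filesystem, so it lands on the root entry. Decoding the logged hex values also shows that Available plus Usage equals Capacity for the pool.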
Environment details
Kubernetes: 1.17.14, containerd: 1.4.6, OS: RHEL 7.9
Devicemapper is set up with container-storage-setup, as described in https://github.com/containerd/containerd/blob/master/snapshots/devmapper/README.md
It appears that the kubelet/cadvisor takes the fsId and tries to look up its mount point to find the imagefs details. The lookup eventually resolves to the same mount as the node filesystem.
Issue
As a result, the kubelet's eviction manager does not work for imagefs, and the imagefs fills up. We have to delete unused images manually using
crictl rmi --prune
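A rough sketch of why eviction never triggers, assuming the kubelet's documented default hard eviction threshold of imagefs.available<15% and assuming the signal is computed from whatever stats are reported as imagefs. The function name and all byte values are hypothetical and chosen only for illustration.

```python
GIB = 2 ** 30


def signal_fires(available_bytes, capacity_bytes, threshold_fraction=0.15):
    """True if the available fraction is below the eviction threshold."""
    return available_bytes / capacity_bytes < threshold_fraction


# Hypothetical actual state of the devicemapper thin pool: nearly full.
pool_fires = signal_fires(available_bytes=10 * GIB, capacity_bytes=200 * GIB)

# Hypothetical stats reported as imagefs: really the node filesystem,
# which still has plenty of free space.
reported_fires = signal_fires(available_bytes=200 * GIB, capacity_bytes=232 * GIB)

print(pool_fires, reported_fires)
```

With these numbers the pool itself is past the threshold, but the values the kubelet sees are not, so image garbage collection would not be triggered by the imagefs signal.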
We run the same devicemapper setup with dockerd as the runtime, and imagefs is reported correctly there. cadvisor seems to have code that detects devicemapper volume usage for dockerd; with CRI, that code path is not used.
We tested this with CRI-O as well and saw the same issue: imagefs is reported as identical to nodefs for both the CRI-O and containerd runtimes.
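One way to spot the symptom on a node is to compare the node-level fs and runtime.imageFs entries of the summary API response. The sketch below is only a heuristic (equal capacity and available bytes suggest, but do not prove, the same filesystem), and the function name is hypothetical. The first sample uses the values from the cri+overlayfs output above; the second is an assumed devicemapper case in which imageFs simply repeats the node fs numbers.

```python
def imagefs_matches_nodefs(summary):
    """Heuristic: True if imageFs reports the same capacity and free space as fs."""
    node = summary["node"]
    fs = node["fs"]
    image_fs = node["runtime"]["imageFs"]
    return (
        fs["capacityBytes"] == image_fs["capacityBytes"]
        and fs["availableBytes"] == image_fs["availableBytes"]
    )


# Values from the cri+overlayfs summary shown earlier in this issue.
overlay_summary = {
    "node": {
        "fs": {"availableBytes": 228123242496, "capacityBytes": 249462521856},
        "runtime": {
            "imageFs": {"availableBytes": 236875235328,
                        "capacityBytes": 238660943872}
        },
    }
}

# Assumed cri+devicemapper case: imageFs repeats the node fs values.
devmapper_summary = {
    "node": {
        "fs": {"availableBytes": 228123242496, "capacityBytes": 249462521856},
        "runtime": {
            "imageFs": {"availableBytes": 228123242496,
                        "capacityBytes": 249462521856}
        },
    }
}

overlay_same = imagefs_matches_nodefs(overlay_summary)
devmapper_same = imagefs_matches_nodefs(devmapper_summary)
print(overlay_same, devmapper_same)
```

The input would normally be the parsed JSON from the /stats/summary endpoint queried with curl as shown earlier.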