Ljohn001 / blogs

Github Pages: https://www.ljohn.cn/

Slow kubelet disk inode statistics causing high load | Ljohn's Blog #24

Open Ljohn001 opened 1 year ago

Ljohn001 commented 1 year ago

https://www.ljohn.cn/posts/b0f608c0/

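Problem description
1. A disk capacity alert fired.
2. We checked the host's disks and tried to clean up the pod with high disk usage; it had more than 5,900 file descriptors open and never closed. The pod was deleted through the platform: product-center-query-pro-remain-5bfb5f4c98-m2l55
3. The host load spiked and triggered an alert.

Root cause investigation
1. The system logs show that disk inode statistics first started timing out as early as May 28.
2. We found a container that was creating, in the /tmp directory…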
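The description above mentions a pod holding thousands of open file descriptors. As a rough sketch of how such a process might be spotted on a Linux host, the snippet below counts the entries under each `/proc/<pid>/fd` and lists the largest holders. The function name `top_fd_holders` is hypothetical and not part of the original post; it assumes a Linux `/proc` filesystem, and without root it can only see the current user's processes.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): list the processes
# holding the most open file descriptors by counting /proc/<pid>/fd entries.
top_fd_holders() {
    local limit="${1:-10}"   # how many processes to show
    local fd_dir pid count
    for fd_dir in /proc/[0-9]*/fd; do
        # Skip the literal pattern when /proc is missing, and vanished processes
        [ -d "$fd_dir" ] || continue
        pid="${fd_dir#/proc/}"
        pid="${pid%/fd}"
        # Directories we cannot read (other users' processes) count as 0
        count=$(ls -1 "$fd_dir" 2>/dev/null | wc -l)
        printf '%s %s\n' "$count" "$pid"
    done | sort -rn | head -n "$limit" \
        | awk '{ printf "FDS: %s, PID: %s\n", $1, $2 }'
}

top_fd_holders 10
```

A PID reported here could then be matched against the PID column printed by the disk usage script in this thread to find the owning pod.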

liugb1029 commented 2 months ago

Did you write disk_size_sort_10 yourself?

Ljohn001 commented 2 months ago

Yes, I wrote it myself. Here is the script:

#!/bin/bash

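# Description: report the storage used by container workloads on this host, covering pod ephemeral storage and docker ephemeral storage, and list the top 10 consumers
# Last modified: 2023-02-01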

do_getdisk_crictl(){

    # Find data in the volume directories mounted into pods, including emptyDir and secret volumes. The emptyDir volumes include the logkit-data directory.
    du -sh /var/lib/kubelet/pods/* | sort -rh | head -n 5 | while read -r vsize vpath
    do
        # ${vpath##*/} is the pod UID (the last path component); match it against each container's labels
        crictl ps -q | xargs -r crictl inspect --output go-template --template ' {{.info.pid}} {{ index .info.config.labels "io.kubernetes.pod.name"}} {{ index .info.config.labels "io.kubernetes.container.name"}} {{ index .info.config.labels "io.kubernetes.pod.uid"}} ' \
            | grep -F -- "${vpath##*/}" \
            | awk -v size="$vsize" -v path="$vpath" '{ printf "SIZE: %s, PID: %s, PodName: %s, ContainerName: %s, Path: %s\n", size,$1,$2,$3,path }'
    done

    # Find containerd runtime and image-layer data to clean up (e.g. rocketmq_client files)
    du -sh /run/containerd/io.containerd.runtime.v2.task/k8s.io/* | sort -rh | head -n 5 | while read -r vsize vpath
    do
        # ${vpath##*/} is the container ID (the last path component); match it against .status.id
        crictl ps -q | xargs -r crictl inspect --output go-template --template ' {{.info.pid}} {{ index .info.config.labels "io.kubernetes.pod.name"}} {{ index .info.config.labels "io.kubernetes.container.name"}} {{ .status.id}} ' \
            | grep -F -- "${vpath##*/}" \
            | awk -v size="$vsize" -v path="$vpath" '{ printf "SIZE: %s, PID: %s, PodName: %s, ContainerName: %s, Path: %s\n", size,$1,$2,$3,path }'
    done

}

do_getdisk_docker(){
    # Find data in the volume directories mounted into pods, including emptyDir and secret volumes. The emptyDir volumes include the logkit-data directory.
    du -sh /var/lib/kubelet/pods/* | sort -rh | head -n 5 | while read -r vsize vpath
    do
        # ${vpath##*/} is the pod UID; it appears in each container's HostsPath
        docker ps -q | xargs -r docker inspect --format '{{.State.Pid}} {{.Config.Hostname}} {{index .Config.Labels "io.kubernetes.container.name"}}  {{.HostsPath }} ' \
            | grep -F -- "${vpath##*/}" \
            | awk -v size="$vsize" -v path="$vpath" '{ printf "SIZE: %s, PID: %s, PodName: %s, ContainerName: %s, Path: %s\n", size,$1,$2,$3,path }'
    done
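    # Base image layer data for pod containers plus temporary data produced at container runtime. The diff directory is the container's read-write layer, so every file modified inside the container shows up in diff. The merged directory is the result of union-mounting the layers and is also the working directory seen inside the container.
    du -sh /var/lib/docker/overlay2/* | sort -rh | head -n 5 | while read -r vsize vpath
    do
        # ${vpath##*/} is the overlay2 layer ID; it appears in each container's GraphDriver WorkDir
        docker ps -q | xargs -r docker inspect --format '{{.State.Pid}} {{.Config.Hostname}}  {{index .Config.Labels "io.kubernetes.container.name"}}  {{.GraphDriver.Data.WorkDir }}  ' \
            | grep -F -- "${vpath##*/}" \
            | awk -v size="$vsize" -v path="$vpath" '{ printf "SIZE: %s, PID: %s, PodName: %s, ContainerName: %s, Path: %s\n", size,$1,$2,$3,path }'
    done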

}

# Use crictl when it is installed; otherwise fall back to docker
if command -v crictl &>/dev/null; then
    do_getdisk_crictl
else
    do_getdisk_docker
fi
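Every loop in the script follows the same pattern: rank the children of a directory by size, then use the last path component (a pod UID, container ID, or layer ID) as the key to look up the owning container. A minimal sketch of just that ranking step, separated from crictl and docker so it can be tried on any directory, might look like the following. The function name `top_dirs` is hypothetical, and it uses `du -sk` with a numeric sort instead of the script's `du -sh | sort -rh`.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): print the N largest
# immediate children of a directory, largest first, as "<size-in-KiB> <name>".
top_dirs() {
    local parent="$1" limit="${2:-5}"
    local size path
    du -sk "$parent"/* 2>/dev/null | sort -rn | head -n "$limit" |
    while read -r size path; do
        # ${path##*/} strips everything up to the last "/", leaving only
        # the final path component that serves as the lookup key
        printf '%s %s\n' "$size" "${path##*/}"
    done
}
```

For example, `top_dirs /var/lib/kubelet/pods 5` would list the five largest pod directories by UID. Sizes are in KiB here so that a plain `sort -rn` works without GNU `sort -h`.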