Open dragonTour opened 1 year ago
Is your k8s cluster a real cluster? Or a minikube version?
I can't find an environment exactly like yours to reproduce the problem in the short term. You could try modifying cadvisor.yaml (e.g. commenting out some of the configuration) and redeploying it.
I used kubeadm to bootstrap the cluster.
I changed Docker's data directory so that it is no longer under /var/. Could that be the cause?
[root@]# more /etc/docker/daemon.json
{
  "data-root": "/data/docker",
  "exec-opts": [
    "native.cgroupdriver=systemd"
  ]
}
Was the original value of data-root '/var/lib/docker'? If so, you may need to change cadvisor.yaml from:
volumes:
  ...
  - name: docker
    hostPath:
      path: /var/lib/docker
  ...
to
volumes:
  ...
  - name: docker
    hostPath:
      path: /data/docker
  ...
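To confirm which host path the `docker` volume should point at, the `data-root` value can be read straight from the daemon configuration. The following is a minimal sketch, not part of cadvisor or HoloInsight: the helper name `docker_data_root` is hypothetical, and it assumes the configuration lives at `/etc/docker/daemon.json` (as in the output above) and that Docker falls back to `/var/lib/docker` when `data-root` is not set.

```python
import json
from pathlib import Path

# Docker's default data directory on Linux when "data-root" is not configured.
DEFAULT_DATA_ROOT = "/var/lib/docker"


def docker_data_root(daemon_json: str = "/etc/docker/daemon.json") -> str:
    """Return the "data-root" set in daemon.json, or Docker's default.

    Hypothetical helper for illustration only.
    """
    path = Path(daemon_json)
    if not path.is_file():
        # No daemon.json at all: Docker runs with its defaults.
        return DEFAULT_DATA_ROOT
    text = path.read_text(encoding="utf-8").strip()
    if not text:
        # An empty file is treated here the same as a missing one.
        return DEFAULT_DATA_ROOT
    config = json.loads(text)
    return config.get("data-root", DEFAULT_DATA_ROOT)
```

Calling `docker_data_root()` on the node from this thread should give `/data/docker`, which is the value to use for `hostPath.path` of the `docker` volume.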
After changing readOnly to false, it runs successfully:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: holoinsight-example
spec:
  selector:
    matchLabels:
      app: cadvisor
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: cadvisor
        hi_common_version: '3'
    spec:
      restartPolicy: Always
      containers:
        - name: cadvisor
          image: gcr.io/cadvisor/cadvisor:v0.44.0
          args:
            - --allow_dynamic_housekeeping=false
            - --housekeeping_interval=5s
            - --max_housekeeping_interval=5s
            - --storage_duration=2m
            - --enable_metrics=cpu,memory,network,tcp,disk,diskIO,cpuLoad
            - --enable_load_reader=true
            - --store_container_labels=false
          volumeMounts:
            - name: rootfs
              mountPath: /rootfs
              readOnly: false
            - name: var-run
              mountPath: /var/run
              readOnly: false
            - name: sys
              mountPath: /sys
              readOnly: true
            - name: docker
              mountPath: /var/lib/docker
              readOnly: false
            - name: disk
              mountPath: /dev/disk
              readOnly: true
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          resources:
            requests:
              cpu: "0"
              memory: "0"
            limits:
              cpu: "0.25"
              memory: "256Mi"
      volumes:
        - name: rootfs
          hostPath:
            path: /
        - name: var-run
          hostPath:
            path: /var/run
        - name: sys
          hostPath:
            path: /sys
        - name: docker
          hostPath:
            path: /data/docker
        - name: disk
          hostPath:
            path: /dev/disk
The volumeMounts config in cadvisor.yaml is copied from the official cadvisor repository without any changes.
Our internal deployments (on Aliyun k8s clusters) all succeed with this cadvisor config.
I suspect that something specific to your k8s cluster is causing the deployment to fail.
If you would like to investigate the root cause of this issue and contribute a fix, that would be very welcome.
dragonTour is not alone. I'm seeing the same issue on EKS 1.24, which uses the containerd runtime.
cadvisor:
  Container ID: containerd://80ad9ce8b85e077f50dd9c1bfd1e248801afa3126f94793b91bbdb5ea33acf29
  Image: gcr.io/cadvisor/cadvisor:v0.49.1
  Image ID: gcr.io/cadvisor/cadvisor@sha256:3cde6faf0791ebf7b41d6f8ae7145466fed712ea6f252c935294d2608b1af388
  Port: 8080/TCP
  Host Port: 0/TCP
  State: Waiting
    Reason: CrashLoopBackOff
  Last State: Terminated
    Reason: StartError
    Message: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/var/lib/kubelet/pods/882dfec1-613f-4a83-8705-424230f18271/volumes/kubernetes.io~projected/kube-api-access-phx22" to rootfs at "/var/run/secrets/kubernetes.io/serviceaccount": mkdir /run/containerd/io.containerd.runtime.v2.task/k8s.io/80ad9ce8b85e077f50dd9c1bfd1e248801afa3126f94793b91bbdb5ea33acf29/rootfs/run/secrets: read-only file system: unknown
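One reading of the message above, offered as a guess rather than a confirmed diagnosis: the runtime tries to create `/run/secrets` inside the container so that it can mount the service-account token at `/var/run/secrets/kubernetes.io/serviceaccount`, and that `mkdir` fails because a read-only volume is already mounted on a parent directory. The sketch below checks a list of `volumeMounts` entries for that situation. The function names are hypothetical, the input is a plain list of dicts rather than a parsed manifest, and treating `/var/run` as an alias of `/run` is an assumption based on the `rootfs/run/secrets` path in the error.

```python
from pathlib import PurePosixPath

# Path where Kubernetes mounts the projected service-account token by default.
TOKEN_MOUNT = "/var/run/secrets/kubernetes.io/serviceaccount"


def _normalize(path: str) -> PurePosixPath:
    """Map /var/run/... to /run/... (assumption: /var/run is a symlink to /run)."""
    p = PurePosixPath(path)
    var_run = PurePosixPath("/var/run")
    if p == var_run or var_run in p.parents:
        return PurePosixPath("/run") / p.relative_to(var_run)
    return p


def readonly_mounts_blocking_token(volume_mounts):
    """Return names of read-only mounts that sit on or above the token mount path.

    Hypothetical helper: each item is a dict with "name", "mountPath" and an
    optional "readOnly" key, mirroring a container's volumeMounts entries.
    """
    target = _normalize(TOKEN_MOUNT)
    blocking = []
    for mount in volume_mounts:
        mount_path = _normalize(mount["mountPath"])
        covers_target = mount_path == target or mount_path in target.parents
        if mount.get("readOnly", False) and covers_target:
            blocking.append(mount["name"])
    return blocking
```

If this reading is right, a `var-run` mount at `/var/run` with `readOnly: true` would be flagged, while the same mount with `readOnly: false` (as in the manifest posted earlier in this thread) would not.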
Describe this problem
The cadvisor pod failed to run.
Logs from pod cadvisor-kpxbl:
Steps to reproduce
Kubernetes version: 1.23; Docker version: 20.10.6; Linux kernel: 4.18.0-1.el7.elrepo.x86_64
Expected behavior
No response
Additional Information
No response