kmesh-net / kmesh

High Performance ServiceMesh Data Plane Based on Programmable Kernel
https://kmesh.net
Apache License 2.0
424 stars 58 forks

Kmesh daemon restarts once after updating image #687

Closed hzxuzhonghu closed 1 month ago

hzxuzhonghu commented 1 month ago

What happened:

I saw the kmesh daemon fail to start after its image was updated:

root@kurator-linux-0001:~/sample# k logs kmesh-m5sst  -n kmesh-system
mkdir: cannot create directory '/mnt/kmesh_cgroup2': File exists
none on /sys/fs/bpf type bpf (rw,relatime)
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --bpf-fs-path=\"/sys/fs/bpf\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --cgroup2-path=\"/mnt/kmesh_cgroup2\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --cni-etc-path=\"/etc/cni/net.d\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --conflist-name=\"\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --enable-bpf-log=\"true\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --enable-bypass=\"false\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --enable-mda=\"false\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --enable-secret-manager=\"false\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --help=\"false\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --mode=\"workload\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="FLAG: --plugin-cni-chained=\"true\"" subsys=manager
time="2024-08-05T11:43:32Z" level=info msg="oldGitVersion: 1563676948 newGitVersion: 2800172400" subsys=pkg/bpf
time="2024-08-05T11:43:32Z" level=info msg="kmesh start with Update" subsys=pkg/bpf
time="2024-08-05T11:43:32Z" level=info msg="Clean kmesh_version map and bpf prog" subsys=pkg/bpf
time="2024-08-05T11:43:32Z" level=error msg="bpf Load failed, pin prog failed, file exists" subsys=main
Error: bpf Load failed, pin prog failed, file exists
kmesh exit
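
The last error line says the pin step failed because the pin path already exists, even though the log reports a cleanup just before it. As a rough illustration only (this is not Kmesh code; ordinary files stand in for pinned BPF objects, and the names `pin`, `pin_replacing`, and `demo` are hypothetical), an exclusive create fails in the same way whenever an earlier run leaves its path behind, and removing the leftover first lets the second attempt succeed:

```python
import os
import tempfile


def pin(path):
    """Stand-in for pinning: create the path exclusively, fail if it exists."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)


def pin_replacing(path):
    """Remove a leftover pin (if any), then pin again."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass  # nothing was left over from a previous run
    pin(path)


def demo():
    """Return (plain_pin_failed, replacing_pin_succeeded) for a stale path."""
    with tempfile.TemporaryDirectory() as pin_dir:
        path = os.path.join(pin_dir, "prog")
        pin(path)  # pin left behind by the "old" daemon
        try:
            pin(path)  # the "new" daemon pins without removing it
            plain_failed = False
        except FileExistsError:
            plain_failed = True
        pin_replacing(path)
        return plain_failed, os.path.exists(path)


if __name__ == "__main__":
    print(demo())
```

Whether the real failure comes from a pin that the cleanup step missed is an assumption here; the sketch only shows why a "file exists" error points at a stale path rather than at the new program itself.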

After a while, the kmesh daemon restarts once:

 k get pod -n kmesh-system
NAME          READY   STATUS    RESTARTS      AGE
kmesh-m5sst   1/1     Running   1 (11m ago)   12m

What you expected to happen:

The Kmesh daemon should not restart or hit any error when its image is updated.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

hzxuzhonghu commented 1 month ago

/assign @lec-bit