cri-o / cri-o

Open Container Initiative-based implementation of Kubernetes Container Runtime Interface
https://cri-o.io
Apache License 2.0

Pod restore failed in k8s cluster based on CRI-O as container runtime #7349

Closed WhaleSpring closed 1 year ago

WhaleSpring commented 1 year ago

What happened?

When I try to run a container from a .tar checkpoint produced by the kubelet checkpoint API, it fails: the pod restore fails, and kubectl describe shows the error messages in the attached screenshot. It seems that CRIU tries to restore the pod using the container name 'm', but the container's path on the node uses its ID rather than its name, since the file names in the path mentioned in the messages are container IDs on the node. I suspect some setting is not configured correctly, because I have one node that can restore pods, but I don't know what is different between it and the other machines. To be specific: one node, Node1, can both checkpoint and restore a container, while the other nodes can only checkpoint a container but cannot restore one. Meanwhile, an image built from the .tar file of a node that cannot restore containers can be restored on Node1. This tells me the checkpoint is correct on the other nodes but the restore is broken. I don't know why Node1 can restore and the other nodes cannot.

It is necessary to point out that my OS is CentOS 7, and when I tried the newest Ubuntu, the same problem occurred.

What did you expect to happen?

I want to live migrate a pod, but when I try to restore a pod, I run into the problem above. As a result, I can only restore pods on one machine of my cluster.

How can we reproduce it (as minimally and precisely as possible)?

First, I create a checkpoint via the kubelet checkpoint API:

curl -sk -X POST "https://localhost:10250/checkpoint/<namespace>/<pod_name>/<container_name>" \
  --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
  --cacert /etc/kubernetes/pki/ca.crt \
  --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt

Then I build an image based on the checkpoint .tar file and push it to the image registry. Finally, I deploy the image as a new pod. If I schedule it on Node1, the pod is restored successfully, but the other nodes fail with the messages above.
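For reference, the restore step is a plain pod spec pointing at the pushed checkpoint image (the image built from the checkpoint .tar must carry CRI-O's checkpoint annotation for CRI-O to treat it as a restore). This is a minimal sketch; the pod name, image name, registry, and nodeSelector value are assumptions, not from the report:

```yaml
# Hypothetical restore pod: the image is the one built from the checkpoint
# .tar and pushed to a registry; all names here are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: restored-pod
spec:
  containers:
  - name: <container_name>
    image: registry.example.com/checkpoint-image:latest
  nodeSelector:
    kubernetes.io/hostname: node1   # pin to the node you want to restore on
```

Pinning the pod with a nodeSelector is how one would reproduce the per-node behavior described above (success on Node1, failure elsewhere).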

Anything else we need to know?

No response

CRI-O and Kubernetes version

$ crio --version
CRI-O 1.27.1
$ kubectl version
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.1

OS version

$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
$ uname -a
Linux master 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Additional environment details (AWS, VirtualBox, physical, etc.)

I use CRIU 3.17.1 as the CRIU that CRI-O invokes to checkpoint and restore.
adrianreber commented 1 year ago

Please provide the yaml you use to create your pod initially and also for restore.

I also need to see the CRIU log files, specifically restore.log. Using the CRIU configuration file you can write the log file somewhere else so that it is not deleted.
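The config-file route looks roughly like this. This is a sketch, not a verified configuration: the work directory path is an assumption, though the restore log below shows /etc/criu/runc.conf being parsed, and `work-dir` and `log-file` are standard CRIU long options written without the leading dashes:

```
# /etc/criu/runc.conf -- read by CRIU when runc invokes it.
# Write logs to a persistent directory so they survive container cleanup.
work-dir /var/lib/criu-work
log-file restore.log
```

With this in place, the restore log should land in the work directory instead of the container's state directory, where it would otherwise be removed on failure.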

WhaleSpring commented 1 year ago

@adrianreber I get the log file with such context:

(00.000000) Unable to get $HOME directory, local configuration file will not be used.
(00.000000) Parsing config file /etc/criu/runc.conf
(00.000100) Version: 3.17.1 (gitid 0)
(00.000114) Running on node1 Linux 3.10.0-1160.90.1.el7.x86_64 #1 SMP Thu May 4 15:21:22 UTC 2023 x86_64
(00.000117) Would overwrite RPC settings with values from /etc/criu/runc.conf
(00.000149) Loaded kdat cache from /run/criu.kdat
(00.000197) Hugetlb size 2 Mb is supported but cannot get dev's number
(00.000211) Hugetlb size 1024 Mb is supported but cannot get dev's number
(00.000621) Added ipc:/var/run/ipcns/ca89c2be-d87d-45e4-84b3-a858c1ac9bf0 join namespace
(00.000644) Added uts:/var/run/utsns/ca89c2be-d87d-45e4-84b3-a858c1ac9bf0 join namespace
(00.000684) Parsing config file /etc/criu/runc.conf
(00.000729) mnt-v2: Mounts-v2 requires MOVE_MOUNT_SET_GROUP support
(00.000735) Mount engine fallback to --mntns-compat-mode mode
(00.000755) rlimit: RLIMIT_NOFILE unlimited for self
(00.001041) Error (criu/lsm.c:411): selinux LSM specified but selinux not supported by kernel

It seems to be a problem with SELinux or the LSM. I still don't understand what is wrong, since I have not been working with CRIU for very long.

(00.001041) Error (criu/lsm.c:411): selinux LSM specified but selinux not supported by kernel
adrianreber commented 1 year ago

Has the destination host the same OS?

adrianreber commented 1 year ago

Probably not. You cannot migrate from a host with selinux to a host without selinux.

WhaleSpring commented 1 year ago

Has the destination host the same OS?

Yes, my hosts are all CentOS 7, and the hosts that cannot restore successfully cannot even restore pods in place after a checkpoint (no cross-host restore involved).

WhaleSpring commented 1 year ago

Probably not. You cannot migrate from a host with selinux to a host without selinux.

Since I have a host which can restore successfully, maybe I should compare the SELinux situation between the host that can restore a pod and the hosts that cannot. But how? As far as I can tell, they are the same.

WhaleSpring commented 1 year ago

@adrianreber Oh! I found that the kernel versions of the servers are not the same: the host with kernel-3.10.0-957.el7.x86_64 can restore successfully, while the hosts with kernel-3.10.0-1160.90.1.el7.x86_64 cannot restore a pod. Is that the problem? Maybe kernel-3.10.0-1160.90.1.el7.x86_64 cannot serve as the host kernel version for restoring a pod? I will run further tests to see if this is the case.

adrianreber commented 1 year ago

Maybe you have SELinux disabled on one host.

WhaleSpring commented 1 year ago

Maybe you have SELinux disabled on one host.

The SELinux status on the hosts that can restore and on those that cannot is identical, as follows, so it is not a question of it being disabled:

SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      31
adrianreber commented 1 year ago

Well, CRIU thinks SELinux is disabled. So something must be different. Anyway, do not use CentOS 7. It is too old.

WhaleSpring commented 1 year ago

Hi @adrianreber! Thank you for your previous guidance! Following your advice, I changed my system. Now I am running my experiment on a k8s cluster composed of several Ubuntu 22.04.3 virtual machines, but the restore process still fails. Here are the logs I got: criu3.log. The problem I encountered seems different from MaxFuhrich's, which is a cgroup v2 problem. If you have time, could you please help me figure out how to adjust the configuration of the virtual machine so that the restore operation can proceed correctly?

adrianreber commented 1 year ago

You get a segfault during restore which is usually a sign of problems with restartable sequences.

You need at least CRIU 3.17.

WhaleSpring commented 1 year ago

You get a segfault during restore which is usually a sign of problems with restartable sequences.

You need at least CRIU 3.17.

That was precisely the issue. I upgraded CRIU from version 3.16.1 to version 3.17.1, and everything is working fine now. Thank you very much!

It is worth noting that Kubernetes logs previously advised me that the minimum required CRIU version for C/R operations was 3.16, which misled my version selection for this attempt.

adrianreber commented 1 year ago

It is worth noting that Kubernetes logs previously advised me that the minimum required CRIU version for C/R operations was 3.16, which misled my version selection for this attempt.

Well, the functionality to checkpoint/restore containers in Kubernetes requires 3.16. If your distribution uses restartable sequences, you need 3.17. This was just a bug in your distribution. Ubuntu ships with a known broken CRIU version. There is a bug open for it somewhere, but they ignore it.
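The two version floors discussed above can be sketched as a small check. This is an illustrative helper only, not part of Kubernetes or CRI-O; the function names are invented for the example:

```python
# Hypothetical helper illustrating the version floors from the thread:
# Kubernetes C/R nominally requires CRIU >= 3.16, but hosts whose
# distribution uses restartable sequences (rseq) need CRIU >= 3.17.

def version_tuple(version: str) -> tuple:
    """Turn a dotted version string like '3.17.1' into (3, 17, 1)."""
    return tuple(int(part) for part in version.split("."))

def criu_version_ok(version: str, host_uses_rseq: bool) -> bool:
    """Check an installed CRIU version against the applicable minimum."""
    minimum = "3.17" if host_uses_rseq else "3.16"
    return version_tuple(version) >= version_tuple(minimum)

# The reporter's broken setup: Ubuntu's CRIU 3.16.1 on an rseq-using host.
print(criu_version_ok("3.16.1", host_uses_rseq=True))   # False
print(criu_version_ok("3.17.1", host_uses_rseq=True))   # True
```

This mirrors why the "minimum 3.16" message in the Kubernetes logs was misleading here: 3.16.1 satisfies the nominal floor but not the rseq-dependent one.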

adrianreber commented 1 year ago

Please close this issue if your problem is solved.