checkpoint-restore / criu

Checkpoint/Restore tool
criu.org

Error when dumping singularity container #1754

Open · Wosch96 opened 2 years ago

Wosch96 commented 2 years ago

Hello guys,

I'm trying to checkpoint and restore a singularity container with criu. I get an error when dumping the container, and maybe you could help me out. I'm running criu with the following command to dump the container:

criu dump -o dump.log -v4 -t 7209 -D ./ \
    --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
    --ext-mount-map /etc/hosts:/etc/hosts \
    --ext-mount-map /etc/hostname:/etc/hostname \
    --ext-mount-map /var/tmp:/var/tmp \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /root:/root \
    --ext-mount-map /etc/localtime:/etc/localtime \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /sys:/sys \
    --ext-mount-map /proc:/proc \
    --ext-mount-map /dev:/dev \
    --ext-mount-map /dev/hugepages:/dev/hugepages \
    --ext-mount-map /dev/mqueue:/dev/mqueue \
    --ext-mount-map /dev/pts:/dev/pts \
    --ext-mount-map /dev/shm:/dev/shm

This is the error that occurs in the dump.log:

(00.002619) Error (criu/files-reg.c:1629): Can't lookup mount=39 for fd=-3 path=/usr/local/libexec/singularity/bin/starter
(00.002629) Error (criu/cr-dump.c:1262): Collect mappings (pid: 7209) failed with -1

Here is the whole dump.log.

Thank you in advance.

adrianreber commented 2 years ago

@Snorch any ideas what might be going on?

Wosch96 commented 2 years ago

Just FYI: I'm running singularity-ce version 3.9.5, the newest one. Did the checkpointing work with older versions of singularity?

Cheers.

adrianreber commented 2 years ago

I am not aware that CRIU ever worked with singularity. No one spoke to us about singularity.

adrianreber commented 2 years ago

For crun and runc we have integration of CRIU directly in the container runtime. Checkpointing without the help of the container runtime is always challenging.
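For illustration, this is roughly what the integrated flow looks like with runc (a sketch only; flags may differ between runc versions, see runc checkpoint --help):

# checkpoint a running container into an image directory
runc checkpoint --image-path ./checkpoint mycontainer

# later, restore it from the same images
runc restore --image-path ./checkpoint mycontainer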

Wosch96 commented 2 years ago

Yeah, I heard about that. I'm doing this as my own project, to get CRIU running with singularity. The author of this issue tried the same, but he found a way to solve it. The error in my case is a little different and could be related to the newer version of singularity.

Snorch commented 2 years ago

Error (criu/files-reg.c:1629): Can't lookup mount=39 for fd=-3 path=/usr/local/libexec/singularity/bin/starter

This looks like a file mapping on detached mount (known problem).

E.g. I can reproduce the same error with a simple bash script in a Virtuozzo container:

CT-2b5b6c67-d666-4950-abd0-8c0ceca03d96 /# cat prepare_detached.sh
mount -t tmpfs tmpfs /mnt/
touch /mnt/test
setsid sleep 1000 &>/dev/null </mnt/test &
umount -l /mnt
CT-2b5b6c67-d666-4950-abd0-8c0ceca03d96 /# bash prepare_detached.sh
CT-2b5b6c67-d666-4950-abd0-8c0ceca03d96 /# logout
exited from CT 2b5b6c67-d666-4950-abd0-8c0ceca03d96
You have new mail in /var/spool/mail/root
[root@silo ~]# vzctl suspend testct
Setting up checkpoint...
(00.396621) Error (criu/files-reg.c:2195): Can't lookup mount=638 sdev=174 for fd=0 path=/test
(00.396632) Error (criu/cr-dump.c:1868): Dump files (pid: 248003) failed with -1
(00.416616) Error (criu/cr-dump.c:2311): Dumping FAILED.
Failed to checkpoint the Container
All dump files and logs were saved to /vz/private/2b5b6c67-d666-4950-abd0-8c0ceca03d96/dump/Dump.fail
Checkpointing failed

So if the mount on which a process has an open file was lazy-umounted, criu just can't C/R this process unless the file is closed.
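A rough way to spot this before dumping is to compare the mnt_id of each open fd against the mounts still attached in the task's mount namespace (a sketch; 7209 is the PID from this issue, and file mappings would additionally need a look at /proc/PID/maps):

PID=7209
# mnt_id of every open file descriptor
for fd in /proc/$PID/fd/*; do grep mnt_id "/proc/$PID/fdinfo/${fd##*/}"; done
# mount ids still attached in the task's mount namespace (first column)
awk '{print $1}' "/proc/$PID/mountinfo"
# an mnt_id that is missing from mountinfo belongs to a detached (lazy-umounted) mount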

upd:

Another option is that your file mapping is external (a file outside of the container); in this case proper --external file[] + --inherit-fd options should be provided by the container environment.

Wosch96 commented 2 years ago

Another option is that your file mapping is external (a file outside of the container); in this case proper --external file[] + --inherit-fd options should be provided by the container environment.

@Snorch So I should be able to checkpoint the container with these two options? In my case there's not really a problem with a file, or am I wrong there? Can I use these two options --external file[] + --inherit-fd to bypass the mount lookup problem? The other external mount map options do their job correctly.

Thank you for the help.

Snorch commented 2 years ago

@Wosch96 please see examples and explanations about how and when --external file[] + --inherit-fd can be used to handle external files in this article https://criu.org/Inheriting_FDs_on_restore

In simple words, a file is external if it was not opened/mmaped inside the container. In that case criu can't find it inside the container and can't automatically restore it. In this situation the container manager, which should know all the files it puts into the container from the host, is able to tell criu where to take the file from on the host at restore time (via the mentioned options).
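For what it's worth, the overall pattern from that article looks roughly like this (a sketch only; MNT_ID, INODE and the paths are placeholders, see the article for the exact semantics):

# dump: declare the file external instead of resolving its mount
criu dump -t PID -D ./ --external file[MNT_ID:INODE] ...

# restore: open the file on the host yourself and hand the descriptor to criu
exec 100</path/on/host/to/the/file
criu restore -D ./ --inherit-fd fd[100]:file[MNT_ID:INODE] ...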

Wosch96 commented 2 years ago

In my environment, I don't have a container manager, as I build a singularity container from scratch with a definition file and then try to checkpoint the application that I'm running inside this container. Therefore I'm not really sure how to use the external files and --inherit-fd in my case. Could you clarify that? I read the article about "Inheriting FDs on restore", but it's my first time using CRIU, so I don't get the exact usage for my example case.

In my case, my dump.log tells me about this file /usr/local/libexec/singularity/bin/starter where no mount lookup is possible. So when dumping, I would use a command like this: --external file[39:inode]? But I'm not sure how to check the mount id and the inode. I assume the mount=39 in my dump.log tells me the mount id is 39?

When I restore the process, I should work with something like this: --inherit-fd fd[0]:file[39:...]? As I said before, I'm not really sure about the information that is needed to set the --inherit-fd.
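For reference, both values can be read via /proc and stat (a sketch with placeholder PID/FD; the mount=39 in the dump.log should indeed be that mount id):

# mount id and inode of an already-open descriptor of the target process
grep mnt_id /proc/PID/fdinfo/FD
stat -L -c 'inode: %i' /proc/PID/fd/FD

# or the inode of a path directly
stat -c '%i' /usr/local/libexec/singularity/bin/starter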

I appreciate the help.

avagin commented 2 years ago

@Wosch96 If you need to dump only one application and don't need to dump the container itself, it can be easier to run criu from inside the container and avoid all these problems with mounts and external fds.
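Something along these lines, for example (a sketch; it assumes a criu binary is available inside the container and enough privileges to use it):

# enter the container's mount and pid namespaces and dump from there;
# -t must then be the target's PID as seen inside the container
nsenter -t 7209 -m -p criu dump -t PID_IN_CONTAINER -D /tmp/ckpt -v4 -o dump.log --shell-job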

Wosch96 commented 2 years ago

@avagin Yes, I already thought so, but for my project I want to test container dumping. So I would still like to try it that way, even if it's only for research purposes. Any advice?

avagin commented 2 years ago

@Wosch96 I recommend looking at how C/R is implemented in runc: https://github.com/opencontainers/runc/blob/main/libcontainer/container_linux.go#L767-L1894

The most complicated part is how to handle external resources (mounts, file descriptors, etc.).

Wosch96 commented 2 years ago

A little update from my side: I got the dump running. I solved it like in this issue. But I'm still having a problem when restoring.

After running this command:

strace -o strace.log -s 256 -f criu restore -o restore.log -v4 -D ./ --shell-job \
    --root /home/node2/container/criu_checkpoints/criu_container_namespace \
    --ext-mount-map /etc/resolv.conf:/home/node2/container/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /etc/hosts:/home/node2/container/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /etc/hostname:/home/node2/container/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /var/tmp:/home/node2/container/criu_checkpoints/criu_container_namespace/var \
    --ext-mount-map /tmp:/home/node2/container/criu_checkpoints/criu_container_namespace/tmp \
    --ext-mount-map /root:/home/node2/container/criu_checkpoints/criu_container_namespace/root \
    --ext-mount-map /etc/localtime:/home/node2/container/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /sys:/home/node2/container/criu_checkpoints/criu_container_namespace/sys \
    --ext-mount-map /proc:/home/node2/container/criu_checkpoints/criu_container_namespace/proc/ \
    --ext-mount-map /dev:/home/node2/container/criu_checkpoints/criu_container_namespace/dev \
    --ext-mount-map /dev/hugepages:/home/node2/container/criu_checkpoints/criu_container_namespace/dev \
    --ext-mount-map /dev/mqueue:/home/node2/container/criu_checkpoints/criu_container_namespace/dev \
    --ext-mount-map /dev/pts:/home/node2/container/criu_checkpoints/criu_container_namespace/dev \
    --ext-mount-map /dev/shm:/home/node2/container/criu_checkpoints/criu_container_namespace/dev \
    --ext-mount-map /etc/group:/home/node2/container/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /etc/passwd:/home/node2/container/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /home/node2:/home/node2/container/criu_checkpoints/criu_container_namespace/home \
    --ext-mount-map /proc/sys/fs/binfmt_misc:/home/node2/container/criu_checkpoints/criu_container_namespace/proc \
    --ext-mount-map /usr/share/zoneinfo/UTC:/home/node2/container/criu_checkpoints/criu_container_namespace/usr

The strace file shows this error:

1272 write(127, "(00.025032) 1272: Opening 0x00000000400000-0x00000000401000 0000000000000000 (41) vma\n", 88) = 88
1272 openat(120, "home/node2/container/matMult", O_RDONLY) = -1 ENOENT (No such file or directory)
1272 write(127, "(00.025092) 1272: Error (criu/files-reg.c:2143): Can't open file home/node2/container/matMult on restore: No such file or directory\n", 134) = 134
1272 write(127, "(00.025113) 1272: Error (criu/files-reg.c:2086): Can't open file home/node2/container/matMult: No such file or directory\n", 123) = 123
1272 write(127, "(00.025131) 1272: Error (criu/mem.c:1349): - Can't open vma\n", 63) = 63

I've fixed all the external mount problems, but I don't know where to drop this file in order to let CRIU find it. Should this also be solved with an external mount? The application file matMult is in /home/node2/container/matMult.

Should I copy it to the root path? /home/node2/container/criu_checkpoints/criu_container_namespace/home/node2/container/matMult

Logs: dump.log restore.log strace.log

adrianreber commented 2 years ago

From how I understand the dump.log, the original container had /home/node2 mounted at /home/node2 in the container.

For the restore you seem to be mounting /home/node2/container/criu_checkpoints at /home/node2, and now your binary is not there.

Should I copy it to the root path? /home/node2/container/criu_checkpoints/criu_container_namespace/home/node2/container/matMult

To fix the error it would be enough to just copy /home/node2/container/matMult to /home/node2/container/criu_checkpoints/node2/container/matMult ... (if I am not confused by your paths).
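In shell terms that would be something like this (a sketch, combining the --root of the restore command above with the relative path from the error message):

mkdir -p /home/node2/container/criu_checkpoints/criu_container_namespace/home/node2/container
cp /home/node2/container/matMult /home/node2/container/criu_checkpoints/criu_container_namespace/home/node2/container/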

Wosch96 commented 2 years ago

@adrianreber Just to clarify the dump command, I used:

criu dump -o dump.log -v4 -t 1581 -D ./ --shell-job \
    --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
    --ext-mount-map /etc/hosts:/etc/hosts \
    --ext-mount-map /etc/hostname:/etc/hostname \
    --ext-mount-map /var/tmp:/var/tmp \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /root:/root \
    --ext-mount-map /etc/localtime:/etc/localtime \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /sys:/sys \
    --ext-mount-map /proc:/proc \
    --ext-mount-map /dev:/dev \
    --ext-mount-map /dev/hugepages:/dev/hugepages \
    --ext-mount-map /dev/mqueue:/dev/mqueue \
    --ext-mount-map /dev/pts:/dev/pts \
    --ext-mount-map /dev/shm:/dev/shm \
    --ext-mount-map /etc/group:/etc/group \
    --ext-mount-map /etc/passwd:/etc/passwd \
    --ext-mount-map /home/node2:/home/node2 \
    --ext-mount-map /proc/sys/fs/binfmt_misc:/proc/sys/fs/binfmt_misc \
    --ext-mount-map /usr/share/zoneinfo/UTC:/usr/share/zoneinfo/UTC

So the correct restore command would look something like this?

criu restore -o restore.log -v4 -D ./ --shell-job --root /home/node2/container \
    --ext-mount-map /etc/resolv.conf:/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /etc/hosts:/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /etc/hostname:/criu_checkpoints/criu_container_namespace/etc \
    --ext-mount-map /var/tmp:/criu_checkpoints/criu_container_namespace/var

For minimal purposes, of course; the other external mounts are still missing.

adrianreber commented 2 years ago

So the correct restore command would look something like this?

Does it work if you try it? :wink:

I am confused, but shouldn't you be using it during restore like this: --ext-mount-map /etc/resolv.conf:/etc/resolv.conf? I am only using it via --external so maybe the older --ext-mount-map has different semantics. Not sure.

I would say you should tell CRIU to mount exactly the same directories at the same locations during restore. I would expect --ext-mount-map to be the same during checkpoint and restore in your case, where the mountpoints in the container have exactly the same names as their external locations, if you actually want to mount the same directories and files as during checkpointing.

Wosch96 commented 2 years ago

@adrianreber I tried your approach of setting the --ext-mount-map options as before during dumping. First I shortened the dump command, as some of the external mounts were not needed:

criu dump -o dump.log -v4 -t 1268 -D ./ --shell-job \
    --ext-mount-map /etc/group:/etc/group \
    --ext-mount-map /etc/passwd:/etc/passwd \
    --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
    --ext-mount-map /var/tmp:/var/tmp \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /home/node2:/home/node2 \
    --ext-mount-map /proc/sys/fs/binfmt_misc:/proc/sys/fs/binfmt_misc \
    --ext-mount-map /proc:/proc \
    --ext-mount-map /etc/hosts:/etc/hosts \
    --ext-mount-map /usr/share/zoneinfo/UTC:/usr/share/zoneinfo/UTC \
    --ext-mount-map /dev/mqueue:/dev/mqueue \
    --ext-mount-map /dev/hugepages:/dev/hugepages \
    --ext-mount-map /dev/pts:/dev/pts \
    --ext-mount-map /dev/shm:/dev/shm \
    --ext-mount-map /dev:/dev

Then I used this restore command:

strace -o strace.log -s 256 -f criu restore -o restore.log -v4 -D ./ --shell-job --root ./ \
    --ext-mount-map /etc/group:/etc/group \
    --ext-mount-map /etc/passwd:/etc/passwd \
    --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
    --ext-mount-map /var/tmp:/var/tmp \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /home/node2:/home/node2 \
    --ext-mount-map /proc/sys/fs/binfmt_misc:/proc/sys/fs/binfmt_misc \
    --ext-mount-map /proc:/proc \
    --ext-mount-map /etc/hosts:/etc/hosts \
    --ext-mount-map /usr/share/zoneinfo/UTC:/usr/share/zoneinfo/UTC \
    --ext-mount-map /dev/mqueue:/dev/mqueue \
    --ext-mount-map /dev/hugepages:/dev/hugepages \
    --ext-mount-map /dev/pts:/dev/pts \
    --ext-mount-map /dev/shm:/dev/shm \
    --ext-mount-map /dev:/dev

I'm not sure about the root path. What should I use there? In this case the root path is /home/node2/container/criu_checkpoints; that's where I run the command. The container root path is /home/node2/container.

I'm getting this error in the strace:

1268 write(4, "(00.014934) 1268: mnt: \tMounting unsupported @/tmp/.criu.mntns.PvR0Tp/9-0000000000/usr/share/zoneinfo/UTC (0)\n", 112) = 112
1268 write(4, "(00.014952) 1268: mnt: \tBind /usr/share/zoneinfo/UTC to /tmp/.criu.mntns.PvR0Tp/9-0000000000/usr/share/zoneinfo/UTC\n", 118) = 118
1268 mount("/usr/share/zoneinfo/UTC", "/tmp/.criu.mntns.PvR0Tp/9-0000000000/usr/share/zoneinfo/UTC", NULL, MS_BIND, NULL) = -1 ENOENT (No such file or directory)

I set this external mount, so I can't explain this error.

adrianreber commented 2 years ago

Please show /proc/PID/mountinfo from one of the processes in the container.

Wosch96 commented 2 years ago

Here it is:

205 152 0:44 / / rw,nodev,relatime unbindable - overlay overlay ro,seclabel,lowerdir=/usr/local/var/singularity/mnt/session/overlay-lowerdir:/usr/local/var/singularity/mnt/session/rootfs
209 205 0:5 / /dev rw,nosuid master:107 - devtmpfs devtmpfs rw,seclabel,size=495600k,nr_inodes=123900,mode=755
210 209 0:18 / /dev/shm rw,nosuid,nodev master:110 - tmpfs tmpfs rw,seclabel
211 209 0:12 / /dev/pts rw,nosuid,noexec,relatime master:113 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
212 209 0:36 / /dev/hugepages rw,relatime master:114 - hugetlbfs hugetlbfs rw,seclabel
213 209 0:14 / /dev/mqueue rw,relatime master:115 - mqueue mqueue rw,seclabel
214 205 253:0 /usr/share/zoneinfo/Europe/Berlin /usr/share/zoneinfo/UTC rw,nosuid,nodev,relatime master:104 - xfs /dev/mapper/centos-root rw,seclabel,attr2,inode64,noquota
215 205 253:0 /etc/hosts /etc/hosts rw,nosuid,nodev,relatime master:104 - xfs /dev/mapper/centos-root rw,seclabel,attr2,inode64,noquota
216 205 0:3 / /proc rw,nosuid,nodev,noexec,relatime master:116 - proc proc rw
217 216 0:35 / /proc/sys/fs/binfmt_misc rw,relatime master:117 - autofs systemd-1 rw,fd=22,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=12388
218 205 0:17 / /sys rw,nosuid,nodev,relatime - sysfs sysfs rw,seclabel
220 205 253:0 /home/node2 /home/node2 rw,nosuid,nodev,relatime master:104 - xfs /dev/mapper/centos-root rw,seclabel,attr2,inode64,noquota
221 205 253:0 /tmp /tmp rw,nosuid,nodev,relatime master:104 - xfs /dev/mapper/centos-root rw,seclabel,attr2,inode64,noquota
222 221 0:40 / /tmp/.criu.mntns.bw8pi9 rw,relatime master:140 - tmpfs none rw,seclabel
223 221 0:41 / /tmp/.criu.mntns.fnVLvz rw,relatime master:141 - tmpfs none rw,seclabel
224 221 0:42 / /tmp/.criu.mntns.nNajIA rw,relatime master:142 - tmpfs none rw,seclabel
225 221 0:43 / /tmp/.criu.mntns.PvR0Tp rw,relatime master:143 - tmpfs none rw,seclabel
226 205 253:0 /var/tmp /var/tmp rw,nosuid,nodev,relatime master:104 - xfs /dev/mapper/centos-root rw,seclabel,attr2,inode64,noquota
227 205 0:39 /etc/resolv.conf /etc/resolv.conf rw,nosuid,relatime master:144 - tmpfs tmpfs rw,seclabel,size=16384k,uid=1000,gid=1000
228 205 0:39 /etc/passwd /etc/passwd rw,nosuid,relatime master:144 - tmpfs tmpfs rw,seclabel,size=16384k,uid=1000,gid=1000
229 205 0:39 /etc/group /etc/group rw,nosuid,relatime master:144 - tmpfs tmpfs rw,seclabel,size=16384k,uid=1000,gid=1000

This is from the PID of the process I want to checkpoint inside the container.

/usr/share/zoneinfo/Europe/Berlin /usr/share/zoneinfo/UTC

Is this the problem?

adrianreber commented 2 years ago

This is from the PID of the process I want to checkpoint inside the container.

/usr/share/zoneinfo/Europe/Berlin /usr/share/zoneinfo/UTC

Is this the problem?

Could be. Try to mount /usr/share/zoneinfo/Europe/Berlin on /usr/share/zoneinfo/UTC as it was done during checkpointing.
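I.e. something like this (a one-line sketch, run in the restore environment):

mount --bind /usr/share/zoneinfo/Europe/Berlin /usr/share/zoneinfo/UTC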

Wosch96 commented 2 years ago

I used this command with restore: --ext-mount-map /usr/share/zoneinfo/UTC:/usr/share/zoneinfo/Europe/Berlin

But still a problem:

1268 write(4, "(00.014159) 1268: mnt: \tBind /usr/share/zoneinfo/Europe/Berlin to /tmp/.criu.mntns.CND5wH/9-0000000000/usr/share/zoneinfo/UTC\n", 128) = 128
1268 mount("/usr/share/zoneinfo/Europe/Berlin", "/tmp/.criu.mntns.CND5wH/9-0000000000/usr/share/zoneinfo/UTC", NULL, MS_BIND, NULL) = -1 ENOENT (No such file or directory)
1268 write(4, "(00.014215) 1268: Error (criu/mount.c:2263): mnt: Can't mount at /tmp/.criu.mntns.CND5wH/9-0000000000/usr/share/zoneinfo/UTC: No such file or directory\n", 154) = 154
1268 statfs("/tmp/.criu.mntns.CND5wH/9-0000000000/usr/share/zoneinfo/UTC", 0x7ffe70c33b70) = -1 ENOENT (No such file or directory)
1268 write(4, "(00.014257) 1268: Error (criu/mount.c:2518): mnt: Unable to statfs /tmp/.criu.mntns.CND5wH/9-0000000000/usr/share/zoneinfo/UTC: No such file or directory\n", 156) = 156

adrianreber commented 2 years ago

This sounds like a problem I had to deal with a couple of times while integrating CRIU in container runtimes/engines.

Historically, container runtimes/engines always just create the destination mount point. In this case it sounds like /usr/share/zoneinfo/UTC does not exist and is probably created automatically by singularity during container create/run.

During restore you have to do the same thing that happens during create/run. If it is part of the container runtime/engine it can be solved there. You now need to create /usr/share/zoneinfo/UTC before CRIU runs. But if it is a nested mount point it might be tricky. Not sure if action-scripts can help here.

You probably need to handle the root directory of your container during checkpoint. It seems to be an overlay directory:

205 152 0:44 / / rw,nodev,relatime unbindable - overlay overlay ro,seclabel,lowerdir=/usr/local/var/singularity/mnt/session/overlay-lowerdir:/usr/local/var/singularity/mnt/session/rootfs

Maybe you also need an --ext-mount-map for your root directory.

Wosch96 commented 2 years ago

@adrianreber But /usr/share/zoneinfo/UTC already exists in the container, as the container works in user space. I call the CRIU dump after the container is set up and has the directory /usr/share/zoneinfo/UTC.

I don't understand why the folder /usr/share/zoneinfo/UTC can't be found. It exists in the container and on my host system.

What should I set as the root directory in this case?

adrianreber commented 2 years ago

My current understanding is that CRIU tries to mount all external mounts as they were previously. It fails to mount /usr/share/zoneinfo/UTC because the destination file usr/share/zoneinfo/UTC, relative to the container root, does not exist. You need to provide a root directory which has the mountpoint usr/share/zoneinfo/UTC. Right now I do not remember if this needs --root or the correct entry for --ext-mount-map. I would try --root first. Do you have a usr/share/zoneinfo/UTC at the location you specify with --root?
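If it is missing there, it can be created by hand before restoring (a sketch, assuming the --root from your earlier restore command):

ROOT=/home/node2/container/criu_checkpoints/criu_container_namespace
mkdir -p "$ROOT/usr/share/zoneinfo"
# the bind-mount source is a file, so the target must be an existing file too
touch "$ROOT/usr/share/zoneinfo/UTC"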

Wosch96 commented 2 years ago

I gave my root directory as the root path:

strace -o strace.log -s 256 -f criu restore -o restore.log -v4 -D ./ --shell-job --root / \
    --ext-mount-map /etc/group:/etc/group \
    --ext-mount-map /etc/passwd:/etc/passwd \
    --ext-mount-map /etc/resolv.conf:/etc/resolv.conf \
    --ext-mount-map /var/tmp:/var/tmp \
    --ext-mount-map /tmp:/tmp \
    --ext-mount-map /home/node2:/home/node2 \
    --ext-mount-map /proc/sys/fs/binfmt_misc:/proc/sys/fs/binfmt_misc \
    --ext-mount-map /proc:/proc \
    --ext-mount-map /etc/hosts:/etc/hosts \
    --ext-mount-map /dev/mqueue:/dev/mqueue \
    --ext-mount-map /dev/hugepages:/dev/hugepages \
    --ext-mount-map /dev/pts:/dev/pts \
    --ext-mount-map /dev/shm:/dev/shm \
    --ext-mount-map /dev:/dev \
    --ext-mount-map /etc/localtime:/etc/localtime

Now another error appears:

mount("sysfs", "/tmp/.criu.mntns.1Sf3Gw/9-0000000000/sys", "sysfs", MS_NOSUID|MS_NODEV|MS_RELATIME, "seclabel") = -1 EBUSY (Device or resource busy)
1300 write(4, "(00.018182) 1300: Error (criu/mount.c:1979): mnt: Unable to mount sysfs /tmp/.criu.mntns.1Sf3Gw/9-0000000000/sys (id=198): Device or resource busy\n", 149) = 149
1300 write(4, "(00.018203) 1300: Error (criu/mount.c:2044): mnt: Can't mount at /tmp/.criu.mntns.1Sf3Gw/9-0000000000/sys: Device or resource busy\n", 133) = 133
1300 write(4, "(00.018221) 1300: mnt: Start with 0:/tmp/.criu.mntns.1Sf3Gw\n", 62) = 62
1300 umount2("/tmp/cr-tmpfs.JncB1G", MNT_DETACH) = 0

I don't know how to handle sysfs... Any idea? Also an external mapping?

I hope there is an end in sight.

adrianreber commented 2 years ago

Giving your host's root as --root does not sound correct. EBUSY happens if there is already a /sys mounted, which is true if you use your host's / as --root. This sounds potentially dangerous.

From the original messages it seems like Singularity uses overlayfs for the root file system of the container. You should use that.

Anyway, as we told you at the beginning of this issue, you should try to integrate this into Singularity and not do it manually, because now you need to recreate the steps that Singularity does to create the container's root file-system.

You can just use a random directory and copy the content from the overlayfs to that directory and use it as --root.
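A sketch of that idea, copying through the still-running process's root so the merged overlay contents are visible (PID and target directory are placeholders):

mkdir -p /tmp/restore-root
cp -a /proc/PID/root/. /tmp/restore-root/
# then pass --root /tmp/restore-root to criu restore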

adrianreber commented 2 years ago

-D / looks wrong. You need to specify the directory where the checkpoint files are.

Wosch96 commented 2 years ago

I tried another restore command with the overlay as the root directory and the --external option:

strace -o strace.log -s 256 -f criu restore -o restore.log -v4 -D ./ --shell-job \
    --root /usr/local/var/singularity/mnt/session/final \
    --external mnt[/etc/group]:/usr/local/var/singularity/mnt/session/final/etc \
    --external mnt[/etc/passwd]:/usr/local/var/singularity/mnt/session/final/etc \
    --external mnt[/etc/resolv.conf]:/usr/local/var/singularity/mnt/session/final/etc \
    --external mnt[/var/tmp]:/usr/local/var/singularity/mnt/session/final/var \
    --external mnt[/tmp]:/usr/local/var/singularity/mnt/session/final/tmp \
    --external mnt[/home/node2]:/usr/local/var/singularity/mnt/session/final/home \
    --external mnt[/proc/sys/fs/binfmt_misc]:/usr/local/var/singularity/mnt/session/final/proc \
    --external mnt[/proc]:/usr/local/var/singularity/mnt/session/final/proc \
    --external mnt[/etc/hosts]:/usr/local/var/singularity/mnt/session/final/etc \
    --external mnt[/dev/mqueue]:/usr/local/var/singularity/mnt/session/final/dev \
    --external mnt[/dev/hugepages]:/usr/local/var/singularity/mnt/session/final/dev \
    --external mnt[/dev/pts]:/usr/local/var/singularity/mnt/session/final/dev \
    --external mnt[/dev/shm]:/usr/local/var/singularity/mnt/session/final/dev \
    --external mnt[/dev]:/usr/local/var/singularity/mnt/session/final/dev \
    --external mnt[/etc/localtime]:/usr/local/var/singularity/mnt/session/final/etc

Sorry that -D ./ was there, but I had printed the wrong command.

I'm running into the sysfs mount problem again.

write(4, "(00.015814) 1278: mnt: \tMounting sysfs @/tmp/.criu.mntns.66aMK9/9-0000000000/sys (0)\n", 87) = 87
1278 mount("sysfs", "/tmp/.criu.mntns.66aMK9/9-0000000000/sys", "sysfs", MS_NOSUID|MS_NODEV|MS_RELATIME, "seclabel") = -1 ENOENT (No such file or directory)
1278 write(4, "(00.015901) 1278: Error (criu/mount.c:1979): mnt: Unable to mount sysfs /tmp/.criu.mntns.66aMK9/9-0000000000/sys (id=198): No such file or directory\n", 151) = 151
1278 write(4, "(00.015921) 1278: Error (criu/mount.c:2044): mnt: Can't mount at /tmp/.criu.mntns.66aMK9/9-0000000000/sys: No such file or directory\n", 135) = 135
1278 write(4, "(00.015939) 1278: mnt: Start with 0:/tmp/.criu.mntns.66aMK9\n", 62) = 62

Infos: restore.log strace.log

adrianreber commented 2 years ago

Does /usr/local/var/singularity/mnt/session/final have a sys directory?

Wosch96 commented 2 years ago

Creating the sys folder fixed it. Now, hopefully, the last problem...

1278 mmap(NULL, 8520, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fde08e0c000
1278 munmap(0x7fde08e0c000, 8520) = 0
1278 write(127, "(00.030746) 1278: Error (criu/files-reg.c:2104): File home/node2/container/matMult has bad mode 0100755 (expect 0100775)\n", 123) = 123
1278 write(127, "(00.030765) 1278: Error (criu/mem.c:1349): - Can't open vma\n", 63) = 63

matMult is a compiled application.

adrianreber commented 2 years ago

A chmod should solve that.
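I.e. put back the mode recorded in the images, 0775 per the error message (path assuming the --root from the previous command):

chmod 0775 /usr/local/var/singularity/mnt/session/final/home/node2/container/matMult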

Wosch96 commented 2 years ago

Solved, but again an error:

(00.028160) Error (criu/cr-restore.c:1931): Can't attach to 1278: Operation not permitted
(00.028222) pie: 1278: seccomp: mode 0 on tid 1278
(00.028999) Error (criu/cr-restore.c:1986): Can't interrupt the 1278 task: No such process
(00.029019) Error (criu/cr-restore.c:2372): Can't catch all tasks
(00.029036) Error (criu/cr-restore.c:2420): Killing processes because of failure on restore. The Network was unlocked so some data or a connection may have been lost.
(00.029867) Error (criu/mount.c:3385): mnt: Can't remove the directory /tmp/.criu.mntns.0XDuDp: No such file or directory
(00.029904) Error (criu/cr-restore.c:2447): Restoring FAILED.

Before that error, it seems that ptrace is the problem:

1406 ptrace(PTRACE_SEIZE, 1398, NULL, 0) = -1 EPERM (Operation not permitted)
1406 write(4, "(00.033756) Error (criu/cr-restore.c:1931): Can't attach to 1398 : Operation not permitted\n", 90) = 90

adrianreber commented 2 years ago

Try it without strace. With strace -f attached, the restored tasks already have a tracer, and a task can have only one, so CRIU's PTRACE_SEIZE fails with EPERM.

Wosch96 commented 2 years ago

There we go. It works πŸ˜„. I was a bit brainless there at the end.

@adrianreber Incredible, thank you for your help. Sorry for the time you had to waste on me... πŸ˜…

I checkpointed a singularity container that runs a little program, nothing complex with MPI or anything. Just as info for any readers.

adrianreber commented 2 years ago

There we go. It works πŸ˜„. I was a bit brainless there at the end.

Nice.

@adrianreber Incredible, thank you for your help. Sorry for the time you had to waste on me... πŸ˜…

We were making progress with each step so it always felt like it might work in the end.

I checkpointed a singularity container that runs a little program, nothing complex with MPI or anything. Just as info for any readers.

And you restored it, right? Would you be willing to document the commands used in our wiki (criu.org)? How you started the container, how you checkpointed it, and how you restored it. That would be easier to find than buried here in the ticket. If you have a chance to document it, that would be great.

Wosch96 commented 2 years ago

And you restored it, right? Would you be willing to document the commands used in our wiki (criu.org)? How you started the container, how you checkpointed it, and how you restored it. That would be easier to find than buried here in the ticket. If you have a chance to document it, that would be great.

After your time-consuming help, of course I can document that. πŸ‘

Should I obtain a user account to edit the wiki?

Or is it maintained with github?

adrianreber commented 2 years ago

Yes, please create an account. I am not sure who has to approve it, but so far it usually happens fast.

@Snorch do you know who needs to approve wiki accounts?

Snorch commented 2 years ago

do you know who needs to approve wiki accounts?

@kolyshkin was doing that at the time I registered. But I'm a bit unsure.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

jrandall commented 1 year ago

@Wosch96 I found this while attempting to get CRIU to dump/restore a singularity container. Were you ever able to write up the commands on the wiki? I searched but couldn't find anything.

If you still have any record of how this was done, perhaps you could post it here?