checkpoint-restore / criu

CRIU dump with process running on nfs share mounted with root_squash option. #1586

Open leomem opened 3 years ago

leomem commented 3 years ago

When I use criu to dump a process that runs as a non-root user from an NFS share mounted with the root_squash option, the dump fails whether criu itself runs as root or as that user. The same process can be dumped when it runs from local disk as the same user.

When using criu to dump the process as root:

# ./criu dump -t 2582 --shell-job
Error (criu/proc_parse.c:447): Can't open map_files: Permission denied
Error (criu/proc_parse.c:641): Can't open 2582's mapfile link 400000: Permission denied
Error (criu/cr-dump.c:1262): Collect mappings (pid: 2582) failed with -1
Error (criu/cr-dump.c:1781): Dumping FAILED.

When an NFS share is mounted with the root_squash option, user root is mapped to nfsnobody and has no access to the process binary or to any files on the share that the process has opened. That is probably where the "Permission denied" error comes from. Is there any option to get around this issue? Thanks a lot.

Snorch commented 3 years ago

I haven't tested this, but this patch would probably help:

diff --git a/criu/proc_parse.c b/criu/proc_parse.c
index f3491e781..20117d5a6 100644
--- a/criu/proc_parse.c
+++ b/criu/proc_parse.c
@@ -361,6 +361,7 @@ static int vma_get_mapfile(const char *fname, struct vma_area *vma, DIR *mfd, st
 {
        char path[32];
        int flags;
+       bool retried = false;

        /* Figure out if it's file mapping */
        snprintf(path, sizeof(path), "%" PRIx64 "-%" PRIx64, vma->e->start, vma->e->end);
@@ -411,6 +412,7 @@ static int vma_get_mapfile(const char *fname, struct vma_area *vma, DIR *mfd, st
                 */
                flags = O_RDONLY;

+retry_o_path:
        *vm_file_fd = openat(dirfd(mfd), path, flags);
        if (*vm_file_fd < 0) {
                if (errno == ENOENT)
@@ -445,6 +447,11 @@ static int vma_get_mapfile(const char *fname, struct vma_area *vma, DIR *mfd, st
                        return vma_get_mapfile_user(fname, vma, vfi, vm_file_fd, path);

                pr_perror("Can't open map_files");
+               if (!retried) {
+                       flags = O_PATH;
+                       retried = true;
+                       goto retry_o_path;
+               }
                return -1;
        }

We try to open with O_RDONLY, and you probably don't have read access to the file. But we actually need O_RDONLY only for special files (sockets and AIO rings), which is probably not your case, so everything else should just work with the O_PATH fallback.
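
For readers who want the retry logic in isolation, here is a minimal standalone sketch of the same idea: try O_RDONLY first and fall back to O_PATH only when the failure is a permission error. The helper name `open_rdonly_or_path` and its `is_path_fd` out-parameter are hypothetical and not part of CRIU; this has not been tested against a root-squashed NFS mount.

```c
#define _GNU_SOURCE /* for O_PATH */
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper, not CRIU code: try to get a readable descriptor
 * first, and fall back to O_PATH only when the failure is a permission
 * problem. *is_path_fd tells the caller which kind of descriptor it got.
 * Returns the descriptor, or -1 with errno set.
 */
static int open_rdonly_or_path(int dirfd, const char *path, bool *is_path_fd)
{
	int fd;

	*is_path_fd = false;

	fd = openat(dirfd, path, O_RDONLY);
	if (fd >= 0)
		return fd;

	/* Any other error (ENOENT, ...) is reported to the caller as-is. */
	if (errno != EACCES && errno != EPERM)
		return -1;

	fd = openat(dirfd, path, O_PATH);
	if (fd >= 0)
		*is_path_fd = true;
	return fd;
}
```

An O_PATH descriptor cannot be read from, so the fallback only helps callers that need the descriptor to identify the file (for example with fstat()), not to read its contents.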

bodgerer commented 2 years ago

Hi there,

I'm seeing a similar issue. I'm using criu 3.17.1, running as root, to dump a non-root bash script whose script file is on a root-squashed NFS filesystem and is not accessible by root. I see:

# criu dump -D save2 --shell-job -t 57981
Error (criu/files-reg.c:1347): Can't stat path: Permission denied
Error (criu/cr-dump.c:1635): Dump files (pid: 57981) failed with -1
Error (criu/cr-dump.c:2053): Dumping FAILED.

Any ideas, please? Thanks!

girpierr commented 4 months ago

Hi all, I'm facing the same issue because the executed binary is stored on an NFS share mounted with root_squash. I tried the patch proposed by Snorch. It gets past this step of the checkpoint, but the dump fails later on a stat operation.

(00.058202) Error (criu/proc_parse.c:475): Can't open map_files: Permission denied
(00.058526) Found regular file mapping, OK
(00.058733) Dumping path for -3 fd via self 11 [/applis/site/cecic/gaussian16/B.01/sandybridge/g16/l502.exe]
(00.059869) Error (criu/files-reg.c:1426): Can't stat path: Permission denied
(00.059879) Error (criu/cr-dump.c:1563): Collect mappings (pid: 51533) failed with -1

I then decided to try a new approach: since the process itself has read permission, I thought that calling seteuid with the process UID could be a solution. I made an ugly proof of concept that retries the stat operation using my own UID (hardcoded in the CRIU code, so I compiled a "personal" CRIU). I had to customize criu/proc_parse.c and criu/files-reg.c.

Checkpointing now works!
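
A rough sketch of that kind of retry, written as a standalone function rather than as the actual changes to criu/proc_parse.c and criu/files-reg.c: the helper name `stat_as_uid` is hypothetical, and the sketch assumes that switching only the effective UID is enough. On some shares the effective GID or supplementary groups may also need to be switched.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not CRIU code: stat() a path with the effective
 * UID temporarily set to `uid`, then restore the previous effective UID.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int stat_as_uid(const char *path, uid_t uid, struct stat *st)
{
	uid_t saved = geteuid();
	int ret, err;

	/* When called as root, the saved set-user-ID stays 0, so we can switch back. */
	if (seteuid(uid) < 0)
		return -1;

	ret = stat(path, st);
	err = errno;

	/* Failing to regain the original identity is treated as an error. */
	if (seteuid(saved) < 0)
		return -1;

	errno = err;
	return ret;
}
```

A caller would first attempt the plain stat() and use `stat_as_uid(path, task_uid, &st)` only after an EACCES failure, where `task_uid` stands for the UID of the dumped process.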

But the restore is still problematic. I tried to follow the same approach there, but CRIU no longer compiles, because the PIE part is built with -nostdlib:

Error (compel/src/lib/handle-elf-host.c:337): Unexpected undefined symbol: `seteuid'. External symbol in PIE?
make[2]: *** [criu/pie/Makefile:58: criu/pie/restorer-blob.h] Error 255
make[1]: *** [criu/Makefile:59: pie] Error 2
make: *** [Makefile:267: criu] Error 2

I was not able to solve this compilation problem. So for now I can checkpoint but I can't restore. Any suggestion is welcome! Thanks, Pierre