checkpoint-restore / criu

Checkpoint/Restore tool
criu.org

x86: 5-level paging (restore: Page entry address ffff720a1000 outside of VMA ffff720a1000-ffffb40a4000) #706

Open weiwang999 opened 5 years ago

weiwang999 commented 5 years ago

101332: Error (criu/mem.c:990): Trying to restore page for non-private VMA
101332: Error (criu/mem.c:1120): Page entry address ffff720a1000 outside of VMA ffff720a1000-ffffb40a4000

rst0git commented 5 years ago

Hi @weiwang999, would it be possible to provide more information about this issue? What are your Linux distribution, kernel version, and CRIU version? What application are you checkpointing/restoring? It would also help to share the complete dump and restore log files.

weiwang999 commented 5 years ago

(00.000001) Version: 3.11 (gitid 9b8b4a4)
(00.000023) Running on helios-PowerEdge-R7425 Linux 4.15.0-45-generic #48~16.04.1-Ubuntu SMP Tue Jan 29 18:03:48 UTC 2019 x86_64
(00.000040) Loaded kdat cache from /run/criu.kdat
(00.000142) Reading image tree
(00.000166) Add mnt ns 5 pid 101332
(00.000196) Add net ns 2 pid 101332
(00.000203) Migrating process tree (GID 101332->7443 SID 54004->20851)
(00.000212) Will restore in 0 namespaces
(00.000240) Collecting 37/54 (flags 2)
(00.000255) Collected [popcorn/brdw/popcorn-npb-is_C] ID 0x1
(00.000261) Collected [popcorn/brdw/popcorn-npb-is_C_x86-64] ID 0x2
(00.000267) Collected [popcorn/brdw/popcorn-npb-is_C_aarch64] ID 0x3
(00.000273) Collected [dev/pts/7] ID 0x5
(00.000281) Collected [popcorn/brdw/popcorn-npb-is_C_aarch64] ID 0x6
(00.000285) Collected [popcorn/brdw/popcorn-npb-is_C_x86-64] ID 0x7
(00.000289) Collected [popcorn/brdw] ID 0x8
(00.000292) Collected [.] ID 0x9
(00.000296) Collecting 43/59 (flags 0)
(00.000300) No remap-fpath.img image
(00.000388) cg: Preparing cgroups yard (cgroups restore mode 0x4)
(00.001265) cg: Determined cgroup dir perf_event/ already exist
(00.001270) cg: Skip restoring properties on cgroup dir perf_event/
(00.001283) cg: Determined cgroup dir perf_event//docker already exist
(00.001286) cg: Skip restoring properties on cgroup dir perf_event//docker
(00.001587) cg: Determined cgroup dir pids/user.slice/user-1000.slice already exist
(00.001592) cg: Skip restoring properties on cgroup dir pids/user.slice/user-1000.slice
(00.001885) cg: Determined cgroup dir devices/user.slice already exist
(00.001889) cg: Skip restoring properties on cgroup dir devices/user.slice
(00.002168) cg: Determined cgroup dir hugetlb/ already exist
(00.002172) cg: Skip restoring properties on cgroup dir hugetlb/
(00.002181) cg: Determined cgroup dir hugetlb//docker already exist
(00.002184) cg: Skip restoring properties on cgroup dir hugetlb//docker
(00.002442) cg: Determined cgroup dir memory/user.slice already exist
(00.002446) cg: Skip restoring properties on cgroup dir memory/user.slice
(00.002705) cg: Determined cgroup dir blkio/user.slice already exist
(00.002708) cg: Skip restoring properties on cgroup dir blkio/user.slice
(00.002977) cg: Determined cgroup dir freezer/ already exist
(00.002981) cg: Skip restoring properties on cgroup dir freezer/
(00.002989) cg: Determined cgroup dir freezer//docker already exist
(00.002993) cg: Skip restoring properties on cgroup dir freezer//docker
(00.003254) cg: Determined cgroup dir cpuset/ already exist
(00.003258) cg: Skip restoring properties on cgroup dir cpuset/
(00.003265) cg: Determined cgroup dir cpuset//docker already exist
(00.003269) cg: Skip restoring properties on cgroup dir cpuset//docker
(00.003524) cg: Determined cgroup dir net_cls,net_prio/ already exist
(00.003528) cg: Skip restoring properties on cgroup dir net_cls,net_prio/
(00.003534) cg: Determined cgroup dir net_cls,net_prio//docker already exist
(00.003538) cg: Skip restoring properties on cgroup dir net_cls,net_prio//docker
(00.003795) cg: Determined cgroup dir rdma/ already exist
(00.003799) cg: Skip restoring properties on cgroup dir rdma/
(00.004054) cg: Determined cgroup dir cpu,cpuacct/user.slice already exist
(00.004058) cg: Skip restoring properties on cgroup dir cpu,cpuacct/user.slice
(00.004364) cg: Determined cgroup dir systemd/user.slice/user-1000.slice/session-839.scope already exist
(00.004369) cg: Skip restoring properties on cgroup dir systemd/user.slice/user-1000.slice/session-839.scope
(00.004407) No mountpoints-5.img image
(00.004414) No netns-2.img image
(00.004468) Forking task with 101332 pid (flags 0x0)
(00.004828) 101332: cg: Move into 2
(00.005337) 101332: Calling restore_sid() for init
(00.005355) 101332: Collecting 41/37 (flags 2)
(00.005412) 101332: tty: Collected tty ID 0x4 (pts)
(00.005434) 101332: Collecting 42/51 (flags 0)
(00.005439) 101332: No tty-data.img image
(00.005443) 101332: Restoring namespaces 101332 flags 0x0
(00.005456) 101332: Preparing info about shared resources
(00.005470) 101332: Collecting 45/38 (flags 0)
(00.005475) 101332: No filelocks.img image
(00.005482) 101332: Collecting 39/27 (flags 0)
(00.005488) 101332: No pipes-data.img image
(00.005491) 101332: Collecting 40/27 (flags 0)
(00.005495) 101332: No fifo-data.img image
(00.005499) 101332: Collecting 38/60 (flags 0)
(00.005503) 101332: No sk-queues.img image
(00.005540) 101332: vma 0x4f0000 0x4f1000
(00.005544) 101332: vma 0x500000 0x701000
(00.005546) 101332: vma 0x800000 0xa00000
(00.005547) 101332: vma 0xa00000 0xc00000
(00.005549) 101332: vma 0xc00000 0xc05000
(00.005552) 101332: vma 0x28196000 0x281b8000
(00.005554) 101332: vma 0x7fff99ec7000 0x7fff99ec9000
(00.005560) 101332: vma 0x7fff99ec9000 0x7fff99ecb000
(00.005564) 101332: vma 0xffff720a1000 0xffffb40a4000
(00.005567) 101332: vma 0xffffb40a4000 0xffffb4a16000
(00.005569) 101332: vma 0xffffb4a16000 0xffffb529e000
(00.005573) 101332: vma 0xffffc6386000 0xffffc6b86000
(00.005577) 101332: vma 0xffffffffff600000 0xffffffffff601000
(00.005586) 101332: Collect fdinfo pid=101332 fd=0 id=0x4
(00.005592) 101332: Collect fdinfo pid=101332 fd=1 id=0x4
(00.005596) 101332: Collect fdinfo pid=101332 fd=2 id=0x4
(00.005601) 101332: Collect fdinfo pid=101332 fd=3 id=0x6
(00.005605) 101332: Collect fdinfo pid=101332 fd=4 id=0x7
(00.005646) 101332: skqueue: Preparing SCMs
(00.005652) 101332: tty: Inherit terminal for id 0x4
(00.005656) 101332: tty: head driver pts id 0x4 index 7 (master 0 sid 54004 pgrp 101332 inherit 1)
(00.005663) 101332: File descs:
(00.005668) 101332: - type 1 ID 0x1
(00.005669) 101332: - type 1 ID 0x2
(00.005675) 101332: - type 1 ID 0x3
(00.005678) 101332: - type 11 ID 0x4
(00.005679) 101332: - FD 0 pid 101332
(00.005684) 101332: - FD 1 pid 101332
(00.005687) 101332: - FD 2 pid 101332
(00.005689) 101332: - type 1 ID 0x5
(00.005692) 101332: - type 1 ID 0x6
(00.005695) 101332: - FD 3 pid 101332
(00.005699) 101332: - type 1 ID 0x7
(00.005702) 101332: - FD 4 pid 101332
(00.005704) 101332: - type 1 ID 0x8
(00.005709) 101332: - type 1 ID 0x9
(00.005864) 101332: Error (criu/mem.c:990): Trying to restore page for non-private VMA
(00.005870) 101332: Error (criu/mem.c:1120): Page entry address ffff720a1000 outside of VMA ffff720a1000-ffffb40a4000
(00.043450) Error (criu/cr-restore.c:1424): 101332 killed by signal 9: Killed
(00.043495) Error (criu/cr-restore.c:2300): Restoring FAILED.

avagin commented 5 years ago

@rst0git could you take a look at this?

rst0git commented 5 years ago

I'm not sure how to reproduce the issue. @weiwang999 could you please provide more information about the environment that you are using and the process which is being checkpointed/restored? I'm assuming that the issue is related to https://github.com/checkpoint-restore/criu/issues/687.

adrianreber commented 5 years ago

The git hash in the restore.log does not exist in my git tree. This is the same git id as in #710. Just as in #710, the log contains the following lines:

(00.000255) Collected [popcorn/brdw/popcorn-npb-is_C] ID 0x1
(00.000261) Collected [popcorn/brdw/popcorn-npb-is_C_x86-64] ID 0x2
(00.000267) Collected [popcorn/brdw/popcorn-npb-is_C_aarch64] ID 0x3

Is this somehow related to @rppt's multi-arch migration code?

avagin commented 5 years ago

In CRIU, TASK_SIZE is hardcoded for x86_64:

#define TASK_SIZE       ((1UL << 47) - PAGE_SIZE)

But after recent changes in the kernel, it can have two values:

#ifdef CONFIG_X86_5LEVEL
#define __VIRTUAL_MASK_SHIFT    (pgtable_l5_enabled() ? 56 : 47)
#else
#define __VIRTUAL_MASK_SHIFT    47
#endif

I think we need to update compel_task_size() in CRIU.
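To make the two possible limits concrete, here is a minimal sketch of the arithmetic, assuming 4 KiB pages and a 64-bit `unsigned long`. The macro `SKETCH_PAGE_SIZE` and the helper `task_size_for_shift()` are hypothetical names used only for illustration; they are not part of CRIU or the kernel.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on typical x86_64 systems. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: the user-space address limit implied by a given
 * virtual mask shift, following the (1UL << shift) - PAGE_SIZE pattern
 * quoted above.
 */
static unsigned long task_size_for_shift(unsigned int shift)
{
	/* Guard against shifts that would underflow or overflow. */
	assert(shift > 12 && shift < 64);
	return (1UL << shift) - SKETCH_PAGE_SIZE;
}
```

With a shift of 47 this gives 0x7ffffffff000, and with a shift of 56 it gives 0xfffffffffff000, so a single hardcoded value cannot describe both configurations.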

rst0git commented 5 years ago

> But after recent changes in the kernel, it can have two values:

Thank you for the hint. IIUC, the error occurs when 5-level paging is enabled?

rppt commented 5 years ago

> Is this somehow related to @rppt's multi-arch migration code?

It's not mine but judging by the file paths it might be similar work from Virginia Tech.

In case it is, the TASK_SIZE definition differs between x86 and aarch64, and this is likely the cause of the error.

avagin commented 5 years ago

TASK_SIZE on arm64 can be 0xfffffffff000, so I think Mike is right.
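As a rough illustration of why the mapping from the restore log is rejected, the sketch below compares the logged VMA against the two limits mentioned in this thread. The macro names and the `vma_fits()` helper are hypothetical and exist only for this example; 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Limits quoted in this thread; the names are illustrative only. */
#define X86_64_TASK_SIZE_47 ((1UL << 47) - 4096UL)
#define ARM64_TASK_SIZE_48  0xfffffffff000UL

/* Hypothetical check: does [start, end) lie entirely below task_size? */
static bool vma_fits(unsigned long start, unsigned long end,
		     unsigned long task_size)
{
	assert(start < end);
	return end <= task_size;
}
```

Calling `vma_fits(0xffff720a1000UL, 0xffffb40a4000UL, ...)` with the arm64 limit returns true, while the same range does not fit under the 47-bit x86_64 limit. That would be consistent with an image containing aarch64-style mappings being restored on x86_64.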

avagin commented 5 years ago

Who wants to fix the issue that I described in https://github.com/checkpoint-restore/criu/issues/706#issuecomment-509920654?

rst0git commented 5 years ago

> Who wants to fix the issue that I described in #706 (comment)?

I will look into that over the weekend.

rst0git commented 5 years ago

5-level paging is available in IA-32e mode. By default, even when the kernel is built with CONFIG_X86_5LEVEL=y, it will not allocate virtual address space above 47 bits unless the hint address passed to mmap() explicitly specifies a high address.
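The hint behaviour can be probed from user space with a sketch like the following. It is only an illustration: `map_with_high_hint()` and `is_above_47bit()` are hypothetical helper names, and the hint value `1UL << 48` is an arbitrary address above the 47-bit boundary. On a kernel or CPU without 5-level paging, mmap() is expected to ignore the hint and return a conventional low address.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical probe: request an anonymous private mapping with a hint
 * above the 47-bit boundary. Without MAP_FIXED the kernel treats the
 * address purely as a hint. Returns the mapping or MAP_FAILED.
 */
static void *map_with_high_hint(size_t len)
{
	void *hint = (void *)(1UL << 48);

	assert(len > 0);
	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* True if addr lies at or above the 47-bit user address boundary. */
static bool is_above_47bit(const void *addr)
{
	return (uintptr_t)addr >= (1UL << 47);
}
```

A caller would check the result against MAP_FAILED, pass it to `is_above_47bit()` to see whether the kernel honoured the high hint, and release it with munmap().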

@0x7f454c46 do you have any ideas on how we could add support for 5-level paging in CRIU?

0x7f454c46 commented 5 years ago

Sure. As per my understanding of https://elixir.bootlin.com/linux/latest/source/arch/x86/mm/mmap.c#L194, it should be pretty easy. As CRIU restores mappings with MAP_FIXED, there shouldn't be many 5-level-related issues; it will probably be largely invisible to CRIU. So far, it seems that only TASK_SIZE for x86 in CRIU needs to be corrected. We probably want to:

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.
