checkpoint-restore / criu

Checkpoint/Restore tool
criu.org

If the process happens to coredump when criu executes the dump #2328

Open hdzhoujie opened 8 months ago

hdzhoujie commented 8 months ago

If the process happens to coredump while criu is executing the dump, will the dump still succeed? And after criu executes the restore successfully, will the process generate a normal coredump?
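For context, the dump/restore cycle being asked about would typically be driven like this (the flags are real criu options; the PID and image directory are placeholders, and the commands are echoed rather than executed here since a real run needs root):

```shell
#!/bin/sh
# Hypothetical dump/restore cycle; PID and IMGDIR are placeholders.
# Commands are echoed so the sketch is safe to run without root;
# drop the 'echo' prefix to execute them for real.
PID=1234
IMGDIR=/tmp/ckpt
mkdir -p "$IMGDIR"
echo criu dump -t "$PID" -D "$IMGDIR" --shell-job -v4 -o dump.log
echo criu restore -D "$IMGDIR" --shell-job -v4 -o restore.log
```

The `-o dump.log` / `-o restore.log` logs are usually the first place to look when deciding whether a dump that raced with a coredump actually completed.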

adrianreber commented 8 months ago

Not sure what you mean. Can you share more details?

What is a normal dump?

hdzhoujie commented 8 months ago

After criu executes the restore successfully, the process coredumps immediately.

The kernel stack trace is as follows:

```
Call Trace:
 dump_stack+0x6f/0xab
 dump_header+0x54/0x300
 oom_kill_process+0xd1/0x100
 out_of_memory+0x11c/0x550
 mem_cgroup_out_of_memory+0xb5/0xd0
 try_charge+0x720/0x770
 mem_cgroup_try_charge+0x86/0x180
 mem_cgroup_try_charge_delay+0x1c/0x40
 shmem_getpage_gfp+0x1cc/0xbc0
 shmem_write_begin+0x35/0x60
 generic_perform_write+0xb6/0x1b0
 generic_file_write_iter+0x192/0x1c0
 generic_file_write_iter+0xec/0x160
 new_sync_write+0x124/0x170
 kernel_write+0x4f/0xf0
 dump_emit+0x6c/0xc0
 elf_core_dump+0x9c6/0xb9a
 do_coredump+0x89a/0x1170
 ? copyin+0x20/0x30
 ? generic_perform_write+0x12b/0x1b0
 get_signal+0x155/0x850
 do_signal+0x36/0x610
 ? recalc_sigpending+0x17/0x50
 ? recalc_sigpending+0x17/0x50
 exit_to_usermode_loop+0x76/0xe0
 do_syscall_64+0x18c/0x1d0
 entry_SYSCALL_64_after_hwframe+0x65/0xca
RIP: 0033:0x7f06f4d7d81b
Code: da b8 ea 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3f 41 89 c0 41 ba 08 00 00 00 31 d2 4c 89 ce bf 02 00 00 00 b8 0e 00 00 00 0f 05 <48> 8b 8c 24 08 01 00 00 64 48 33 0c 25 28 00 00 00 44 89 c0 75 1d
RSP: 002b:00007f06f48a34c0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 00007f06f4d7d81b
RDX: 0000000000000000 RSI: 00007f06f48a34c0 RDI: 0000000000000002
RBP: 00007f06f48a3840 R08: 0000000000000000 R09: 00007f06f48a34c0
R10: 0000000000000008 R11: 0000000000000246 R12: 00007f06f48a3730
R13: 000000000000001d R14: 00007f06f48a3730 R15: 0000000000000002
memory: usage 8388608kB, limit 8388608kB, failcnt 1784
memory+swap: usage 8388608kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /system.slice/system-daemon1.slice: cache:3411232KB rss:4977032KB rss_huge:4975840KB shmem:3411408KB mapped_file:14256KB dirty:0KB writeback:0KB swap:0KB inactive_anon:3411144KB active_anon:4977272KB inactive_file:0KB active_file:0KB unevictable:0KB
Memory cgroup stats for /system.slice/system-daemon1.slice/evsd.service: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[ pid ]   uid  tgid total_vm     rss pgtables_bytes swapents oom_score_adj name
[80309] 10000 80309  4156142 1249338       10366976        0             0 evs_daemon
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=system-daemon1.slice,mems_allowed=0,oom_memcg=/system.slice/system-daemon1.slice,task_memcg=/system.slice/system-daemon1.slice,task=daemon,pid=80309,uid=10000
```

adrianreber commented 8 months ago

Looks like your process is killed because it is out of memory. Are you restoring on the same system? It seems the cgroup does not have enough memory.
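The numbers in the OOM report above make this concrete: the memcg is already at its 8 GiB limit, so the coredump's shmem write cannot charge a single further page. A minimal sketch of the check (values copied from the report; the cgroup paths in the comments are assumptions matching the hierarchy the trace names, cgroup v1 style):

```shell
#!/bin/sh
# Values taken from the OOM report above (in kB): usage == limit.
limit_kb=8388608
usage_kb=8388608
if [ "$usage_kb" -ge "$limit_kb" ]; then
    echo "cgroup at limit: the next charge (here, the coredump's shmem write) triggers the OOM killer"
fi
# On a live system the same numbers can be read from the memcg directly,
# e.g. (hypothetical cgroup v1 paths, matching the report's hierarchy):
#   cat /sys/fs/cgroup/memory/system.slice/system-daemon1.slice/memory.usage_in_bytes
#   cat /sys/fs/cgroup/memory/system.slice/system-daemon1.slice/memory.limit_in_bytes
```

If this is the cause, restoring into a cgroup with headroom above the process's working set (or raising the limit on the restore side) should avoid the kill.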

github-actions[bot] commented 7 months ago

A friendly reminder that this issue had no activity for 30 days.