checkpoint-restore / criu

Checkpoint/Restore tool
criu.org

If the process happens to coredump when criu executes the dump #2328

Open hdzhoujie opened 10 months ago

hdzhoujie commented 10 months ago

If the process happens to coredump while criu is executing the dump, will the dump still succeed? And after criu restores the process successfully, will the process produce a normal coredump?

adrianreber commented 10 months ago

Not sure what you mean. Can you share more details?

What is a normal dump?

hdzhoujie commented 10 months ago

After criu executes the restore successfully, the process immediately coredumps.

The stack is as follows:

```
Call Trace:
 dump_stack+0x6f/0xab
 dump_header+0x54/0x300
 oom_kill_process+0xd1/0x100
 out_of_memory+0x11c/0x550
 mem_cgroup_out_of_memory+0xb5/0xd0
 try_charge+0x720/0x770
 mem_cgroup_try_charge+0x86/0x180
 mem_cgroup_try_charge_delay+0x1c/0x40
 shmem_getpage_gfp+0x1cc/0xbc0
 shmem_write_begin+0x35/0x60
 generic_perform_write+0xb6/0x1b0
 generic_file_write_iter+0x192/0x1c0
 generic_file_write_iter+0xec/0x160
 new_sync_write+0x124/0x170
 kernel_write+0x4f/0xf0
 dump_emit+0x6c/0xc0
 elf_core_dump+0x9c6/0xb9a
 do_coredump+0x89a/0x1170
 ? copyin+0x20/0x30
 ? generic_perform_write+0x12b/0x1b0
 get_signal+0x155/0x850
 do_signal+0x36/0x610
 ? recalc_sigpending+0x17/0x50
 ? recalc_sigpending+0x17/0x50
 exit_to_usermode_loop+0x76/0xe0
 do_syscall_64+0x18c/0x1d0
 entry_SYSCALL_64_after_hwframe+0x65/0xca
RIP: 0033:0x7f06f4d7d81b
Code: da b8 ea 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3f 41 89 c0 41 ba 08 00 00 00 31 d2 4c 89 ce bf 02 00 00 00 b8 0e 00 00 00 0f 05 <48> 8b 8c 24 08 01 00 00 64 48 33 0c 25 28 00 00 00 44 89 c0 75 1d
RSP: 002b:00007f06f48a34c0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 00007f06f4d7d81b
RDX: 0000000000000000 RSI: 00007f06f48a34c0 RDI: 0000000000000002
RBP: 00007f06f48a3840 R08: 0000000000000000 R09: 00007f06f48a34c0
R10: 0000000000000008 R11: 0000000000000246 R12: 00007f06f48a3730
R13: 000000000000001d R14: 00007f06f48a3730 R15: 0000000000000002
memory: usage 8388608kB, limit 8388608kB, failcnt 1784
memory+swap: usage 8388608kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /system.slice/system-daemon1.slice: cache:3411232KB rss:4977032KB rss_huge:4975840KB shmem:3411408KB mapped_file:14256KB dirty:0KB writeback:0KB swap:0KB inactive_anon:3411144KB active_anon:4977272KB inactive_file:0KB active_file:0KB unevictable:0KB
Memory cgroup stats for /system.slice/system-daemon1.slice/evsd.service: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[ pid ]   uid  tgid total_vm     rss pgtables_bytes swapents oom_score_adj name
[80309] 10000 80309  4156142 1249338       10366976        0             0 evs_daemon
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=system-daemon1.slice,mems_allowed=0,oom_memcg=/system.slice/system-daemon1.slice,task_memcg=/system.slice/system-daemon1.slice,task=daemon,pid=80309,uid=10000
```

adrianreber commented 10 months ago

Looks like your process is killed because it is out of memory. Are you restoring on the same system? It seems the cgroup does not have enough memory.
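For context on this diagnosis: the OOM report above blames the memory cgroup limit (CONSTRAINT_MEMCG), not system RAM. A minimal sketch for checking the cgroup's limit and usage on the restore host, assuming a cgroup v1 hierarchy and the slice path shown in the log (adjust for your system):

```shell
# Inspect the memory cgroup the restored process lands in (cgroup v1 layout;
# the slice path is taken from the OOM report and will differ per system).
CG=/sys/fs/cgroup/memory/system.slice/system-daemon1.slice

for f in memory.limit_in_bytes memory.usage_in_bytes memory.failcnt; do
  # Only read files that exist, so this is safe on hosts without the slice.
  [ -r "$CG/$f" ] && printf '%s: %s\n' "$f" "$(cat "$CG/$f")"
done

# The report shows usage == limit == 8388608 kB, i.e. the cgroup is pinned
# at its ceiling, so the coredump's shmem writes push it over the edge:
echo $(( 8388608 * 1024 ))   # 8589934592 bytes = 8 GiB
```

If the limit on the restore host is lower than on the dump host, either raise it (e.g. `MemoryLimit=` in the systemd unit) or restore into a cgroup with enough headroom for both the process RSS and any coredump it writes.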

github-actions[bot] commented 9 months ago

A friendly reminder that this issue had no activity for 30 days.