opencontainers / runc

CLI tool for spawning and running containers according to the OCI specification
https://www.opencontainers.org/
Apache License 2.0
11.79k stars 2.1k forks source link

TestCheckpoint fails on Docker overlay2 #1070

Closed AkihiroSuda closed 6 years ago

AkihiroSuda commented 8 years ago

TestCheckpoint fails consistently on overlay2 but succeeds on overlay. Perhaps the issue of CRIU rather than runc itself?

commit: e83ccf62aac17bf8656b23655638ad4300ea7adc (which uses CRIU v1.7)

$ make test
=== RUN   TestCheckpoint                                                                                                                                                                                                                                                                                           [270/2332]
--- FAIL: TestCheckpoint (0.65s)
        checkpoint_test.go:18: === /tmp/criu961850879/restore.log ===
        checkpoint_test.go:29: (00.000021) Added /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd ext mount mapping
        checkpoint_test.go:29: (00.000044) Added /sys/fs/cgroup/cpu,cpuacct:/sys/fs/cgroup/cpu,cpuacct ext mount mapping
        checkpoint_test.go:29: (00.000047) Added /sys/fs/cgroup/devices:/sys/fs/cgroup/devices ext mount mapping
        checkpoint_test.go:29: (00.000049) Added /sys/fs/cgroup/cpuset:/sys/fs/cgroup/cpuset ext mount mapping
        checkpoint_test.go:29: (00.000050) Added /sys/fs/cgroup/freezer:/sys/fs/cgroup/freezer ext mount mapping
        checkpoint_test.go:29: (00.000052) Added /sys/fs/cgroup/net_cls,net_prio:/sys/fs/cgroup/net_cls,net_prio ext mount mapping
        checkpoint_test.go:29: (00.000053) Added /sys/fs/cgroup/perf_event:/sys/fs/cgroup/perf_event ext mount mapping
        checkpoint_test.go:29: (00.000055) Added /sys/fs/cgroup/hugetlb:/sys/fs/cgroup/hugetlb ext mount mapping
        checkpoint_test.go:29: (00.000056) Added /sys/fs/cgroup/pids:/sys/fs/cgroup/pids ext mount mapping
        checkpoint_test.go:29: (00.000058) Added /sys/fs/cgroup/blkio:/sys/fs/cgroup/blkio ext mount mapping
        checkpoint_test.go:29: (00.000060) Added /sys/fs/cgroup/memory:/sys/fs/cgroup/memory ext mount mapping
        checkpoint_test.go:29: (00.000132) Error (sysctl.c:142): Can't open sysctl net/ipv4/tcp_rmem: No such file or directory
        checkpoint_test.go:29: (00.000137) Warn  (kerndat.c:210): TCP mem sysctls are not available. Using defaults.
        checkpoint_test.go:29: (00.000138) TCP recv queue memory limit is 3145728
        checkpoint_test.go:29: (00.000155) Found task size of 7ffffffff000
        checkpoint_test.go:29: (00.000178) cpu: fpu:1 fxsr:1 xsave:1
        checkpoint_test.go:29: (00.000288) vdso: Parsing at 7fff0e7c1000 7fff0e7c3000
        checkpoint_test.go:29: (00.000293) vdso: PT_LOAD p_vaddr: 0
        checkpoint_test.go:29: (00.000296) vdso: DT_HASH: 0x120
        checkpoint_test.go:29: (00.000298) vdso: DT_STRTAB: 0x2b0
        checkpoint_test.go:29: (00.000300) vdso: DT_SYMTAB: 0x1a8
        checkpoint_test.go:29: (00.000301) vdso: DT_STRSZ: 94
        checkpoint_test.go:29: (00.000302) vdso: DT_SYMENT: 24
        checkpoint_test.go:29: (00.000304) vdso: nbucket 3 nchain 11 bucket 0x7fff0e7c1128 chain 0x7fff0e7c1134
        checkpoint_test.go:29: (00.000309) vdso: rt [vdso] 7fff0e7c1000-7fff0e7c3000 [vvar] 7fff0e7bf000-7fff0e7c1000
        checkpoint_test.go:29: (00.000350) Reading image tree
        checkpoint_test.go:29: (00.000374) Add mnt ns 11 pid 1
        checkpoint_test.go:29: (00.000381) Will restore in 6c020000 namespaces
        checkpoint_test.go:29: (00.000383) NS mask to use 6c020000
        checkpoint_test.go:29: (00.000387) Collecting 37/21 (flags 0)
        checkpoint_test.go:29: (00.000391) No inetsk.img image
        checkpoint_test.go:29: (00.000393)  `- ... done
        checkpoint_test.go:29: (00.000419) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000424) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000426) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000428) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000430) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000432) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000436) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000438) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000439) cg: rewriting docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test to /docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000441) cg: Preparing cgroups yard (cgroups restore mode 0x4)
        checkpoint_test.go:29: (00.000487) cg: Opening .criu.cgyard.DKeIcG as cg yard
        checkpoint_test.go:29: (00.000496) cg:  Making controller dir .criu.cgyard.DKeIcG/cpuset (cpuset)
        checkpoint_test.go:29: (00.000523) cg: Determined cgroup dir cpuset//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000528) cg: Skip restoring properties on cgroup dir cpuset//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000532) cg:  Making controller dir .criu.cgyard.DKeIcG/cpu,cpuacct (cpu,cpuacct)
        checkpoint_test.go:29: (00.000550) cg: Determined cgroup dir cpu,cpuacct/docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000552) cg: Skip restoring properties on cgroup dir cpu,cpuacct/docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000556) cg:  Making controller dir .criu.cgyard.DKeIcG/blkio (blkio)
        checkpoint_test.go:29: (00.000571) cg: Determined cgroup dir blkio//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000573) cg: Skip restoring properties on cgroup dir blkio//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000576) cg:  Making controller dir .criu.cgyard.DKeIcG/memory (memory)
        checkpoint_test.go:29: (00.000592) cg: Determined cgroup dir memory//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000593) cg: Skip restoring properties on cgroup dir memory//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000597) cg:  Making controller dir .criu.cgyard.DKeIcG/devices (devices)
        checkpoint_test.go:29: (00.000614) cg: Determined cgroup dir devices//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000615) cg: Skip restoring properties on cgroup dir devices//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000618) cg:  Making controller dir .criu.cgyard.DKeIcG/freezer (freezer)
        checkpoint_test.go:29: (00.000633) cg: Determined cgroup dir freezer//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000635) cg: Skip restoring properties on cgroup dir freezer//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000638) cg:  Making controller dir .criu.cgyard.DKeIcG/net_cls,net_prio (net_cls,net_prio)
        checkpoint_test.go:29: (00.000655) cg: Determined cgroup dir net_cls,net_prio/docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000657) cg: Skip restoring properties on cgroup dir net_cls,net_prio/docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000660) cg:  Making controller dir .criu.cgyard.DKeIcG/perf_event (perf_event)
        checkpoint_test.go:29: (00.000676) cg: Determined cgroup dir perf_event//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000677) cg: Skip restoring properties on cgroup dir perf_event//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000680) cg:  Making controller dir .criu.cgyard.DKeIcG/hugetlb (hugetlb)
        checkpoint_test.go:29: (00.000695) cg: Determined cgroup dir hugetlb//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000697) cg: Skip restoring properties on cgroup dir hugetlb//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000699) cg:  Making controller dir .criu.cgyard.DKeIcG/pids (pids)
        checkpoint_test.go:29: (00.000715) cg: Determined cgroup dir pids//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000716) cg: Skip restoring properties on cgroup dir pids//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000719) cg:  Making controller dir .criu.cgyard.DKeIcG/systemd (none,name=systemd)
        checkpoint_test.go:29: (00.000741) cg: Determined cgroup dir systemd//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test already exist
        checkpoint_test.go:29: (00.000742) cg: Skip restoring properties on cgroup dir systemd//docker/287c52b33905e523e9688d636303630fbba772c14747147415273c7ce0a8ab06/integration/test
        checkpoint_test.go:29: (00.000784) Warn  (cr-restore.c:1041): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
        checkpoint_test.go:29: (00.000789) Forking task with 1 pid (flags 0x6c028000)
        checkpoint_test.go:29: (00.000801) Saved netns fd for links restore
        checkpoint_test.go:29: (00.001450) PID: real 796 virt 1
        checkpoint_test.go:29: (00.001501) Wait until namespaces are created
        checkpoint_test.go:29: (00.126272) Running setup-namespaces scripts
        checkpoint_test.go:29: (00.126320)      RPC
        checkpoint_test.go:29: (00.126521)      1: Restoring namespaces 1 flags 0x6c028000
        checkpoint_test.go:29: (00.126892)      1: Restoring link lo type 1
        checkpoint_test.go:29: (00.127247)      1:      Running ip addr restore
        checkpoint_test.go:29: (00.128067)      1: Found fd 0 (id pipe:[1581671]) in inherit fd list (caller inherit_fd_resolve_clash)
        checkpoint_test.go:29: (00.128106)      1: Inherit fd 0 moved to 9 to resolve clash
        checkpoint_test.go:29: (00.128111)      1: Found fd 1 (id pipe:[1581674]) in inherit fd list (caller inherit_fd_resolve_clash)
        checkpoint_test.go:29: (00.128112)      1: Inherit fd 1 moved to 7 to resolve clash
        checkpoint_test.go:29: RTNETLINK answers: File exists
        checkpoint_test.go:29: RTNETLINK answers: File exists
        checkpoint_test.go:29: (00.129045)      1:      Running ip route restore
        checkpoint_test.go:29: (00.129635)      1: Found fd 0 (id pipe:[1581671]) in inherit fd list (caller inherit_fd_resolve_clash)
        checkpoint_test.go:29: (00.129661)      1: Inherit fd 0 moved to 9 to resolve clash
        checkpoint_test.go:29: (00.129664)      1: Found fd 1 (id pipe:[1581674]) in inherit fd list (caller inherit_fd_resolve_clash)
        checkpoint_test.go:29: (00.129666)      1: Inherit fd 1 moved to 7 to resolve clash
        checkpoint_test.go:29: (00.130275)      1:      Running iptables-restore for iptables-restore
        checkpoint_test.go:29: (00.130799)      1: Found fd 0 (id pipe:[1581671]) in inherit fd list (caller inherit_fd_resolve_clash)
        checkpoint_test.go:29: (00.130823)      1: Inherit fd 0 moved to 9 to resolve clash
        checkpoint_test.go:29: (00.130827)      1: Found fd 1 (id pipe:[1581674]) in inherit fd list (caller inherit_fd_resolve_clash)
        checkpoint_test.go:29: (00.130828)      1: Inherit fd 1 moved to 7 to resolve clash
        checkpoint_test.go:29: (00.132467)      1: kernel/hostname nr 11
        checkpoint_test.go:29: (00.132509)      1: kernel/domainname nr 6
        checkpoint_test.go:29: (00.132522)      1: Restoring IPC namespace
        checkpoint_test.go:29: (00.132523)      1: Restoring IPC variables
        checkpoint_test.go:29: (00.132585)      1: Restoring IPC shared memory
        checkpoint_test.go:29: (00.132589)      1: No ipcns-shm-9.img image
        checkpoint_test.go:29: (00.132592)      1: Restoring IPC message queues
        checkpoint_test.go:29: (00.132594)      1: No ipcns-msg-9.img image
        checkpoint_test.go:29: (00.132596)      1: Restoring IPC semaphores sets
        checkpoint_test.go:29: (00.132599)      1: No ipcns-sem-9.img image
        checkpoint_test.go:29: (00.132601)      1: Restoring mount namespace
        checkpoint_test.go:29: (00.132670)      1:      type overlay source overlay mnt_id 100 s_dev 0x29 / @ ./ flags 0x200000 options lowerdir=/var/lib/docker/overlay2/l/JLFPDJKBC
F4FZ7G6M6KVBJATAO:/var/lib/docker/overlay2/l/AGMS6NBMF647JIGVTDTVESGSZO:/var/lib/docker/overlay2/l/EDCXHUBMHMPP4ODQROJOGZJUS6:/var/lib/docker/overlay2/l/CRMFDAWMVYQ5XQCXBJB5QO4N4P:/
var/lib/docker/overlay2/l/GT2SRI4NJZD22NN7R3TJL7HTO3:/var/lib/docker/overlay2/l/6E63UUMHFFQL3ODTCM2QIQVSM2:/var/lib/docker/overlay2/l/WJB5UV4KUUS7IJ7MHSB2ZTSVDU:/var/lib/docker/over
lay2/l/BDAMM7USBF6H53S24J5F7LGSTK:/var/lib/docker/overlay2/l/3Z3ZBLVUY364LQVWMQMIWRC6JQ:/var/lib/docker/overlay2/l/XPW4GMBMLESZ6DRJT4JRA5JHYC:/var/lib/docker/overlay2/l/PC5OV3AZFF7C
JS32RXGYNVBOOM:/var/lib/docker/overlay2/l/UKKFC4FJY7AALTUP3TG64WVPJI:/var/lib/docker/overlay2/l/7Y6KBXXHZL7EZAUZKWGWDIAYQ3:/var/lib/docker/overlay2/l/RTBY4JO3CPW6UDCE264GOGZCA5:/var
/lib/docker/overlay2/l/WYX5QI4WD24FPI5MNZIFYKGJSO:/var/lib/docker/overlay2/l/LESREZHS3MD6ZAM6P3MMWSVUTX,upperdir=/var/lib/docker/overlay2/145c1847f375b96990b4e7410a5c9bc4be5487e613c
47aff847198284ac8dab5/di
        checkpoint_test.go:29: (00.132712)      1: Error (proc_parse.c:1151): Bad format in 0 mountinfo: 'ff,workdir=/var/lib/docker/overlay2/145c1847f375b96990b4e7410a5c9bc4be5487e
613c47aff847198284ac8dab5/work
        checkpoint_test.go:29: '
        checkpoint_test.go:29: (00.132721)      1: Error (mount.c:1642): Can't parse 0's mountinfo
        checkpoint_test.go:29: (00.155413) Error (cr-restore.c:1933): Restoring FAILED.
        checkpoint_test.go:36: === END ===
        checkpoint_test.go:164: criu failed: type NOTIFY errno 0
                log file: /tmp/criu961850879/restore.log        
$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 12
Server Version: 1.13.0-dev
Storage Driver: overlay2
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-38-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 29.45 GiB
Name: ws01
ID: SN3T:CNK6:JKQD:54CY:XKF2:BRX3:CIRU:DQBT:6DVZ:VWQ2:Q5ET:F23L
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 14
 Goroutines: 22
 System Time: 2016-09-23T08:05:34.327078318Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
$ docker version
Client:
 Version:      1.13.0-dev
 API version:  1.25
 Go version:   go1.7.1
 Git commit:   d3139fc
 Built:        Fri Sep 23 06:42:25 2016
 OS/Arch:      linux/amd64
 Experimental: true

Server:
 Version:      1.13.0-dev
 API version:  1.25
 Go version:   go1.7.1
 Git commit:   d3139fc
 Built:        Fri Sep 23 06:42:25 2016
 OS/Arch:      linux/amd64
 Experimental: true
cyphar commented 8 years ago

/cc @avagin

marcosnils commented 8 years ago

maybe it's better to report it it in the CRIU mailing list?

marcosnils commented 8 years ago

@AkihiroSuda seems like restoring actually failed, can you please share the restore.log file?

AkihiroSuda commented 8 years ago

Here: dump.log.txt restore.log.txt

avagin commented 7 years ago

We didn't expect that lines in mountinfo can be longer that 1024...

AkihiroSuda commented 6 years ago

Seems closable now