dotmesh-io / dotmesh

dotmesh (dm) is like git for your data volumes (databases, files etc) in Docker and Kubernetes
https://dotmesh.com
Apache License 2.0

Common test flake: `oci runtime error: container_linux.go:247` #494

Open alaric-dotmesh opened 6 years ago

alaric-dotmesh commented 6 years ago

A Kubernetes test fails, and exec'ing into the cluster on gitlab-runner shows:

root@cluster-1530916376239095207-0-node-0:/# kubectl get po -n dotmesh
NAME                                           READY     STATUS               RESTARTS   AGE
dotmesh-dynamic-provisioner-78c8c759bd-gkr9r   1/1       Running              0          12m
dotmesh-etcd-cluster-2nsdtf5s5r                0/1       ContainerCannotRun   0          12m
dotmesh-operator-58f97f7975-6fmv8              1/1       Running              0          12m
etcd-operator-86b7856bdd-9swfd                 1/1       Running              0          13m
server-cluster-1530916376239095207-0-node-0    0/1       ContainerCreating    0          12m
server-cluster-1530916376239095207-0-node-1    0/1       ContainerCreating    0          12m
server-cluster-1530916376239095207-0-node-2    0/1       ContainerCreating    0          12m

root@cluster-1530916376239095207-0-node-0:/# kubectl get pod -o yaml -n dotmesh dotmesh-etcd-cluster-2nsdtf5s5r
[...]
        message: 'invalid header field value "oci runtime error: container_linux.go:247:
          starting container process caused \"process_linux.go:359: container init
          caused \\\"rootfs_linux.go:53: mounting \\\\\\\"cgroup\\\\\\\" to rootfs
          \\\\\\\"/dind/docker/overlay2/030e04e0fcfbf2d3faaf711166ec6daf8f387afbe98c40b4da34113dc1a8076a/merged\\\\\\\"
          at \\\\\\\"/sys/fs/cgroup\\\\\\\" caused \\\\\\\"stat /dind/docker/overlay2/3da1f30068ded099a0829de18146af94536fa3a598bcc3fd5fc112de7dcaa4fc/merged/sys/fs/kubepods/besteffort/podee65dbf5-816c-11e8-84e3-024285939eb9/87aebe5a30e5e60aa936de686924ef62d2fdcdd74cb6f23de5285d312a0ad660:
          no such file or directory\\\\\\\"\\\"\"\n"'
[...]

...and similar errors for the other pods.
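To collect these messages for every pod in one pass instead of dumping each pod's YAML, something like the following might help. It is only a sketch: it assumes the output of `kubectl get pods -n dotmesh -o json` has already been parsed into a dict, and `container_messages` is a hypothetical helper name, not part of dotmesh or kubectl. The field names follow the Kubernetes pod status schema (`status.containerStatuses[].state` and `lastState`).

```python
def container_messages(pod_list):
    """Return (pod, container, state, reason, message) tuples for every
    container whose current or last state carries a message.

    `pod_list` is assumed to be the parsed JSON of
    `kubectl get pods -n <namespace> -o json`.
    """
    rows = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            # Each of "state" and "lastState" holds at most one of
            # "waiting", "running" or "terminated".
            for key in ("state", "lastState"):
                for kind, detail in cs.get(key, {}).items():
                    message = detail.get("message")
                    if message:
                        rows.append(
                            (pod_name, cs["name"], kind,
                             detail.get("reason", ""), message)
                        )
    return rows
```

Feeding it `json.load(sys.stdin)` from a script that receives the kubectl output on stdin should list one row per failing container, which makes it easier to see whether all pods report the same missing path.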

In the dind container's filesystem, /dind/docker/overlay2 exists and has many entries, but none named 3da1f30068ded....
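Note that the error names two different overlay2 directories: 030e04e0... as the rootfs being mounted into, and 3da1f300... in the path that `stat` fails on. A rough way to repeat the existence check for every ID mentioned in such a message is sketched below. The helper names `overlay_ids` and `missing_layers` are made up for illustration, and the default root `/dind/docker/overlay2` is taken from the error text above; it assumes the script runs somewhere that path is visible.

```python
import os
import re

# overlay2 directory names are 64 lowercase hex characters.
LAYER_RE = re.compile(r"overlay2/([0-9a-f]{64})")


def overlay_ids(message):
    """Return the distinct overlay2 directory IDs named in `message`,
    in order of first appearance."""
    seen = []
    for layer_id in LAYER_RE.findall(message):
        if layer_id not in seen:
            seen.append(layer_id)
    return seen


def missing_layers(message, overlay_root="/dind/docker/overlay2"):
    """Return the IDs from `message` that have no directory under
    `overlay_root`."""
    return [
        layer_id
        for layer_id in overlay_ids(message)
        if not os.path.isdir(os.path.join(overlay_root, layer_id))
    ]
```

If only the ID from the `stat` path comes back as missing, that would match what was seen by hand here; if both are missing, the problem may be wider than one stale reference.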

Retrying the test doesn't help. What gives?

prisamuel commented 6 years ago

Seeing this issue on the CI as well: http://gitlab.dotmesh.io:9999/dotmesh/dotmesh-sync/-/jobs/86533/raw