Closed mcastelino closed 3 years ago
When running a simple workload such as
apiVersion: v1
kind: Pod
metadata:
  name: guar-2kc
spec:
  runtimeClassName: kata-qemu
  containers:
  - name: busybee
    image: busybox
    resources:
      limits:
        cpu: 2
        memory: "400Mi"
    command: ["md5sum"]
    args: ["/dev/urandom"]
  - name: busybum
    image: busybox
    resources:
      limits:
        cpu: 3
        memory: "200Mi"
    command: ["md5sum"]
    args: ["/dev/urandom"]
we find that the cgroup tasks setup is incorrect.
kata$ for i in `ls pod*/**/tasks`; do echo $i && for j in `cat $i`; do ps auxw | grep $j; done; done
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/tasks
root 13041 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24985
root 24985 0.0 0.2 1004980 19352 ? Sl 21:13 0:00 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -container 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956 -exec-id 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956
root 13043 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24986
root 13045 0.0 0.0 6360 976 pts/0 S+ 21:40 0:00 grep 24987
root 13047 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24988
root 13049 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24989
root 13051 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24990
root 13053 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24992
root 13055 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24993
root 13057 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 24994
root 13059 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24995
root 13061 0.0 0.0 6360 980 pts/0 S+ 21:40 0:00 grep 24996
root 13063 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24997
root 13065 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24998
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/tasks
root 13068 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 25183
root 25183 0.0 0.2 858668 21992 ? Sl 21:13 0:00 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -container 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be -exec-id 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be
root 13070 0.0 0.0 6360 972 pts/0 S+ 21:40 0:00 grep 25185
root 13072 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 25186
root 13074 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 25187
root 13076 0.0 0.0 6360 904 pts/0 S+ 21:40 0:00 grep 25188
root 13078 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 25189
root 13080 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 25190
root 13082 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 25191
root 13084 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 25192
root 13086 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 25193
root 13088 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 25194
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/tasks
root 13091 0.0 0.0 6360 976 pts/0 S+ 21:40 0:00 grep 24964
root 24964 0.0 0.0 78328 2008 ? Ssl 21:13 0:00 /usr/libexec/crio/conmon --syslog -c 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956 -u 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956 -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/userdata -p /var/run/containers/storage/overlay-containers/2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/busybee/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root 13093 0.0 0.0 6360 900 pts/0 S+ 21:40 0:00 grep 24966
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/tasks
root 13096 0.0 0.0 6360 980 pts/0 S+ 21:40 0:00 grep 25090
root 25090 0.0 0.0 78328 2008 ? Ssl 21:13 0:00 /usr/libexec/crio/conmon --syslog -c 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be -u 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/userdata -p /var/run/containers/storage/overlay-containers/5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/busybum/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root 13098 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 25092
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks
root 2846 0.0 0.0 78328 2020 ? Ssl 21:06 0:00 /usr/libexec/crio/conmon --syslog -c 0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35 -u 0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35 -r /usr/bin/runc -b /var/run/containers/storage/overlay-containers/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35/userdata -p /var/run/containers/storage/overlay-containers/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35/userdata/pidfile -l /var/log/pods/kube-system_etcd-clr-01_af3e4a507ec0af8c2233ee5bf0783073/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root 3034 0.0 0.0 78328 2020 ? Ssl 21:06 0:00 /usr/libexec/crio/conmon --syslog -c 96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce -u 96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce -r /usr/bin/runc -b /var/run/containers/storage/overlay-containers/96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce/userdata -p /var/run/containers/storage/overlay-containers/96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce/userdata/pidfile -l /var/log/pods/kube-system_etcd-clr-01_af3e4a507ec0af8c2233ee5bf0783073/etcd/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root 13101 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 7830
root 13103 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 19505
root 13105 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24584
root 24584 0.0 0.0 78328 172 ? Ssl 21:13 0:00 /usr/libexec/crio/conmon --syslog -c f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -u f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/userdata -p /var/run/containers/storage/overlay-containers/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root 13107 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24586
root 13109 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24602
root 24602 100 2.7 3590552 226136 ? Sl 21:13 26:33 /opt/kata/bin/qemu-system-x86_64 -name sandbox-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -uuid ada3582a-9766-4030-82e7-95427d95ad17 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/qmp.sock,server,nowait -m 2048M,slots=10,maxmem=8992M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-containers-image_clearlinux_1.6.1_agent_992b4987a32.img,size=134217728 -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/kata.sock,server,nowait -device virtio-9p-pci,disable-modern=true,fsdev=extra-9p-kataShared,mount_tag=kataShared,romfile= -fsdev local,id=extra-9p-kataShared,path=/run/kata-containers/shared/sandboxes/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a,security_model=none -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 -device driver=virtio-net-pci,netdev=network-0,mac=b2:78:0b:80:8b:a2,disable-modern=true,mq=on,vectors=4,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /opt/kata/share/kata-containers/vmlinuz-4.19.28-31 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=8 init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket systemd.mask=systemd-journald.service systemd.mask=systemd-journald.socket systemd.mask=systemd-journal-flush.service systemd.mask=systemd-udevd.service systemd.mask=systemd-udevd.socket systemd.mask=systemd-udev-trigger.service systemd.mask=systemd-timesyncd.service systemd.mask=systemd-update-utmp.service systemd.mask=systemd-tmpfiles-setup.service systemd.mask=systemd-tmpfiles-cleanup.service systemd.mask=systemd-tmpfiles-cleanup.timer systemd.mask=tmp.mount -pidfile /run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/pid -smp 1,cores=1,threads=1,sockets=1,maxcpus=8
root 24604 0.0 0.0 0 0 ? S 21:13 0:00 [vhost-24602]
root 24606 0.0 0.0 0 0 ? S 21:13 0:00 [kvm-pit/24602]
root 13111 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 24603
root 13113 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24604
root 24604 0.0 0.0 0 0 ? S 21:13 0:00 [vhost-24602]
root 13115 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 24607
root 24607 0.0 0.1 1215688 15420 ? Sl 21:13 0:01 /opt/kata/libexec/kata-containers/kata-proxy -listen-socket unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -mux-socket /run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/kata.sock -sandbox f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a
root 13117 0.0 0.0 6360 904 pts/0 S+ 21:40 0:00 grep 24608
root 13119 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24609
root 13121 0.0 0.0 6360 968 pts/0 S+ 21:40 0:00 grep 24610
root 13123 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24611
root 13125 0.0 0.0 6360 984 pts/0 S+ 21:40 0:00 grep 24612
root 13127 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 24613
root 13129 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 24614
root 13131 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24615
root 13133 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24616
root 13135 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 25025
root 13137 0.0 0.0 6360 908 pts/0 S+ 21:40 0:00 grep 27555
root 13139 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 27556
root 13141 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 31294
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks
root 13144 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24605
root 13146 0.0 0.0 6360 984 pts/0 S+ 21:40 0:00 grep 24639
root 24639 0.0 0.2 858668 22160 ? Sl 21:13 0:00 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -container f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -exec-id f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a
root 13148 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24641
root 13150 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24642
root 13152 0.0 0.0 6360 972 pts/0 S+ 21:40 0:00 grep 24644
root 13154 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24645
root 13156 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24646
root 13158 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 24648
root 13160 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 24649
root 13162 0.0 0.0 6360 976 pts/0 S+ 21:40 0:00 grep 24650
root 13164 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 24651
root 13166 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 24652
root 13168 0.0 0.0 6360 856 pts/0 S+ 21:40 0:00 grep 24979
root 13170 0.0 0.0 6360 920 pts/0 S+ 21:40 0:00 grep 24980
root 13172 0.0 0.0 6360 916 pts/0 S+ 21:40 0:00 grep 25105
root 13174 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 25106
root 13176 0.0 0.0 6360 852 pts/0 S+ 21:40 0:00 grep 25107
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/tasks
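A note on reading the listing above: on cgroup v1 the tasks files hold thread IDs, while ps auxw prints one row per process, so most IDs only match the grep process itself. A minimal sketch of a quieter way to get the same mapping, assuming cgroup v1 tasks files and a Linux /proc where /proc/<tid>/comm is readable; the function names and the proc_root parameter are hypothetical helpers for illustration, not part of kata or CRI-O:

```python
#!/usr/bin/env python3
"""List the tasks attached to each cgroup below a directory (cgroup v1)."""
import sys
from pathlib import Path


def task_name(tid: str, proc_root: Path = Path("/proc")) -> str:
    """Return the task's name from <proc_root>/<tid>/comm, or a placeholder."""
    try:
        return (proc_root / tid / "comm").read_text().strip()
    except OSError:
        # The task exited between reading the tasks file and the lookup.
        return "<gone>"


def tasks_by_cgroup(root: Path, proc_root: Path = Path("/proc")) -> dict:
    """Map each cgroup directory (relative to root) to its (tid, name) pairs."""
    result = {}
    for tasks_file in sorted(root.rglob("tasks")):
        tids = tasks_file.read_text().split()
        rel = str(tasks_file.parent.relative_to(root))
        result[rel] = [(tid, task_name(tid, proc_root)) for tid in tids]
    return result


def main(argv: list) -> None:
    if len(argv) != 2:
        print("usage: list_tasks.py <cgroup-directory>", file=sys.stderr)
        return
    for cgroup, tasks in tasks_by_cgroup(Path(argv[1])).items():
        print(cgroup)
        for tid, name in tasks:
            print(f"  {tid}  {name}")


if __name__ == "__main__":
    main(sys.argv)
```

Run it from the same directory as the loop above, passing the pod cgroup directory as the only argument. An empty list for a cgroup means its tasks file is empty.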
For more gory detail https://gist.github.com/mcastelino/e975cd26958554b4c46c7168067b66b0
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"archive", BuildDate:"2019-03-29T16:29:07Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

Version: 0.1.0
RuntimeName: cri-o
RuntimeVersion: 1.13.1
RuntimeApiVersion: v1alpha1

kata-runtime : 1.6.1
commit : 8efc5718813224722f87ad119edcf9753fd6147d
OCI specs: 1.0.1-dev
/cc @devimc @jcvenegas @egernst @bergwolf
I'll try to reproduce locally. The runtime is probably being spawned in this cgroup; let me confirm.
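One way to check a hypothesis like this is to ask the kernel which cgroups a given process belongs to, by reading /proc/<pid>/cgroup. Each line there has the form hierarchy-ID:controller-list:cgroup-path. The sketch below is only an illustration under that assumption; parse_proc_cgroup and cgroups_of are hypothetical names, not kata or CRI-O APIs:

```python
from pathlib import Path


def parse_proc_cgroup(text: str) -> dict:
    """Parse the content of /proc/<pid>/cgroup into {controllers: path}."""
    membership = {}
    for line in text.splitlines():
        if not line:
            continue
        # Format: hierarchy-ID:controller-list:cgroup-path
        _hierarchy_id, controllers, path = line.split(":", 2)
        membership[controllers] = path
    return membership


def cgroups_of(pid: int, proc_root: Path = Path("/proc")) -> dict:
    """Return the cgroup membership of a process, keyed by controller list."""
    return parse_proc_cgroup((proc_root / str(pid) / "cgroup").read_text())
```

For example, calling cgroups_of(24602) on the affected node would show where the QEMU process from the report landed for each controller. On a cgroup v2 hierarchy the controller list is empty, so the entry appears under the key "".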
cgroups: Incorrect cgroup setup with crio
Environment