mehdisadeghi closed this issue 5 years ago.
This issue may be related:
https://github.com/docker/for-linux/issues/597
Some people are reporting similar issues when running Docker on Ubuntu 18.04, but apparently that can be fixed by upgrading/restarting. Can you check if that helps with your issue?
@andresmgot Thanks for your reply.
Apparently the microk8s snap on Ubuntu 18.04 does not use Docker; it uses containerd directly. There is no Docker daemon running on my host.
This is what I get every few minutes in the system logs:
Jun 24 10:20:40 k1 microk8s.daemon-controller-manager[14622]: I0624 10:20:40.108274 14622 event.go:258] Event(v1.ObjectReference{Kind:"Job", Namespace:"default", Name:"trigger-f1-1561364280", UID:"e3a91fe1-ef0e-45dc-8317-5cdcc8c358ad", APIVersion:"batch/v1", ResourceVersion:"1050", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: trigger-f1-1561364280-vpnmh
Jun 24 10:20:40 k1 microk8s.daemon-controller-manager[14622]: E0624 10:20:40.120785 14622 job_controller.go:397] Error syncing job: failed pod(s) detected for job key "default/trigger-f1-1561364280"
Jun 24 10:20:40 k1 microk8s.daemon-kubelet[14529]: I0624 10:20:40.197006 14529 reconciler.go:203] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-vmhxs" (UniqueName: "kubernetes.io/secret/65c2648d-885a-44aa-9aff-310537b8d391-default-token-vmhxs") pod "trigger-f1-1561364280-vpnmh" (UID: "65c2648d-885a-44aa-9aff-310537b8d391")
Jun 24 10:20:40 k1 systemd[1]: Started Kubernetes transient mount for /var/snap/microk8s/common/var/lib/kubelet/pods/65c2648d-885a-44aa-9aff-310537b8d391/volumes/kubernetes.io~secret/default-token-vmhxs.
Jun 24 10:20:40 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:40.424585656+02:00" level=info msg="RunPodsandbox for &PodSandboxMetadata{Name:trigger-f1-1561364280-vpnmh,Uid:65c2648d-885a-44aa-9aff-310537b8d391,Namespace:default,Attempt:0,}"
Jun 24 10:20:40 k1 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Jun 24 10:20:40 k1 systemd-udevd[21815]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jun 24 10:20:40 k1 systemd-udevd[21815]: Could not generate persistent MAC address for vethdaf3a14b: No such file or directory
Jun 24 10:20:40 k1 kernel: IPv6: ADDRCONF(NETDEV_UP): vethdaf3a14b: link is not ready
Jun 24 10:20:40 k1 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethdaf3a14b: link becomes ready
Jun 24 10:20:40 k1 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Jun 24 10:20:40 k1 kernel: cbr0: port 3(vethdaf3a14b) entered blocking state
Jun 24 10:20:40 k1 kernel: cbr0: port 3(vethdaf3a14b) entered disabled state
Jun 24 10:20:40 k1 kernel: device vethdaf3a14b entered promiscuous mode
Jun 24 10:20:40 k1 kernel: cbr0: port 3(vethdaf3a14b) entered blocking state
Jun 24 10:20:40 k1 kernel: cbr0: port 3(vethdaf3a14b) entered forwarding state
Jun 24 10:20:40 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:40.694784259+02:00" level=info msg="shim containerd-shim started" address="/containerd-shim/k8s.io/06762d92d62e127c82ec8a15680d438e9a483471ac4ecb1a340684423aa4dc94/shim.sock" debug=true pid=21835
Jun 24 10:20:40 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:40+02:00" level=debug msg="registering ttrpc server"
Jun 24 10:20:40 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:40+02:00" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Jun 24 10:20:42 k1 sshd[21716]: Failed password for root from 112.85.42.171 port 46357 ssh2
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42.782408207+02:00" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:trigger-f1-1561364280-vpnmh,Uid:65c2648d-885a-44aa-9aff-310537b8d391,Namespace:default,Attempt:0,} returns sandbox id "06762d92d62e127c82ec8a15680d438e9a483471ac4ecb1a340684423aa4dc94""
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42.793869260+02:00" level=info msg="CreateContainer within sandbox "06762d92d62e127c82ec8a15680d438e9a483471ac4ecb1a340684423aa4dc94" for container &ContainerMetadata{Name:trigger,Attempt:0,}"
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42.870397350+02:00" level=info msg="CreateContainer within sandbox "06762d92d62e127c82ec8a15680d438e9a483471ac4ecb1a340684423aa4dc94" for &ContainerMetadata{Name:trigger,Attempt:0,} returns container id "54eba74f19fe9ccecc926dabfe55cf48d4e1004795234441c8a84032f1c7d46b""
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42.871662847+02:00" level=info msg="StartContainer for "54eba74f19fe9ccecc926dabfe55cf48d4e1004795234441c8a84032f1c7d46b""
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42.874014187+02:00" level=info msg="shim containerd-shim started" address="/containerd-shim/k8s.io/54eba74f19fe9ccecc926dabfe55cf48d4e1004795234441c8a84032f1c7d46b/shim.sock" debug=true pid=21937
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42+02:00" level=debug msg="registering ttrpc server"
Jun 24 10:20:42 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:42+02:00" level=debug msg="serving api on unix socket" socket="[inherited from parent]"
Jun 24 10:20:44 k1 microk8s.daemon-containerd[14509]: time="2019-06-24T10:20:44.968381513+02:00" level=info msg="TaskOOM event &TaskOOM{ContainerID:06762d92d62e127c82ec8a15680d438e9a483471ac4ecb1a340684423aa4dc94,}"
Jun 24 10:20:44 k1 kernel: runc:[2:INIT] invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=-998
Jun 24 10:20:44 k1 kernel: runc:[2:INIT] cpuset=54eba74f19fe9ccecc926dabfe55cf48d4e1004795234441c8a84032f1c7d46b mems_allowed=0
Jun 24 10:20:44 k1 kernel: CPU: 1 PID: 21967 Comm: runc:[2:INIT] Not tainted 4.15.0-50-generic #54-Ubuntu
Jun 24 10:20:44 k1 kernel: Hardware name: Hetzner vServer, BIOS 20171111 11/11/2017
Jun 24 10:20:44 k1 kernel: Call Trace:
Jun 24 10:20:44 k1 kernel: dump_stack+0x63/0x8b
Jun 24 10:20:44 k1 kernel: dump_header+0x71/0x285
Jun 24 10:20:44 k1 kernel: oom_kill_process+0x220/0x440
Jun 24 10:20:44 k1 kernel: out_of_memory+0x2d1/0x4f0
Jun 24 10:20:44 k1 kernel: mem_cgroup_out_of_memory+0x4b/0x80
Jun 24 10:20:44 k1 kernel: mem_cgroup_oom_synchronize+0x2e8/0x320
Jun 24 10:20:44 k1 kernel: ? mem_cgroup_css_online+0x40/0x40
Jun 24 10:20:44 k1 kernel: pagefault_out_of_memory+0x36/0x7b
Jun 24 10:20:44 k1 kernel: mm_fault_error+0x90/0x180
Jun 24 10:20:44 k1 kernel: __do_page_fault+0x4a5/0x4d0
Jun 24 10:20:44 k1 kernel: do_page_fault+0x2e/0xe0
Jun 24 10:20:44 k1 kernel: ? async_page_fault+0x2f/0x50
Jun 24 10:20:44 k1 kernel: do_async_page_fault+0x51/0x80
Jun 24 10:20:44 k1 kernel: async_page_fault+0x45/0x50
Jun 24 10:20:44 k1 kernel: RIP: 0033:0x560f61a7721f
Jun 24 10:20:44 k1 kernel: RSP: 002b:000000c420145f78 EFLAGS: 00010206
Jun 24 10:20:44 k1 kernel: RAX: 000000c420145ff8 RBX: 000000c4200ac988 RCX: 000000c420000180
Jun 24 10:20:44 k1 kernel: RDX: 0000000000000000 RSI: 0000560f61f38440 RDI: 000000c420146388
Jun 24 10:20:44 k1 kernel: RBP: 000000c4201469f0 R08: 0000000000000000 R09: 0000000000000000
Jun 24 10:20:44 k1 kernel: R10: 0000560f61c91eb6 R11: 0000000000000010 R12: 0000000000000000
Jun 24 10:20:44 k1 kernel: R13: 00000000000000f1 R14: 0000000000000011 R15: 0000000000000000
Jun 24 10:20:44 k1 kernel: Task in /kubepods/pod65c2648d-885a-44aa-9aff-310537b8d391/54eba74f19fe9ccecc926dabfe55cf48d4e1004795234441c8a84032f1c7d46b killed as a result of limit of /kubepods/pod65c2648d-885a-44aa-9aff-310537b8d391
Jun 24 10:20:44 k1 kernel: memory: usage 4096kB, limit 4096kB, failcnt 43
Jun 24 10:20:44 k1 kernel: memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
Jun 24 10:20:44 k1 kernel: kmem: usage 844kB, limit 9007199254740988kB, failcnt 0
Jun 24 10:20:44 k1 kernel: Memory cgroup stats for /kubepods/pod65c2648d-885a-44aa-9aff-310537b8d391: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Jun 24 10:20:44 k1 kernel: Memory cgroup stats for /kubepods/pod65c2648d-885a-44aa-9aff-310537b8d391/06762d92d62e127c82ec8a15680d438e9a483471ac4ecb1a340684423aa4dc94: cache:0KB rss:44KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:44KB inactive_file:0KB active_file:0KB unevictable:0KB
Jun 24 10:20:44 k1 kernel: Memory cgroup stats for /kubepods/pod65c2648d-885a-44aa-9aff-310537b8d391/54eba74f19fe9ccecc926dabfe55cf48d4e1004795234441c8a84032f1c7d46b: cache:0KB rss:3208KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:3208KB inactive_file:0KB active_file:0KB unevictable:0KB
Jun 24 10:20:44 k1 kernel: [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Jun 24 10:20:44 k1 kernel: [21863] 0 21863 256 1 32768 0 -998 pause
Jun 24 10:20:44 k1 kernel: [21967] 0 21967 15559 1732 126976 0 -998 runc:[2:INIT]
Jun 24 10:20:44 k1 kernel: Memory cgroup out of memory: Kill process 21967 (runc:[2:INIT]) score 696 or sacrifice child
Jun 24 10:20:44 k1 kernel: Killed process 21967 (runc:[2:INIT]) total-vm:62236kB, anon-rss:3032kB, file-rss:3896kB, shmem-rss:0kB
It seems that the container is being killed because of an OOM issue (which is weird because that container is just doing a curl request).
https://stackoverflow.com/questions/55304976/oom-killer-docker
It's a long shot, but this may be related to the fact that the container is running as root. To prove or discard that, can you try to create a similar pod, but specifying a securityContext and changing the UID it runs as? Something like:

securityContext:
  runAsUser: 1001
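For context, a securityContext can be set at the pod level (under spec) or per container. A minimal hypothetical pod manifest showing where the snippet goes (the pod name and image below are illustrative, not part of the actual kubeless-generated job):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: trigger-test          # hypothetical name, for illustration only
spec:
  securityContext:
    runAsUser: 1001           # run the pod's processes as a non-root UID
  containers:
    - name: trigger
      image: curlimages/curl  # illustrative; any image that can run curl
      command: ["curl", "-s", "http://example.com"]
```

With kubeless, the equivalent change would have to be made in the spec that the trigger controller generates, which is why modifying the deployed function directly may not be enough.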
@andresmgot would you mind describing where I should add the yaml snippet? I'm totally new to k8s and kubeless!
Moreover, in a new VM, I installed everything as a normal user and still got the same error. By the way, can you reproduce this in a microk8s install? On an Ubuntu VM it'd be sudo snap install microk8s --classic.
I will try to reproduce it. In the meantime you can try Minikube, which is a supported environment.
@andresmgot thanks for giving it a try!
I have already tried Minikube, but it does not work with the emulated CPUs that the VM provider offers. Moreover, it does not work with the tiny VMs that I use (1 core, 2 GB RAM).
Have you tried https://github.com/kubernetes-sigs/kind ? It's also a lightweight Kubernetes distribution and I have tested Kubeless on that platform before.
I haven't tried that yet. In fact, I can run Kubeless with scheduling on top of Kubernetes inside a VM with 2 cores and 4 GB of RAM; here is my Ansible playbook for that. What I am trying to do is fit everything inside a 1-core, 2 GB instance.
I found the problem. I was able to reproduce the issue using the latest Minikube version. The memory limit set for the trigger pod is too low. This commit fixes it:
https://github.com/kubeless/cronjob-trigger/pull/9/commits/61e04eac028c3f51d7ce6c24604ca22a425c5426
It will be available in the next release.
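For anyone hitting this before the release: the fix amounts to giving the trigger pod a larger memory limit. In a generic pod spec, that kind of change looks like the fragment below. The old 4Mi limit matches the "memory: usage 4096kB, limit 4096kB" line in the OOM log above; the raised value here is illustrative, not necessarily the one used in the linked commit:

```yaml
resources:
  requests:
    memory: 32Mi   # illustrative request
  limits:
    memory: 64Mi   # illustrative; the previous 4Mi limit was too small even for runc's init
```

Note that the OOM kill happened in runc:[2:INIT], i.e. before the curl process even started, which is consistent with the limit being below the container runtime's own baseline footprint.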
@andresmgot I'll try with the git version and report back. Thank you very much!
@andresmgot I tried scheduling functions with the new v1.0.4 release and it worked perfectly. Thank you!
P.S. for future visitors: I had to enable DNS on microk8s first: microk8s.enable dns.
Is this a BUG REPORT or FEATURE REQUEST?: This is a bug on a fresh Ubuntu 18.04 and microk8s single-node cluster installation.
What happened: Functions do not run on schedule due to error.
What you expected to happen: Functions should run on schedule.
How to reproduce it (as minimally and precisely as possible):
The function deploys and runs perfectly when called directly. The scheduled runs fail, however. The error is the same for each trigger pod:
Anything else we need to know?:
Environment:
Kubernetes version (use kubectl version):
Kubeless version (use kubeless version):