@MirtoBusico what are the specs of this machine? Is it a VM or container?
What I see is the kubelet trying to start, but the apiserver is not yet operational, so it gives up. This could be because the API server is taking too long to start. During a system reboot the API server may be fighting with other processes over CPU time, so it takes too long.
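If it happens again, it would help to confirm whether the API server is actually reachable at the moment the kubelet gives up. A rough check would be something like the following (the insecure port 8080 comes from the kube-apiserver arguments in your report):
# recent logs of the two services, to compare their timing
sudo journalctl -u snap.microk8s.daemon-apiserver -n 50
sudo journalctl -u snap.microk8s.daemon-kubelet -n 50
# the apiserver listens locally on the insecure port 8080, so this should return "ok" once it is up
curl -s http://127.0.0.1:8080/healthz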
Something else that looks strange is that the collected logs seem chopped. Have a look at the timestamps in the journalctl.txt you attached and compare them to the timestamps in the journal.log in the inspection tarball, under snap.microk8s.daemon-kubelet. The latter has logs only up until 08:55:42. Not sure why; do we have enough disk space?
Can you try a microk8s.stop and microk8s.start cycle? This will restart all microk8s services. If this does not help, can you try a sudo systemctl restart snap.microk8s.daemon-kubelet?
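For reference, the full cycle I have in mind is roughly this; the last command just follows the kubelet log to see where it stops:
microk8s.stop
microk8s.start
# if the kubelet is still down afterwards, restart only that service and watch its log
sudo systemctl restart snap.microk8s.daemon-kubelet
sudo journalctl -u snap.microk8s.daemon-kubelet -f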
Well, the host has an i7 processor and 32GB RAM. The KVM guest has 8 processors and 24GB RAM. The VM has 2 disks.
df says
sysop@hoseplavm:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 12G 0 12G 0% /dev
tmpfs 2,3G 1,3M 2,3G 1% /run
/dev/vda1 98G 9,5G 84G 11% /
tmpfs 12G 0 12G 0% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 12G 0 12G 0% /sys/fs/cgroup
/dev/loop0 89M 89M 0 100% /snap/core/6964
/dev/loop1 55M 55M 0 100% /snap/lxd/10756
/dev/loop2 208M 208M 0 100% /snap/microk8s/608
tmpfs 2,3G 0 2,3G 0% /run/user/119
tmpfs 1,0M 0 1,0M 0% /var/snap/lxd/common/ns
tmpfs 2,3G 16K 2,3G 1% /run/user/1000
sysop@hoseplavm:~$
zfs says
sysop@hoseplavm:~$ sudo zfs list -o space
NAME AVAIL USED USEDSNAP USEDDS USEDREFRESERV USEDCHILD
zdata1 191G 1,48G 0B 24K 0B 1,48G
zdata1/containers 191G 201M 0B 24K 0B 201M
zdata1/containers/kubernetes 191G 201M 0B 201M 0B 0B
zdata1/custom 191G 24K 0B 24K 0B 0B
zdata1/custom-snapshots 191G 24K 0B 24K 0B 0B
zdata1/deleted 191G 655M 0B 24K 0B 655M
zdata1/deleted/images 191G 655M 0B 24K 0B 655M
zdata1/deleted/images/3c09483ccd69f33a4819532c103f482f219ae4591cc0d860dfb94193e97a2627 191G 655M 0B 655M 0B 0B
zdata1/images 191G 655M 0B 24K 0B 655M
zdata1/images/c234ecee3baaee25db84af8e3565347e948bfceb3bf7c820bb1ce95adcffeaa8 191G 655M 0B 655M 0B 0B
zdata1/snapshots 191G 24K 0B 24K 0B 0B
sysop@hoseplavm:~$
The journalctl output was taken some time after the tarball; I suppose that accounts for the difference.
The microk8s.stop / microk8s.start cycle ended with the same error. The sudo systemctl restart snap.microk8s.daemon-kubelet command started the kubelet, but it failed again. Here is the report:
inspection-report-20190606_143558.tar.gz
NOTE: originally I tried to install microk8s inside an LXD container, but microk8s never started; maybe I'll open another issue on this.
Sorry, I closed this by mistake.
I do not have a solution for now. I see etcd complaining that read operations take too long to complete, e.g.:
giu 06 14:35:44 hoseplavm etcd[25682]: request "header:<ID:7587838787223751041 > txn:<compare:<target:MOD key:\"/registry/events/default/hoseplavm.15a59c285a3d131c\" mod_revision:0 > success:<request_put:<key:\"/registry/events/default/hoseplavm.15a59c285a3d131c\" value:\"k8s\\000\\n\\013\\n\\002v1\\022\\005Event\\022\\367\\001\\n_\\n\\032hoseplavm.15a59c285a3d131c\\022\\000\\032\\007default\\\"\\000*$99ed044a-8857-11e9-ad1f-525400b6cf612\\0008\\000B\\010\\010\\237\\221\\344\\347\\005\\020\\000z\\000\\022$\\n\\004Node\\022\\000\\032\\thoseplavm\\\"\\thoseplavm*\\0002\\000:\\000\\032\\tNodeReady\\\"'Node hoseplavm status is now: NodeReady*\\024\\n\\007kubelet\\022\\thoseplavm2\\010\\010\\237\\221\\344\\347\\005\\020\\000:\\010\\010\\237\\221\\344\\347\\005\\020\\000@\\001J\\006NormalR\\000b\\000r\\000z\\000\\032\\000\\\"\\000\" lease:7587838787223751013 > > > " took too long (183.182213ms) to execute
giu 06 14:35:44 hoseplavm etcd[25682]: read-only range request "key:\"/registry/pods/kube-system/kubernetes-dashboard-6fd7f9c494-fwgk7\" " took too long (344.026827ms) to execute
giu 06 14:35:44 hoseplavm etcd[25682]: read-only range request "key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" " took too long (258.434407ms) to execute
giu 06 14:35:44 hoseplavm etcd[25682]: read-only range request "key:\"/registry/csidrivers\" range_end:\"/registry/csidrivert\" count_only:true " took too long (289.23571ms) to execute
giu 06 14:35:46 hoseplavm etcd[25682]: request "header:<ID:7587838787223751067 > txn:<compare:<target:MOD key:\"/registry/events/kube-system/kube-dns-6bfbdd666c-stb78.15a59c28f029cbca\" mod_revision:0 > success:<request_put:<key:\"/registry/events/kube-system/kube-dns-6bfbdd666c-stb78.15a59c28f029cbca\" value:\"k8s\\000\\n\\013\\n\\002v1\\022\\005Event\\022\\230\\003\\ns\\n*kube-dns-6bfbdd666c-stb78.15a59c28f029cbca\\022\\000\\032\\013kube-system\\\"\\000*$9b06f58e-8857-11e9-ad1f-525400b6cf612\\0008\\000B\\010\\010\\241\\221\\344\\347\\005\\020\\000z\\000\\022x\\n\\003Pod\\022\\013kube-system\\032\\031kube-dns-6bfbdd666c-stb78\\\"$2902cc4c-87bc-11e9-80f0-525400b6cf61*\\002v12\\00517273:\\030spec.containers{kubedns}\\032\\006Pulled\\\"cContainer image \\\"gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7\\\" already present on machine*\\024\\n\\007kubelet\\022\\thoseplavm2\\010\\010\\241\\221\\344\\347\\005\\020\\000:\\010\\010\\241\\221\\344\\347\\005\\020\\000@\\001J\\006NormalR\\000b\\000r\\000z\\000\\032\\000\\\"\\000\" lease:7587838787223751013 > > > " took too long (309.552297ms) to execute
giu 06 14:35:49 hoseplavm etcd[25682]: read-only range request "key:\"/registry/ranges/serviceips\" " took too long (341.163051ms) to execute
giu 06 14:35:49 hoseplavm etcd[25682]: read-only range request "key:\"/registry/cronjobs/\" range_end:\"/registry/cronjobs0\" limit:500 " took too long (341.160855ms) to execute
giu 06 14:35:49 hoseplavm microk8s.daemon-etcd[25682]: WARNING: 2019/06/06 14:35:49 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: lookup etcd.socket on 127.0.0.53:53: no such host"; Reconnecting to {etcd.socket:2379 0 <nil>}
giu 06 14:35:50 hoseplavm etcd[25682]: request "header:<ID:7587838787223751214 > txn:<compare:<target:MOD key:\"/registry/events/default/hoseplavm.15a59c29225b988f\" mod_revision:0 > success:<request_put:<key:\"/registry/events/default/hoseplavm.15a59c29225b988f\" value:\"k8s\\000\\n\\013\\n\\002v1\\022\\005Event\\022\\346\\001\\n_\\n\\032hoseplavm.15a59c29225b988f\\022\\000\\032\\007default\\\"\\000*$9d335a7c-8857-11e9-be03-525400b6cf612\\0008\\000B\\010\\010\\245\\221\\344\\347\\005\\020\\000z\\000\\022$\\n\\004Node\\022\\000\\032\\thoseplavm\\\"\\thoseplavm*\\0002\\000:\\000\\032\\010Starting\\\"\\024Starting kube-proxy.*\\027\\n\\nkube-proxy\\022\\thoseplavm2\\010\\010\\242\\221\\344\\347\\005\\020\\000:\\010\\010\\242\\221\\344\\347\\005\\020\\000@\\001J\\006NormalR\\000b\\000r\\000z\\000\\032\\000\\\"\\000\" lease:7587838787223751209 > > > " took too long (148.499105ms) to execute
giu 06 14:35:50 hoseplavm etcd[25682]: read-only range request "key:\"/registry/priorityclasses/system-node-critical\" " took too long (106.843838ms) to execute
giu 06 14:35:50 hoseplavm etcd[25682]: read-only range request "key:\"/registry/namespaces/kube-system\" " took too long (106.90726ms) to execute
giu 06 14:35:50 hoseplavm etcd[25682]: read-only range request "key:\"/registry/services/specs/\" range_end:\"/registry/services/specs0\" " took too long (753.748779ms) to execute
giu 06 14:35:50 hoseplavm etcd[25682]: request "header:<ID:7587838787223751238 > txn:<compare:<target:MOD key:\"/registry/masterleases/192.168.202.10\" mod_revision:0 > success:<request_put:<key:\"/registry/masterleases/192.168.202.10\" value:\"k8s\\000\\n\\017\\n\\002v1\\022\\tEndpoints\\022*\\n\\022\\n\\000\\022\\000\\032\\000\\\"\\000*\\0002\\0008\\001B\\000z\\000\\022\\024\\n\\022\\n\\016192.168.202.10\\032\\000\\032\\000\\\"\\000\" lease:7587838787223751236 > > failure:<request_range:<key:\"/registry/masterleases/192.168.202.10\" > > > " took too long (451.566414ms) to execute
giu 06 14:35:54 hoseplavm etcd[25682]: request "header:<ID:7587838787223751257 > txn:<compare:<target:MOD key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" mod_revision:22483 > success:<request_put:<key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" value:\"k8s\\000\\n\\017\\n\\002v1\\022\\tEndpoints\\022\\316\\002\\n\\313\\002\\n\\027kube-controller-manager\\022\\000\\032\\013kube-system\\\"\\000*$778d6e74-87bb-11e9-80f0-525400b6cf612\\0008\\000B\\010\\010\\254\\205\\340\\347\\005\\020\\000b\\350\\001\\n(control-plane.alpha.kubernetes.io/leader\\022\\273\\001{\\\"holderIdentity\\\":\\\"hoseplavm_f32aa734-8856-11e9-b992-525400b6cf61\\\",\\\"leaseDurationSeconds\\\":15,\\\"acquireTime\\\":\\\"2019-06-06T12:31:25Z\\\",\\\"renewTime\\\":\\\"2019-06-06T12:35:53Z\\\",\\\"leaderTransitions\\\":4}z\\000\\032\\000\\\"\\000\" > > failure:<request_range:<key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" > > > " took too long (137.663008ms) to execute
giu 06 14:35:54 hoseplavm etcd[25682]: read-only range request "key:\"/registry/pods/kube-system/heapster-v1.5.2-6b5d7b57f9-pf9jr\" " took too long (741.166459ms) to execute
Do CPU and disk utilization look healthy?
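For a rough check, something like the following is usually enough (iostat comes from the sysstat package, which may need to be installed):
# overall CPU load
top -b -n 1 | head -20
# per-device disk utilization and latency, sampled once per second for 5 samples
iostat -xz 1 5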
For LXD we use a few profiles (not recommended, as they break the isolation): https://github.com/ubuntu/microk8s/tree/master/tests/lxc
I retried a start from a powered-off VM. I didn't see much disk activity on the host. vda1 is the only partition on the VM that shows activity, so I prepared the graph below.
AFAIK it seems normal. The report is inspection-report-20190606_184438.tar.gz
I'll do another test with 4 processors instead of 8 to reduce parallelism.
(OK, for now I won't try LXD.)
Well, also with 4 processors the result is the same.
I tried to reset, but the command never ends and the console log is:
sysop@hoseplavm:~/Immagini$ microk8s.reset
Calling clean_cluster
Cleaning resources in namespace default
endpoints "kubernetes" deleted
event "hoseplavm.15a5a8f07c462d2f" deleted
event "hoseplavm.15a5a8f13b943b52" deleted
event "hoseplavm.15a5a8f13bd2568d" deleted
event "hoseplavm.15a5a8f1410d1bb0" deleted
event "hoseplavm.15a5a8f1410d3c66" deleted
event "hoseplavm.15a5a8f1410d4bab" deleted
event "hoseplavm.15a5a8f14124aa79" deleted
event "hoseplavm.15a5a8f170909b85" deleted
event "hoseplavm.15a5a8f62b369d16" deleted
event "hoseplavm.15a5a9b388013d8e" deleted
event "hoseplavm.15a5a9b95b13b907" deleted
event "hoseplavm.15a5a9c2ab940dab" deleted
event "hoseplavm.15a5ab3f0b19a239" deleted
event "hoseplavm.15a5ab44d3096c15" deleted
event "hoseplavm.15a5ab8f4ed766af" deleted
event "hoseplavm.15a5ab94c86afd90" deleted
secret "default-token-j7gsr" deleted
serviceaccount "default" deleted
service "kubernetes" deleted
Cleaning resources in namespace kube-node-lease
secret "default-token-vflrh" deleted
serviceaccount "default" deleted
lease.coordination.k8s.io "hoseplavm" deleted
Cleaning resources in namespace kube-public
secret "default-token-6xfpz" deleted
serviceaccount "default" deleted
Cleaning resources in namespace kube-system
configmap "eventer-config" deleted
configmap "extension-apiserver-authentication" deleted
configmap "heapster-config" deleted
configmap "kube-dns" deleted
configmap "kubernetes-dashboard-settings" deleted
endpoints "heapster" deleted
endpoints "kube-controller-manager" deleted
endpoints "kube-dns" deleted
endpoints "kube-scheduler" deleted
endpoints "kubernetes-dashboard" deleted
endpoints "monitoring-grafana" deleted
endpoints "monitoring-influxdb" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f1d7ff6975" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f23005a4dc" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f4ca1d2cce" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f4f1fdff3c" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f4f206f594" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f788bdc563" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f7c9acb807" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8f7c9c0d436" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8fa07a7cbbc" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8fa2c780598" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8fa2c8304f2" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8fb1d69d331" deleted
event "heapster-v1.5.2-6b5d7b57f9-pf9jr.15a5a8fb385b4813" deleted
event "kube-controller-manager.15a5a8f5a441a273" deleted
event "kube-controller-manager.15a5a9b84c19c2f3" deleted
event "kube-controller-manager.15a5ab44724d4719" deleted
event "kube-controller-manager.15a5ab945e6803d2" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8f1704f2e60" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8f4e4da5ef1" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8f73e9b50a7" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8f7a0e94351" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8f7a0f672d3" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8fa2819a5ba" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8fa5a15a506" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8fa5a22d5c1" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8fb283a5452" deleted
event "kube-dns-6bfbdd666c-stb78.15a5a8fb3cb2bca9" deleted
event "kube-scheduler.15a5a8f5a44249d6" deleted
event "kube-scheduler.15a5a9b9b7c60913" deleted
event "kube-scheduler.15a5ab44118e2536" deleted
event "kube-scheduler.15a5ab94b114c701" deleted
event "kubernetes-dashboard-6fd7f9c494-fwgk7.15a5a8f187f43914" deleted
event "kubernetes-dashboard-6fd7f9c494-fwgk7.15a5a8f2221c6fb9" deleted
event "kubernetes-dashboard-6fd7f9c494-fwgk7.15a5a8f4bce6d531" deleted
event "kubernetes-dashboard-6fd7f9c494-fwgk7.15a5a8f4e2f3a18b" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f18635b7c7" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f228ee5941" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f4c1fab562" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f4e7c3e335" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f4e7ce08db" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f7b0017ab4" deleted
event "monitoring-influxdb-grafana-v4-78777c64c8-29vrc.15a5a8f7fa56e2d0" deleted
pod "heapster-v1.5.2-6b5d7b57f9-pf9jr" deleted
pod "kube-dns-6bfbdd666c-stb78" deleted
pod "kubernetes-dashboard-6fd7f9c494-fwgk7" deleted
pod "monitoring-influxdb-grafana-v4-78777c64c8-29vrc" deleted
secret "default-token-bwthm" deleted
secret "heapster-token-lskt7" deleted
secret "kube-dns-token-rtssh" deleted
secret "kubernetes-dashboard-certs" deleted
secret "kubernetes-dashboard-key-holder" deleted
secret "kubernetes-dashboard-token-5s9tw" deleted
serviceaccount "default" deleted
serviceaccount "heapster" deleted
serviceaccount "kube-dns" deleted
serviceaccount "kubernetes-dashboard" deleted
service "heapster" deleted
service "kube-dns" deleted
service "kubernetes-dashboard" deleted
service "monitoring-grafana" deleted
service "monitoring-influxdb" deleted
deployment.apps "heapster-v1.5.2" deleted
deployment.apps "kube-dns" deleted
deployment.apps "kubernetes-dashboard" deleted
deployment.apps "monitoring-influxdb-grafana-v4" deleted
event.events.k8s.io "heapster-v1.5.2-6b5d7b57f9-58r6x.15a5abb4020ea2a2" deleted
event.events.k8s.io "heapster-v1.5.2-6b5d7b57f9-58r6x.15a5abb9e8c1341d" deleted
event.events.k8s.io "heapster-v1.5.2-6b5d7b57f9.15a5abb402114b94" deleted
event.events.k8s.io "kube-dns-6bfbdd666c-rcppl.15a5abb45ba14c2e" deleted
event.events.k8s.io "kube-dns-6bfbdd666c-rcppl.15a5abb9e8ba4648" deleted
event.events.k8s.io "kube-dns-6bfbdd666c.15a5abb426fd32c6" deleted
event.events.k8s.io "kubernetes-dashboard-6fd7f9c494-v5m7v.15a5abb494d2790d" deleted
event.events.k8s.io "kubernetes-dashboard-6fd7f9c494-v5m7v.15a5abba3c36b325" deleted
event.events.k8s.io "kubernetes-dashboard-6fd7f9c494.15a5abb4796c3268" deleted
event.events.k8s.io "monitoring-influxdb-grafana-v4-78777c64c8-vdfzp.15a5abb4a4f4b552" deleted
event.events.k8s.io "monitoring-influxdb-grafana-v4-78777c64c8-vdfzp.15a5abba5f3f67fe" deleted
event.events.k8s.io "monitoring-influxdb-grafana-v4-78777c64c8.15a5abb4a0fdd090" deleted
After this nothing happens, and I see many processes related to k8s:
sysop@hoseplavm:~$ ps -elf|grep k8
4 S root 1167 1 0 80 0 - 2648267 - 19:17 ? 00:00:20 /snap/microk8s/608/etcd --data-dir=/var/snap/microk8s/common/var/run/etcd --advertise-client-urls=unix://etcd.socket:2379 --listen-client-urls=unix://etcd.socket:2379
4 S root 1170 1 0 80 0 - 5370 - 19:17 ? 00:00:01 /bin/bash /snap/microk8s/608/apiservice-kicker
4 S root 1173 1 1 80 0 - 54456 - 19:17 ? 00:00:27 /snap/microk8s/608/kube-controller-manager --master=http://127.0.0.1:8080 --service-account-private-key-file=/var/snap/microk8s/608/certs/serviceaccount.key --root-ca-file=/var/snap/microk8s/608/certs/ca.crt --cluster-signing-cert-file=/var/snap/microk8s/608/certs/ca.crt --cluster-signing-key-file=/var/snap/microk8s/608/certs/ca.key --address=127.0.0.1
4 S root 1182 1 0 80 0 - 35761 - 19:17 ? 00:00:05 /snap/microk8s/608/kube-scheduler --master=http://127.0.0.1:8080 --address=127.0.0.1
4 S root 2206 1 1 80 0 - 101745 - 19:17 ? 00:00:41 /snap/microk8s/608/kube-apiserver --insecure-bind-address=127.0.0.1 --cert-dir=/var/snap/microk8s/608/certs --etcd-servers=unix://etcd.socket:2379 --service-cluster-ip-range=10.152.183.0/24 --authorization-mode=AlwaysAllow --basic-auth-file=/var/snap/microk8s/608/credentials/basic_auth.csv --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota --service-account-key-file=/var/snap/microk8s/608/certs/serviceaccount.key --client-ca-file=/var/snap/microk8s/608/certs/ca.crt --tls-cert-file=/var/snap/microk8s/608/certs/server.crt --tls-private-key-file=/var/snap/microk8s/608/certs/server.key --kubelet-client-certificate=/var/snap/microk8s/608/certs/server.crt --kubelet-client-key=/var/snap/microk8s/608/certs/server.key --secure-port=16443 --insecure-port=8080 --requestheader-client-ca-file=/var/snap/microk8s/608/certs/ca.crt
4 S root 2221 1 0 80 0 - 35003 - 19:17 ? 00:00:01 /snap/microk8s/608/kube-proxy --master=http://127.0.0.1:8080 --cluster-cidr=10.152.183.0/24 --kubeconfig=/snap/microk8s/608/kubeproxy.config --proxy-mode=userspace --healthz-bind-address=127.0.0.1
4 S sysop 6078 4376 0 80 0 - 3257 wait 19:19 pts/1 00:00:00 /bin/bash /snap/microk8s/608/microk8s-reset.wrapper
4 S root 6108 1 0 80 0 - 212987 - 19:19 ? 00:00:02 /snap/microk8s/608/bin/containerd --config /var/snap/microk8s/608/args/containerd.toml --root /var/snap/microk8s/common/var/lib/containerd --state /var/snap/microk8s/common/run/containerd --address /var/snap/microk8s/common/run/containerd.sock
0 S sysop 6587 6078 0 80 0 - 36703 futex_ 19:20 pts/1 00:00:01 /snap/microk8s/608/kubectl --kubeconfig=/snap/microk8s/608/client.config delete --all configmaps,endpoints,events,limitranges,persistentvolumeclaims,pods,podtemplates,replicationcontrollers,resourcequotas,secrets,serviceaccounts,services,controllerrevisions.apps,daemonsets.apps,deployments.apps,replicasets.apps,statefulsets.apps,horizontalpodautoscalers.autoscaling,cronjobs.batch,jobs.batch,leases.coordination.k8s.io,events.events.k8s.io,daemonsets.extensions,deployments.extensions,ingresses.extensions,networkpolicies.extensions,replicasets.extensions,ingresses.networking.k8s.io,networkpolicies.networking.k8s.io,poddisruptionbudgets.policy,rolebindings.rbac.authorization.k8s.io,roles.rbac.authorization.k8s.io --namespace=kube-system
0 R sysop 14164 4352 0 80 0 - 3609 - 19:52 pts/0 00:00:00 grep --color=auto k8
sysop@hoseplavm:~$
Are there other things I can try?
Since storage is OK and I still see etcd complaining, it might be some kind of corruption in the data store. Unfortunately I can only think of a removal and re-installation.
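In practice the removal and re-installation would be roughly the following (take a backup of anything you need first, since removing the snap also removes its data under /var/snap/microk8s):
sudo snap remove microk8s
sudo snap install microk8s --classic
microk8s.status --wait-ready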
Tried and failed again. Steps to reproduce:
Now I will do different attempts, changing one thing at a time. I'll report ASAP.
First try: disable swap in the VM and do not install any addons. SUCCESS: it survived two reboots.
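For reference, disabling swap in the VM amounts to something like the following (assuming the swap entry lives in /etc/fstab):
sudo swapoff -a
# comment out the swap line so it does not come back at the next reboot
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab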
I see this
sysop@hoseplavm:~$ microk8s.inspect
[sudo] password di sysop:
Inspecting services
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver is running
Service snap.microk8s.daemon-proxy is running
Service snap.microk8s.daemon-kubelet is running
Service snap.microk8s.daemon-scheduler is running
Service snap.microk8s.daemon-controller-manager is running
Service snap.microk8s.daemon-etcd is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system info
Copy network configuration to the final report tarball
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Inspect kubernetes cluster
Building the report tarball
Report tarball is at /var/snap/microk8s/608/inspection-report-20190610_120920.tar.gz
sysop@hoseplavm:~$ sudo cat /proc/swaps
Filename Type Size Used Priority
sysop@hoseplavm:~$
sysop@hoseplavm:~$ microk8s.status
microk8s is running
addons:
jaeger: disabled
fluentd: disabled
gpu: disabled
storage: disabled
registry: disabled
rbac: disabled
ingress: disabled
dns: disabled
metrics-server: disabled
linkerd: disabled
prometheus: disabled
istio: disabled
dashboard: disabled
sysop@hoseplavm:~$
Now I'll try to enable dns and dashboard (report ASAP).
Sadly I got the same error after the second reboot.
Inspect says
sysop@hoseplavm:~$ microk8s.inspect
[sudo] password di sysop:
Inspecting services
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver is running
Service snap.microk8s.daemon-proxy is running
FAIL: Service snap.microk8s.daemon-kubelet is not running
For more details look at: sudo journalctl -u snap.microk8s.daemon-kubelet
Service snap.microk8s.daemon-scheduler is running
Service snap.microk8s.daemon-controller-manager is running
Service snap.microk8s.daemon-etcd is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system info
Copy network configuration to the final report tarball
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Inspect kubernetes cluster
Building the report tarball
Report tarball is at /var/snap/microk8s/608/inspection-report-20190610_162916.tar.gz
sysop@hoseplavm:~$
Attached is the report: inspection-report-20190610_162916.tar.gz
BTW, when I installed microk8s there was still a lot of etcd-related I/O activity after three hours. Is this considered normal?
It is strange that you have two etcd processes running.
Would it be possible to create a VM on a non-ZFS substrate? I cannot reproduce this, and I would like to check that ZFS is not an issue.
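As a quick check on the current VM, the following should show which filesystem actually backs the microk8s data directory (from the df output above it looks like ext4 on /dev/vda1, with ZFS only used by LXD):
# filesystem type and usage of the microk8s data directory
stat -f -c %T /var/snap/microk8s/common
df -hT /var/snap/microk8s/common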
OK, I'll report as soon as the machine is ready.
I prepared the new machine:
What is missing from the other machine:
Now I'll take a snapshot and proceed with the tests
BTW do you have any reference definition for creating a VM to use?
Well, microk8s without addons survives two reboots. The etcd-related processes are always 2:
sysop@hoseplavm1:~$ microk8s.inspect
[sudo] password for sysop:
Inspecting services
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver is running
Service snap.microk8s.daemon-proxy is running
Service snap.microk8s.daemon-kubelet is running
Service snap.microk8s.daemon-scheduler is running
Service snap.microk8s.daemon-controller-manager is running
Service snap.microk8s.daemon-etcd is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system info
Copy network configuration to the final report tarball
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Inspect kubernetes cluster
Building the report tarball
Report tarball is at /var/snap/microk8s/608/inspection-report-20190610_201053.tar.gz
sysop@hoseplavm1:~$ ps -elf|grep etcd
4 S root 795 1 8 80 0 - 101553 - 20:08 ? 00:00:10 /snap/microk8s/608/kube-apiserver --insecure-bind-address=127.0.0.1 --cert-dir=/var/snap/microk8s/608/certs --etcd-servers=unix://etcd.socket:2379 --service-cluster-ip-range=10.152.183.0/24 --authorization-mode=AlwaysAllow --basic-auth-file=/var/snap/microk8s/608/credentials/basic_auth.csv --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota --service-account-key-file=/var/snap/microk8s/608/certs/serviceaccount.key --client-ca-file=/var/snap/microk8s/608/certs/ca.crt --tls-cert-file=/var/snap/microk8s/608/certs/server.crt --tls-private-key-file=/var/snap/microk8s/608/certs/server.key --kubelet-client-certificate=/var/snap/microk8s/608/certs/server.crt --kubelet-client-key=/var/snap/microk8s/608/certs/server.key --secure-port=16443 --insecure-port=8080 --requestheader-client-ca-file=/var/snap/microk8s/608/certs/ca.crt
4 S root 797 1 2 80 0 - 2634539 - 20:08 ? 00:00:03 /snap/microk8s/608/etcd --data-dir=/var/snap/microk8s/common/var/run/etcd --advertise-client-urls=unix://etcd.socket:2379 --listen-client-urls=unix://etcd.socket:2379
0 S sysop 2679 2228 0 80 0 - 3608 pipe_w 20:10 pts/1 00:00:00 grep --color=auto etcd
sysop@hoseplavm1:~$
Now I'll install the dns and dashboard addons and repeat the test.
FAILED at the second reboot. After I installed the dns and dashboard addons the problem appeared again.
sysop@hoseplavm1:~$ microk8s.inspect
[sudo] password for sysop:
Inspecting services
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver is running
Service snap.microk8s.daemon-proxy is running
FAIL: Service snap.microk8s.daemon-kubelet is not running
For more details look at: sudo journalctl -u snap.microk8s.daemon-kubelet
Service snap.microk8s.daemon-scheduler is running
Service snap.microk8s.daemon-controller-manager is running
Service snap.microk8s.daemon-etcd is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system info
Copy network configuration to the final report tarball
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Inspect kubernetes cluster
Building the report tarball
Report tarball is at /var/snap/microk8s/608/inspection-report-20190610_204026.tar.gz
sysop@hoseplavm1:~$ rcp /var/snap/microk8s/608/inspection-report-20190610_204026.tar.gz mirto@192.168.201.1://home/mirto
mirto@192.168.201.1's password:
inspection-report-20190610_204026.tar.gz 100% 335KB 30.3MB/s 00:00
sysop@hoseplavm1:~$ ps -elf|grep etcd
4 S root 798 1 1 80 0 - 2635724 - 20:27 ? 00:00:14 /snap/microk8s/608/etcd --data-dir=/var/snap/microk8s/common/var/run/etcd --advertise-client-urls=unix://etcd.socket:2379 --listen-client-urls=unix://etcd.socket:2379
4 S root 2076 1 2 80 0 - 101553 - 20:27 ? 00:00:24 /snap/microk8s/608/kube-apiserver --insecure-bind-address=127.0.0.1 --cert-dir=/var/snap/microk8s/608/certs --etcd-servers=unix://etcd.socket:2379 --service-cluster-ip-range=10.152.183.0/24 --authorization-mode=AlwaysAllow --basic-auth-file=/var/snap/microk8s/608/credentials/basic_auth.csv --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota --service-account-key-file=/var/snap/microk8s/608/certs/serviceaccount.key --client-ca-file=/var/snap/microk8s/608/certs/ca.crt --tls-cert-file=/var/snap/microk8s/608/certs/server.crt --tls-private-key-file=/var/snap/microk8s/608/certs/server.key --kubelet-client-certificate=/var/snap/microk8s/608/certs/server.crt --kubelet-client-key=/var/snap/microk8s/608/certs/server.key --secure-port=16443 --insecure-port=8080 --requestheader-client-ca-file=/var/snap/microk8s/608/certs/ca.crt
0 S sysop 7017 5807 0 80 0 - 3608 pipe_w 20:42 pts/1 00:00:00 grep --color=auto etcd
sysop@hoseplavm1:~$
And here is the report
inspection-report-20190610_204026.tar.gz
What can I try?
I need to reproduce your setup. Here is what I do; please give the following a try and tell me if you see the issue:
# Create a VM (2 CPUs and 4 GB of RAM). Multipass will take care of qemu and grabbing the right image
multipass launch ubuntu -n testvm -c 2 -m 4G
# Enter the VM
multipass shell testvm
# inside the VM install microk8s and enable the addons
> sudo snap install microk8s --classic
# Wait for microk8s to become ready
> microk8s.status --wait-ready
> microk8s.enable dns dashboard
# Wait to see the pods running
> watch microk8s.kubectl get all --all-namespaces
# Exit the VM with ctrl-D and reboot it
multipass stop testvm
multipass start testvm
# Enter the VM
multipass shell testvm
# Wait for microk8s to become ready
> microk8s.status --wait-ready
# Wait to see the pods running
> watch microk8s.kubectl get all --all-namespaces
Could you also please share the scripts you have to create the VMs? Thank you
Well, my notebook is my home/office machine, so I cannot risk corrupting it. I hope that executing the commands in the VM described above is fine.
[Where can I find the multipass command/package?] Update: I suppose you mean the snap package.
About VM scripts: I don't use scripts, because I use the virt-manager GUI. If it can be useful, I can share /var/lib/libvirt (obviously excluding images and snapshots). I'll try and report.
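If it helps, I suppose I can also export the VM definition itself with virsh; something like the following (the domain name is whatever virsh list shows on the host):
# on the KVM host: list domains and dump the XML of the one running microk8s
virsh list --all
virsh dumpxml <domain-name> > microk8s-vm.xml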
Started with
sysop@hoseplavm1:~$ sudo snap install multipass --beta --classic
2019-06-11T12:41:47+02:00 INFO Waiting for restart...
multipass (beta) 0.7.0 from Canonical✓ installed
sysop@hoseplavm1:~$ multipass launch ubuntu -n testvm -c 2 -m 4G
launch failed: CPU does not support KVM extensions.
sysop@hoseplavm1:~$
So I installed
sudo apt-get install qemu-kvm libvirt-bin ubuntu-vm-builder bridge-utils
Not sufficient:
sysop@hoseplavm1:~$ multipass launch ubuntu -n testvm -c 2 -m 4G
One quick question before we launch … Would you like to help
the Multipass developers, by sending anonymous usage data?
This includes your operating system, which images you use,
the number of instances, their properties and how long you use them.
We’d also like to measure Multipass’s speed.
Send usage data (yes/no/Later)? yes
Thank you!
launch failed: CPU does not support KVM extensions.
sysop@hoseplavm1:~$ egrep -c '(vmx|svm)' /proc/cpuinfo
0
sysop@hoseplavm1:~$ kvm-ok
INFO: Your CPU does not support KVM extensions
INFO: For more detailed results, you should run this as root
HINT: sudo /usr/sbin/kvm-ok
sysop@hoseplavm1:~$ sudo kvm-ok
[sudo] password for sysop:
INFO: Your CPU does not support KVM extensions
KVM acceleration can NOT be used
sysop@hoseplavm1:~$ sudo apt install virt-manager
Trying to use virt-manager
sysop@hoseplavm1:~$ virsh list
Id Name State
----------------------------------------------------
1 generic running
sysop@hoseplavm1:~$ multipass launch ubuntu -n testvm -c 2 -m 4G
launch failed: CPU does not support KVM extensions.
sysop@hoseplavm1:~$
It fails again. As you can see, virt-manager was able to create a VM, but multipass complains that there is no hardware acceleration.
I suppose you can use multipass only on bare metal.
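If nested virtualization would help, I could probably enable it on the physical host, roughly like this (Intel assumed since the host is an i7; the VM would also need a CPU mode such as host-passthrough in virt-manager):
# on the physical host, with all VMs shut down: check and enable nested KVM
cat /sys/module/kvm_intel/parameters/nested
echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm-nested.conf
sudo modprobe -r kvm_intel && sudo modprobe kvm_intel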
BTW, if you can say which kernel is used in your configuration, I can try to set up a VM with that kernel.
@MirtoBusico I am on 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Can you lead me through your setup process? How can I create the same (kind of) machines you are using for microk8s? Please provide as much detail as possible. Thank you.
@ktsakalozos I am on
Linux mirto-P65 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
on the laptop machine and the same on the vm.
I have my step-by-step instruction document, but it is in Italian. Please wait and I'll create an English version.
Hi, it is more work than I supposed. The attached manual ends at the system update; tomorrow I'll complete the document.
Hi @ktsakalozos, another failure at the second reboot. Here are the instructions to reproduce: microk8s_base_vm_V2.pdf
Here is the report: inspection-report-20190612_131519.tar.gz
Tell me if you need other details
I also tried with a VM with Ubuntu Server 18.04.2 installed. uname says:
sysop@hoseplamono:~$ uname -a
Linux hoseplamono 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
sysop@hoseplamono:~$
This fails at the first reboot
I do not know what to tell you, @MirtoBusico. I partially followed your setup (minus the network configuration) and all services were coming up with no issues.
Hi @ktsakalozos I'm starting to suspect some kind of hardware problem.
Please can we compare the hardware we used for the test?
I'm using an old Santech C37 notebook. For storage I have two 2TB Seagate Barracuda HDDs. The processor is an Intel i7, and RAM is 32GB. The video card is an Nvidia, using proprietary drivers.
What hardware are you using?
Don't know if this can be useful, but I ended up with a similar problem (an endless install that never completed) using a KVM machine (satisfying the requirements except for using an SSD) and trying to install on local LXD.
@MirtoBusico I am also on an i7, with 16GB of RAM and an SSD. Is it possible the shutdown you are performing is a forced power-off?
Thanks @ktsakalozos. I always use the shutdown command from inside the VM.
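To be precise, by "the shutdown command" I mean a graceful shutdown from inside the guest, something like:
# inside the guest: graceful shutdown, not a forced power-off
sudo shutdown -h now
# on the host, virsh shutdown <domain> would be the graceful equivalent,
# while virsh destroy <domain> would be the forced power-off you are asking about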
But what you said about network is interesting.
I use a static IP (see page 16 of the manual) and the networkd daemon.
If I'm correct the standard is:
Instead for me:
I'll try ASAP to create a VM from scratch with the standard configuration and report the results.
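For reference, the static/networkd setup mentioned above was along these lines; the interface name, gateway and file name below are made up for illustration, the real values are in the PDF:
# hypothetical reconstruction of the previous static/networkd netplan
cat <<'EOF' | sudo tee /etc/netplan/01-static.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: no
      addresses: [192.168.202.10/24]
      gateway4: 192.168.202.1
EOF
sudo netplan apply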
Well, another failure: at the second startup the kubelet fails
Tried:
The netplan definition is:
sysop@testmicrok8s:~$ cat /etc/netplan/01-network-manager-all.yaml
# Let NetworkManager manage all devices on this system
network:
version: 2
renderer: NetworkManager
sysop@testmicrok8s:~$
Do not know what else to try.
SUCCESS. The problem disappeared. After installing Ubuntu 18.04.2 in a new virtual machine and updating the system as of 8 July 2019, the microk8s installation survived 4 reboots and a shutdown/power-on cycle.
Every 2.0s: microk8s.kubectl get all --all-namespaces k3s-master: Mon Jul 8 12:56:51 2019
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-f7867546d-rzg56 1/1 Running 3 77m
kube-system pod/heapster-v1.5.2-844b564688-grngn 4/4 Running 12 72m
kube-system pod/kubernetes-dashboard-7d75c474bb-mbrjb 1/1 Running 3 76m
kube-system pod/monitoring-influxdb-grafana-v4-6b6954958c-c7m2x 2/2 Running 6 76m
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 79m
kube-system service/heapster ClusterIP 10.152.183.43 <none> 80/TCP 76m
kube-system service/kube-dns ClusterIP 10.152.183.10 <none> 53/UDP,53/TCP,9153/TCP 77m
kube-system service/kubernetes-dashboard ClusterIP 10.152.183.118 <none> 443/TCP 76m
kube-system service/monitoring-grafana ClusterIP 10.152.183.116 <none> 80/TCP 76m
kube-system service/monitoring-influxdb ClusterIP 10.152.183.129 <none> 8083/TCP,8086/TCP 76m
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/coredns 1/1 1 1 77m
kube-system deployment.apps/heapster-v1.5.2 1/1 1 1 76m
kube-system deployment.apps/kubernetes-dashboard 1/1 1 1 76m
kube-system deployment.apps/monitoring-influxdb-grafana-v4 1/1 1 1 76m
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/coredns-f7867546d 1 1 1 77m
kube-system replicaset.apps/heapster-v1.5.2-6b794f77c8 0 0 0 76m
kube-system replicaset.apps/heapster-v1.5.2-6f5d55456 0 0 0 73m
kube-system replicaset.apps/heapster-v1.5.2-844b564688 1 1 1 72m
kube-system replicaset.apps/kubernetes-dashboard-7d75c474bb 1 1 1 76m
kube-system replicaset.apps/monitoring-influxdb-grafana-v4-6b6954958c 1 1 1 76m
I don't know what changed.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi all,
after a reboot I get a "FAIL: Service snap.microk8s.daemon-kubelet is not running" error.
How can I start snap.microk8s.daemon-kubelet?
Is it safe, or does this indicate some kind of problem?
Inspect says:
Snap says:
The OS is Kubuntu 18.04.
journalctl.txt inspection-report-20190606_123226.tar.gz