Closed cwoolum closed 3 years ago
I1202 20:45:13.065564 8976 trace.go:116] Trace[1118593115]: "Update" url:/api/v1/namespaces/kube-system/endpoints/rancher.io-local-path,user-agent:local-path-provisioner/v0.0.0 (linux/arm64) kubernetes/$Format,client:10.42.0.5 (started: 2020-12-02 20:45:01.181165632 +0000 UTC m=+83.314597930) (total time: 11.884231395s):
Trace[1118593115]: [11.884231395s] [11.883804717s] END
E1202 20:45:11.696999 8976 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/cloud-controller-manager: context deadline exceeded
I1202 20:45:11.697467 8976 trace.go:116] Trace[463620426]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-02 20:45:01.74087159 +0000 UTC m=+83.874303888) (total time: 9.956464373s):
Trace[463620426]: [9.039221883s] [9.036212381s] Transaction committed
Trace[463620426]: [9.956464373s] [915.207378ms] END
E1202 20:45:11.953410 8976 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: context deadline exceeded
I1202 20:45:11.954804 8976 trace.go:116] Trace[22746264]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-02 20:45:01.99803945 +0000 UTC m=+84.131471747) (total time: 9.956183886s):
Trace[22746264]: [9.003243307s] [9.00081496s] Transaction committed
Trace[22746264]: [9.956183886s] [950.672304ms] END
I1202 20:45:12.000414 8976 trace.go:116] Trace[683943658]: "Update" url:/api/v1/namespaces/kube-system/endpoints/kube-controller-manager,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf/leader-election,client:127.0.0.1 (started: 2020-12-02 20:45:01.997261574 +0000 UTC m=+84.130693872) (total time: 10.002829094s):
Trace[683943658]: [10.002829094s] [10.002248863s] END
Your SD card is unable to support both workload and control-plane IO. It is overloaded to the point where datastore writes are taking > 10 seconds. After 10 seconds many internal Kubernetes components will exceed a hardcoded timeout and fatally terminate the process.
You might try adding some external storage for /var/lib/rancher; a USB-attached SSD is your best bet.
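On a plain Linux install that could look roughly like the following (a sketch only; /dev/sda1 is an assumed device name, and k3os handles configuration differently, as discussed below):

```sh
# Sketch: format a USB-attached SSD (this erases it) and mount it over the k3s
# data directory before starting k3s. Check lsblk for the actual device name.
mkfs.ext4 /dev/sda1
mkdir -p /var/lib/rancher
echo '/dev/sda1 /var/lib/rancher ext4 defaults 0 2' >> /etc/fstab
mount /var/lib/rancher
```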
I swapped my SD Cards. I am now using a SanDisk Extreme Plus on the master node which fixed the problem. The agent node is now using the SanDisk Ultra. I didn't realize there was such a huge performance difference but it's very noticeable. Thanks for the help!
This was working until I deployed all of my workloads. After deploying the Prometheus Operator, etcd was super overloaded again. I do have an external SSD mapped to /mnt/ssd. If I'm using k3os, will I be able to symlink /var/lib/rancher to the SSD in boot_cmd, or are the k3s and k3os folders pre-existing in that folder?
I believe you should be able to do that; give it a try!
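A minimal sketch of what that might look like as boot_cmd entries, assuming the SSD is already mounted at /mnt/ssd (untested; whether k3s tolerates /var/lib/rancher being a symlink is an open question):

```sh
# Hypothetical boot_cmd commands: point /var/lib/rancher at the SSD.
mkdir -p /mnt/ssd/rancher
[ -e /var/lib/rancher ] || ln -s /mnt/ssd/rancher /var/lib/rancher
```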
Unfortunately, I don't think symlinking that folder will work with k3os. I'll see if the k3os team can help me.
I'm a bit surprised that this SD Card isn't fast enough to host this. Based on specs, it seems plenty performant.
Prometheus really wants SSD. I can't even get away with running it on rotational disk without it DoSing other workloads on the same volume.
I just want to confirm that we are talking about storage for etcd and not persistent volume storage for the pods. I am already running all of the persistent storage on the SSD as an NFS share.
Ah, I had missed that you had already split off the workload volumes.
If you're only ever going to run with a single server, you might stick with sqlite instead of etcd. etcd doesn't need especially high throughput, but it blocks on an fsync after every write to the datastore. This makes it extremely latency-sensitive, and also more likely to burn through the write cycles on your SD card.
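In terms of the flags already used in this issue, the difference is roughly the following (a sketch; the invocation style on k3os and the exact embedded HA backend depend on the k3s version):

```sh
# No datastore flags: k3s falls back to its default embedded sqlite datastore
# (single server only, but far more forgiving of slow storage).
k3s server --disable=traefik,servicelb

# --cluster-init enables the embedded HA datastore (etcd on current releases),
# which fsyncs on every write and expects low-latency disks.
k3s server --cluster-init --disable=traefik,servicelb
```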
I currently have 2 Raspberry Pi 4 nodes and wanted to add a 3rd and eventually move to the experimental HA embedded etcd. In the current config, I have one server and one agent.
Do you recommend against going with that approach? I was hoping to avoid setting up a separate Pi just to host a db.
You can certainly try it, but keep in mind that we haven't done any tuning of etcd for low-end systems so all of its normal hardware requirements are still valid: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#disks
Okay, I've added a new Raspberry Pi 4 with PostgreSQL on it as a backing store, but Kubernetes is still crashing. The PostgreSQL database is on an SSD. Here are the logs from the cluster:
E1215 22:57:12.954647 26145 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cloud-controller-manager: context deadline exceeded
I1215 22:57:12.954927 26145 leaderelection.go:277] failed to renew lease kube-system/cloud-controller-manager: timed out waiting for the condition
F1215 22:57:12.955094 26145 controllermanager.go:213] leaderelection lost
time="2020-12-15T22:57:24.334862526Z" level=info msg="Starting k3s v1.18.9+k3s1 (630bebf9)"
time="2020-12-15T22:57:24.338415930Z" level=info msg="Cluster bootstrap already complete"
time="2020-12-15T22:57:24.626516656Z" level=info msg="Kine listening on unix://kine.sock"
time="2020-12-15T22:57:24.630327167Z" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --basic-auth-file=/var/lib/rancher/k3s/server/cred/passwd --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --enable-admission-plugins=NodeRestriction --etcd-servers=unix://kine.sock --insecure-port=0 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=k3s --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.43.0.0/16 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Flag --basic-auth-file has been deprecated, Basic authentication mode is deprecated and will be removed in a future release. It is not recommended for production environments.
I1215 22:57:24.635041 26490 server.go:645] external host was not specified, using 192.168.40.2
I1215 22:57:24.636431 26490 server.go:162] Version: v1.18.9+k3s1
I1215 22:57:24.665884 26490 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1215 22:57:24.666054 26490 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
I1215 22:57:24.672931 26490 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1215 22:57:24.673100 26490 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
I1215 22:57:24.856223 26490 master.go:270] Using reconciler: lease
I1215 22:57:25.198710 26490 rest.go:113] the default service ipfamily for this cluster is: IPv4
I1215 22:57:25.782205 26490 trace.go:116] Trace[1045366144]: "List etcd3" key:/pods,resourceVersion:,limit:10000,continue: (started: 2020-12-15 22:57:25.11095305 +0000 UTC m=+2.088864669) (total time: 671.087107ms):
Trace[1045366144]: [671.087107ms] [671.087107ms] END
These are the parameters I'm passing to k3s
server --datastore-endpoint=postgres://k8s:redacted@192.168.40.1/k3s --disable=traefik,servicelb
Can you attach the full logs? There's not really enough there to see what's going on before or after the restart.
Here are the last few hours. Let me know if you need more. The full log file was 96 MB so I needed to slim it down.
I'm seeing things related to etcd3 in the logs. Is that just a labeling issue? Etcd shouldn't actually be used in this configuration, should it?
Thanks, I'll take a look at the logs. Kine is still etcd as far as Kubernetes is concerned.
Is there any reason you're still on 1.18.9? There have been quite a few changes since then, including some that improve performance of the kine storage engine. If you're running everything on one host, I would actually recommend sticking with sqlite as opposed to adding the overhead of postgres or mysql.
It looks like your system is still having trouble keeping up with datastore operations. Can you share any more info on how you have your system partitioned? What is on the SSD, what is on the SD card? What does system load and IOPS look like when this occurs?
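For example, something along these lines while the problem is happening (a sketch; iostat comes from the sysstat package, which may not be available on k3os):

```sh
# Watch per-device IOPS, throughput, and latency every 5 seconds.
iostat -x 5
# Check load averages at the same time.
uptime
```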
I1215 22:46:39.910212 28937 trace.go:116] Trace[44432836]: "Get" url:/api/v1/namespaces/kube-system/configmaps/cert-manager-cainjector-leader-election,user-agent:cainjector/v0.0.0 (linux/arm64) kubernetes/$Format/leader-election,client:192.168.40.3 (started: 2020-12-15 22:46:38.222428524 +0000 UTC m=+4212.259529000) (total time: 1.687353158s):
Trace[44432836]: [1.685838702s] [1.685789333s] About to write a response
E1215 22:46:40.431953 28937 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler?timeout=10s: context deadline exceeded
I1215 22:46:40.433098 28937 leaderelection.go:277] failed to renew lease kube-system/kube-scheduler: timed out waiting for the condition
F1215 22:46:40.433422 28937 server.go:244] leaderelection lost
E1215 22:46:47.083523 28937 controller.go:178] failed to update node lease, error: Put https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-1?timeout=10s: context deadline exceeded
E1215 22:46:40.790278 28937 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cloud-controller-manager: context deadline exceeded
I1215 22:46:48.487898 28937 leaderelection.go:277] failed to renew lease kube-system/cloud-controller-manager: timed out waiting for the condition
F1215 22:46:48.488397 28937 controllermanager.go:213] leaderelection lost
I1215 22:46:41.039250 28937 trace.go:116] Trace[1401285353]: "GuaranteedUpdate etcd3" type:*coordination.Lease (started: 2020-12-15 22:46:37.687045851 +0000 UTC m=+4211.724146383) (total time: 3.351867763s):
Trace[1401285353]: [3.351867763s] [3.347675533s] END
I1215 22:46:41.067207 28937 trace.go:116] Trace[1829012881]: "Get" url:/api/v1/namespaces/default,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf,client:127.0.0.1 (started: 2020-12-15 22:46:38.627750808 +0000 UTC m=+4212.664851321) (total time: 2.43922889s):
Trace[1829012881]: [2.425721222s] [2.425604003s] About to write a response
I1215 22:46:41.418948 28937 trace.go:116] Trace[1320012864]: "GuaranteedUpdate etcd3" type:*coordination.Lease (started: 2020-12-15 22:46:37.729888843 +0000 UTC m=+4211.766989411) (total time: 3.688684122s):
Trace[1320012864]: [3.688684122s] [3.68536157s] END
I1215 22:46:41.701868 28937 trace.go:116] Trace[850776731]: "List etcd3" key:/pods,resourceVersion:,limit:0,continue: (started: 2020-12-15 22:46:36.064168255 +0000 UTC m=+4210.101268768) (total time: 5.637501404s):
Trace[850776731]: [5.637501404s] [5.637501404s] END
I1215 22:46:41.735352 28937 trace.go:116] Trace[1236525513]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (started: 2020-12-15 22:46:37.664163445 +0000 UTC m=+4211.701263976) (total time: 4.070898516s):
Trace[1236525513]: [4.070701595s] [4.067042405s] Transaction committed
I1215 22:46:41.821225 28937 trace.go:116] Trace[259273813]: "GuaranteedUpdate etcd3" type:*coordination.Lease (started: 2020-12-15 22:46:37.264137407 +0000 UTC m=+4211.301237920) (total time: 4.556880902s):
Trace[259273813]: [304.957755ms] [304.693244ms] Transaction prepared
Trace[259273813]: [4.556569208s] [4.251611453s] Transaction committed
I1215 22:46:44.727003 28937 trace.go:116] Trace[497163353]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-15 22:46:39.887678081 +0000 UTC m=+4213.924778613) (total time: 4.839078762s):
Trace[497163353]: [293.211632ms] [293.074655ms] Transaction prepared
Trace[497163353]: [4.838910082s] [4.54569845s] Transaction committed
E1215 22:46:44.945545 28937 leaderelection.go:356] Failed to update lock: resource name may not be empty
I1215 22:46:45.016542 28937 trace.go:116] Trace[1609514022]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (started: 2020-12-15 22:46:40.24842112 +0000 UTC m=+4214.285521652) (total time: 4.767691527s):
Trace[1609514022]: [547.604764ms] [547.193516ms] Transaction prepared
Trace[1609514022]: [4.767459681s] [4.219854917s] Transaction committed
E1215 22:46:45.107452 28937 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
I1215 22:46:45.294639 28937 trace.go:116] Trace[1494008030]: "GuaranteedUpdate etcd3" type:*core.ConfigMap (started: 2020-12-15 22:46:39.919424496 +0000 UTC m=+4213.956525028) (total time: 5.374877449s):
Trace[1494008030]: [5.374357612s] [5.369947795s] Transaction committed
I1215 22:46:45.301265 28937 trace.go:116] Trace[902924431]: "Get" url:/apis/apps/v1/namespaces/flux-system/replicasets/source-controller-f8fdb486,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf/system:serviceaccount:kube-system:replicaset-controller,client:127.0.0.1 (started: 2020-12-15 22:46:38.985781817 +0000 UTC m=+4213.022882274) (total time: 6.315069611s):
Trace[902924431]: [6.314190784s] [6.314155062s] About to write a response
I1215 22:46:47.108312 28937 trace.go:116] Trace[1561342929]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-1,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf,client:127.0.0.1 (started: 2020-12-15 22:46:36.898393326 +0000 UTC m=+4210.935493802) (total time: 10.209653714s):
Trace[1561342929]: [10.209653714s] [10.208249532s] END
I1215 22:46:47.122337 28937 trace.go:116] Trace[324341722]: "List etcd3" key:/minions,resourceVersion:,limit:0,continue: (started: 2020-12-15 22:46:42.330160539 +0000 UTC m=+4216.367261052) (total time: 4.791889517s):
Trace[324341722]: [4.791889517s] [4.791889517s] END
I1215 22:46:47.123968 28937 trace.go:116] Trace[1595965141]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-15 22:46:39.261320827 +0000 UTC m=+4213.298421340) (total time: 7.862508086s):
Trace[1595965141]: [7.862508086s] [7.85862568s] END
I now have 3 Raspberry Pis:
Pi 1:
Pi 2:
Pi 3:
I moved the SSD to a separate Raspberry Pi that PostgreSQL now runs on. The k3s server and agent now run exclusively on microSD cards, but all pod persistent storage happens over NFS shared from the first Pi. I'm not sure how to test IOPS on k3os.
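One option might be a quick fio run against the filesystem in question, similar to the test that appears later in this thread (the target path here is just an example; on k3os, fio may be easiest to run from a container):

```sh
# 4k random read/write test; point --filename at the disk you want to measure.
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
    --filename=/var/lib/rancher/fio-test.fio --bs=4k --iodepth=64 --size=4G \
    --readwrite=randrw --rwmixread=75
```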
I have automatic updates enabled, but the system keeps boot looping before the updates can occur.
Seeing this too on an Ubuntu Focal VM on Windows Hyper-V 2019 servers, using v1.20.0+k3s2.
I have one DB VM with 2 GB RAM running MariaDB (~30% usage) and one master node with 4 GB RAM. Both are backed by SSD.
Jan 13 12:35:19 docker-m02 k3s[1459387]: /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:975 +0x19b
Jan 13 12:35:19 docker-m02 k3s[1459387]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.(*loggingT).output(0x701b3e0, 0xc000000003, 0x0, 0x0, 0xc00029d0a0, 0x6ca6996, 0x9, 0xcd, 0x0)
Jan 13 12:35:19 docker-m02 k3s[1459387]: /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:1026 +0xb9
Jan 13 12:35:19 docker-m02 k3s[1459387]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.stacks(0xc000010001, 0xc00c35e280, 0x41, 0x123)
Jan 13 12:35:19 docker-m02 k3s[1459387]: goroutine 3602 [running]:
Jan 13 12:35:19 docker-m02 k3s[1459387]: F0113 12:35:19.502155 1459387 server.go:205] leaderelection lost
Jan 13 12:35:19 docker-m02 k3s[1459387]: I0113 12:35:19.502130 1459387 leaderelection.go:278] failed to renew lease kube-system/kube-scheduler: timed out waiting for the condition
Jan 13 12:35:19 docker-m02 k3s[1459387]: E0113 12:35:19.502006 1459387 leaderelection.go:361] Failed to update lock: Put "https://127.0.0.1:6444/apis/coordination.k8s.io/v1/namespaces/kube-system/>
Jan 13 12:35:19 docker-m02 k3s[1459387]: Trace[1873575128]: [9.994547965s] [9.994547965s] END
Jan 13 12:35:19 docker-m02 k3s[1459387]: I0113 12:35:19.501182 1459387 trace.go:205] Trace[1873575128]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler,user->
Jan 13 12:35:19 docker-m02 k3s[1459387]: time="2021-01-13T12:35:19.501085416+01:00" level=error msg="error in txn: context canceled"
Jan 13 12:35:19 docker-m02 k3s[1459387]: E0113 12:35:19.500948 1459387 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
Jan 13 12:35:19 docker-m02 k3s[1459387]: Trace[1419650026]: [9.994151447s] [9.994151447s] END
Jan 13 12:35:19 docker-m02 k3s[1459387]: I0113 12:35:19.500885 1459387 trace.go:205] Trace[1419650026]: "GuaranteedUpdate etcd3" type:*coordination.Lease (13-Jan-2021 12:35:09.506) (total time: 99>
Jan 13 12:35:19 docker-m02 k3s[1459387]: E0113 12:35:19.258422 1459387 controller.go:187] failed to update lease, error: Put "https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-nod>
Jan 13 12:35:19 docker-m02 k3s[1459387]: Trace[588737688]: [9.998288231s] [9.998288231s] END
Jan 13 12:35:19 docker-m02 k3s[1459387]: I0113 12:35:19.257152 1459387 trace.go:205] Trace[588737688]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/docker-m02,user-a>
Jan 13 12:35:19 docker-m02 k3s[1459387]: time="2021-01-13T12:35:19.256921672+01:00" level=error msg="error in txn: context canceled"
Jan 13 12:35:19 docker-m02 k3s[1459387]: E0113 12:35:19.256782 1459387 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
Jan 13 12:35:19 docker-m02 k3s[1459387]: Trace[412436448]: [9.9975974s] [9.9975974s] END
Jan 13 12:35:19 docker-m02 k3s[1459387]: I0113 12:35:19.256689 1459387 trace.go:205] Trace[412436448]: "GuaranteedUpdate etcd3" type:*coordination.Lease (13-Jan-2021 12:35:09.259) (total time: 999>
Jan 13 12:35:18 docker-m02 k3s[1459387]: time="2021-01-13T12:35:18.676181581+01:00" level=error msg="error in txn: context canceled"
Jan 13 12:35:18 docker-m02 k3s[1459387]: Trace[586832888]: [9.992906492s] [9.992906492s] END
Jan 13 12:35:18 docker-m02 k3s[1459387]: I0113 12:35:18.676090 1459387 trace.go:205] Trace[586832888]: "Update" url:/api/v1/namespaces/kube-system/endpoints/rancher.io-local-path,user-agent:local->
Jan 13 12:35:18 docker-m02 k3s[1459387]: E0113 12:35:18.675719 1459387 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
Seeing a constant write rate on the db node of ~4 MB/s.
Warnings from mariadb server:
@viceice your datastore operation times are too slow, far slower than I would expect to see from SSD:
Trace[1873575128]: [9.994547965s] [9.994547965s] END
@brandond yes, that's very strange. There are 6 SATA3 SSDs with ~450 MB/s read/write capacity. On synthetic writes I can see ~2 GB/s write IO.
I think I need to run some IOPS tests.
BTW: I have some more VMs running on that server, including two more k3s single-master nodes. One uses embedded etcd and one test cluster uses sqlite.
Sqlite shows the best performance, but isn't clusterable.
Are there any comparisons between MySQL and PostgreSQL? Which performs better for 3 k3s clusters with around 4 master and 8 worker nodes?
I'll try running the DB on a PCIe SSD next week.
Very strange, far too low IOPS for a 6x 1TB SSD storage space: 🤔
# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.16
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=41.8MiB/s,w=13.8MiB/s][r=10.7k,w=3537 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=30295: Fri Jan 15 11:24:54 2021
read: IOPS=9562, BW=37.4MiB/s (39.2MB/s)(3070MiB/82184msec)
bw ( KiB/s): min= 552, max=62944, per=100.00%, avg=38740.52, stdev=12204.15, samples=162
iops : min= 138, max=15736, avg=9685.11, stdev=3051.03, samples=162
write: IOPS=3195, BW=12.5MiB/s (13.1MB/s)(1026MiB/82184msec); 0 zone resets
bw ( KiB/s): min= 112, max=20816, per=100.00%, avg=12946.37, stdev=4068.42, samples=162
iops : min= 28, max= 5204, avg=3236.58, stdev=1017.10, samples=162
cpu : usr=3.00%, sys=15.94%, ctx=346505, majf=0, minf=9
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=37.4MiB/s (39.2MB/s), 37.4MiB/s-37.4MiB/s (39.2MB/s-39.2MB/s), io=3070MiB (3219MB), run=82184-82184msec
WRITE: bw=12.5MiB/s (13.1MB/s), 12.5MiB/s-12.5MiB/s (13.1MB/s-13.1MB/s), io=1026MiB (1076MB), run=82184-82184msec
Disk stats (read/write):
sda: ios=785584/262620, merge=31/117, ticks=2129386/3081022, in_queue=3775352, util=99.96%
Here is a VM on a single PCIe SSD, which is more like what I expect.
# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.16
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=272MiB/s,w=90.6MiB/s][r=69.5k,w=23.2k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1799555: Fri Jan 15 11:23:06 2021
read: IOPS=68.6k, BW=268MiB/s (281MB/s)(3070MiB/11450msec)
bw ( KiB/s): min=246584, max=281480, per=99.95%, avg=274421.82, stdev=8581.23, samples=22
iops : min=61646, max=70370, avg=68605.45, stdev=2145.31, samples=22
write: IOPS=22.9k, BW=89.6MiB/s (93.0MB/s)(1026MiB/11450msec); 0 zone resets
bw ( KiB/s): min=84144, max=94408, per=100.00%, avg=91797.82, stdev=2451.21, samples=22
iops : min=21036, max=23602, avg=22949.45, stdev=612.80, samples=22
cpu : usr=9.76%, sys=43.08%, ctx=114862, majf=0, minf=10
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=268MiB/s (281MB/s), 268MiB/s-268MiB/s (281MB/s-281MB/s), io=3070MiB (3219MB), run=11450-11450msec
WRITE: bw=89.6MiB/s (93.0MB/s), 89.6MiB/s-89.6MiB/s (93.0MB/s-93.0MB/s), io=1026MiB (1076MB), run=11450-11450msec
Disk stats (read/write):
sda: ios=784233/262068, merge=47/0, ticks=634898/48918, in_queue=2044, util=99.24%
This looks really bad; this run was made on the physical host.
------------------------------------------------------------------------------
CrystalDiskMark 8.0.0 x64 (C) 2007-2020 hiyohiyo
Crystal Dew World: https://crystalmark.info/
------------------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes
[Read]
SEQ 1MiB (Q= 8, T= 1): 1059.225 MB/s [ 1010.2 IOPS] < 7871.99 us>
SEQ 1MiB (Q= 1, T= 1): 320.652 MB/s [ 305.8 IOPS] < 3266.88 us>
RND 4KiB (Q= 32, T= 1): 116.168 MB/s [ 28361.3 IOPS] < 1125.77 us>
RND 4KiB (Q= 1, T= 1): 7.341 MB/s [ 1792.2 IOPS] < 555.41 us>
[Write]
SEQ 1MiB (Q= 8, T= 1): 389.238 MB/s [ 371.2 IOPS] < 21290.93 us>
SEQ 1MiB (Q= 1, T= 1): 54.114 MB/s [ 51.6 IOPS] < 18817.60 us>
RND 4KiB (Q= 32, T= 1): 19.561 MB/s [ 4775.6 IOPS] < 6686.48 us>
RND 4KiB (Q= 1, T= 1): 1.273 MB/s [ 310.8 IOPS] < 3211.28 us>
Profile: Default
Test: 1 GiB (x5) [E: 86% (2195/2560GiB)]
Mode: [Admin]
Time: Measure 5 sec / Interval 5 sec
Date: 2021/01/15 11:46:23
OS: Windows Server 2019 Datacenter (Full installation) [10.0 Build 17763] (x64)
Looks like the problem is disks that are too full. 4 of the 6 1TB SSDs have only 80GB free space. 🙈
I've ordered two more SSDs and am moving some VMs to get more free space.
BTW: another server with 8 HDDs:
------------------------------------------------------------------------------
CrystalDiskMark 8.0.0 x64 (C) 2007-2020 hiyohiyo
Crystal Dew World: https://crystalmark.info/
------------------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes
[Read]
SEQ 1MiB (Q= 8, T= 1): 89.761 MB/s [ 85.6 IOPS] < 92752.58 us>
SEQ 1MiB (Q= 1, T= 1): 36.073 MB/s [ 34.4 IOPS] < 28983.60 us>
RND 4KiB (Q= 32, T= 1): 2.555 MB/s [ 623.8 IOPS] < 50029.35 us>
RND 4KiB (Q= 1, T= 1): 0.506 MB/s [ 123.5 IOPS] < 8020.01 us>
[Write]
SEQ 1MiB (Q= 8, T= 1): 30.824 MB/s [ 29.4 IOPS] <256822.78 us>
SEQ 1MiB (Q= 1, T= 1): 16.147 MB/s [ 15.4 IOPS] < 64199.93 us>
RND 4KiB (Q= 32, T= 1): 0.593 MB/s [ 144.8 IOPS] <217217.98 us>
RND 4KiB (Q= 1, T= 1): 0.352 MB/s [ 85.9 IOPS] < 11552.70 us>
Profile: Default
Test: 1 GiB (x5) [D: 66% (1694/2560GiB)]
Mode: [Admin]
Time: Measure 5 sec / Interval 5 sec
Date: 2021/01/15 12:39:15
OS: Windows Server 2019 Datacenter (Full installation) [10.0 Build 17763] (x64)
Environmental Info:
K3s Version:
k3s version v1.18.9+k3s1 (630bebf9)
k3os version v0.11.1 <-- this occurred in 10.0 as well
Node(s) CPU architecture, OS, and Version:
Linux k3s-1 4.19.97-v8+ #1294 SMP PREEMPT Thu Jan 30 13:27:08 GMT 2020 aarch64 GNU/Linux
2 - Raspberry PI 4's running on Class 10 UHS-1 SD Cards
Cluster Configuration:
1 master, 1 agent
Describe the bug:
After a new install of k3os/k3s, etcd keeps crashing every few minutes.
Steps To Reproduce:
server --cluster-init --disable=traefik,servicelb
agent
Expected behavior:
I would expect Kubernetes to run stably.
Actual behavior:
Kubernetes crashes every couple of minutes. Pods that rely on leader election also seem to crash.
Additional context / logs:
Logs
```
Trace[148579825]: [661.953235ms] [661.523057ms] Object stored in database
I1202 20:45:01.040074 8976 trace.go:116] Trace[1934252170]: "GuaranteedUpdate etcd3" type:*coordination.Lease (started: 2020-12-02 20:44:59.468270221 +0000 UTC m=+81.601702519) (total time: 1.571678379s):
Trace[1934252170]: [1.571534493s] [1.569092018s] Transaction committed
I1202 20:45:01.040962 8976 trace.go:116] Trace[158077054]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf/leader-election,client:127.0.0.1 (started: 2020-12-02 20:44:59.46773849 +0000 UTC m=+81.601170788) (total time: 1.573054522s):
Trace[158077054]: [1.572577086s] [1.572250481s] Object stored in database
I1202 20:45:04.703977 8976 trace.go:116] Trace[1309380047]: "List etcd3" key:/events,resourceVersion:,limit:1000,continue: (started: 2020-12-02 20:45:03.793147584 +0000 UTC m=+85.926579882) (total time: 910.53336ms):
Trace[1309380047]: [910.53336ms] [910.53336ms] END
I1202 20:45:04.968218 8976 trace.go:116] Trace[560134109]: "List" url:/api/v1/events,user-agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Lens/3.6.7 Chrome/83.0.4103.122 Electron/9.1.2 Safari/537.36,client:127.0.0.1 (started: 2020-12-02 20:45:03.792894978 +0000 UTC m=+85.926327275) (total time: 1.174766967s):
Trace[560134109]: [912.909671ms] [912.674175ms] Listing from storage done
Trace[560134109]: [1.174751301s] [261.84163ms] Writing http response done count:517
E1202 20:45:10.124876 8976 controller.go:178] failed to update node lease, error: Put https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-1?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
time="2020-12-02T20:45:10.130410420Z" level=error msg="error in txn: context canceled"
I1202 20:45:10.125073 8976 trace.go:116] Trace[1690869822]: "GuaranteedUpdate etcd3" type:*coordination.Lease (started: 2020-12-02 20:45:00.129753967 +0000 UTC m=+82.263186265) (total time: 9.995187714s):
Trace[1690869822]: [9.995187714s] [9.992894569s] END
I1202 20:45:10.131045 8976 trace.go:116] Trace[2116949679]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-1,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf,client:127.0.0.1 (started: 2020-12-02 20:45:00.129205625 +0000 UTC m=+82.262637923) (total time: 10.001699452s):
Trace[2116949679]: [10.001699452s] [10.001346384s] END
I1202 20:45:11.194877 8976 trace.go:116] Trace[1638256539]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-02 20:45:01.181796158 +0000 UTC m=+83.315228456) (total time: 10.01293533s):
Trace[1638256539]: [9.162288138s] [9.160022085s] Transaction committed
Trace[1638256539]: [10.01293533s] [846.865017ms] END
E1202 20:45:13.063493 8976 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
I1202 20:45:13.065564 8976 trace.go:116] Trace[1118593115]: "Update" url:/api/v1/namespaces/kube-system/endpoints/rancher.io-local-path,user-agent:local-path-provisioner/v0.0.0 (linux/arm64) kubernetes/$Format,client:10.42.0.5 (started: 2020-12-02 20:45:01.181165632 +0000 UTC m=+83.314597930) (total time: 11.884231395s):
Trace[1118593115]: [11.884231395s] [11.883804717s] END
E1202 20:45:11.696999 8976 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/cloud-controller-manager: context deadline exceeded
I1202 20:45:11.697467 8976 trace.go:116] Trace[463620426]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-02 20:45:01.74087159 +0000 UTC m=+83.874303888) (total time: 9.956464373s):
Trace[463620426]: [9.039221883s] [9.036212381s] Transaction committed
Trace[463620426]: [9.956464373s] [915.207378ms] END
E1202 20:45:11.953410 8976 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: context deadline exceeded
I1202 20:45:11.954804 8976 trace.go:116] Trace[22746264]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-02 20:45:01.99803945 +0000 UTC m=+84.131471747) (total time: 9.956183886s):
Trace[22746264]: [9.003243307s] [9.00081496s] Transaction committed
Trace[22746264]: [9.956183886s] [950.672304ms] END
I1202 20:45:12.000414 8976 trace.go:116] Trace[683943658]: "Update" url:/api/v1/namespaces/kube-system/endpoints/kube-controller-manager,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf/leader-election,client:127.0.0.1 (started: 2020-12-02 20:45:01.997261574 +0000 UTC m=+84.130693872) (total time: 10.002829094s):
Trace[683943658]: [10.002829094s] [10.002248863s] END
E1202 20:45:13.067344 8976 leaderelection.go:356] Failed to update lock: Put https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: context deadline exceeded
I1202 20:45:13.067959 8976 trace.go:116] Trace[924601370]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-12-02 20:45:03.431252866 +0000 UTC m=+85.564685256) (total time: 9.636551546s):
Trace[924601370]: [7.795962966s] [7.783669052s] Transaction committed
Trace[924601370]: [9.636551546s] [1.838588614s] END
I1202 20:45:13.078654 8976 leaderelection.go:277] failed to renew lease kube-system/cloud-controller-manager: timed out waiting for the condition
E1202 20:45:13.082806 8976 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
I1202 20:45:13.083306 8976 leaderelection.go:277] failed to renew lease kube-system/kube-controller-manager: timed out waiting for the condition
I1202 20:45:13.088083 8976 event.go:278] Event(v1.ObjectReference{Kind:"Lease", Namespace:"kube-system", Name:"kube-controller-manager", UID:"2b095a6b-cf7c-4767-bbb2-ca5d69073af6", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"6053", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k3s-1_4f23b3d9-b58d-4c79-84bd-854a7d7c6afa stopped leading
I1202 20:45:13.088647 8976 leaderelection.go:277] failed to renew lease kube-system/kube-scheduler: timed out waiting for the condition
I1202 20:45:13.091349 8976 event.go:278] Event(v1.ObjectReference{Kind:"Lease", Namespace:"kube-system", Name:"cloud-controller-manager", UID:"d10eee0b-8e31-4b7d-92ed-0035622e0c74", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"6052", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k3s-1_b78da5f0-164b-4354-b477-803bd79ff913 stopped leading
E1202 20:45:13.094074 8976 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
F1202 20:45:13.104741 8976 controllermanager.go:213] leaderelection lost
I1202 20:45:13.108478 8976 trace.go:116] Trace[955622096]: "Update" url:/api/v1/namespaces/kube-system/endpoints/cloud-controller-manager,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf/leader-election,client:127.0.0.1 (started: 2020-12-02 20:45:01.740110714 +0000 UTC m=+83.873542975) (total time: 11.368219885s):
Trace[955622096]: [11.368219885s] [11.36765371s] END
F1202 20:45:13.109628 8976 controllermanager.go:279] leaderelection lost
I1202 20:45:13.113429 8976 endpointslice_controller.go:229] Shutting down endpoint slice controller
I1202 20:45:13.114256 8976 event.go:278] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"", Name:"", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k3s-1_4f23b3d9-b58d-4c79-84bd-854a7d7c6afa stopped leading
F1202 20:45:13.114557 8976 server.go:244] leaderelection lost
I1202 20:45:13.114653 8976 event.go:278] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"", Name:"", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' k3s-1_b78da5f0-164b-4354-b477-803bd79ff913 stopped leading
I1202 20:45:13.115929 8976 trace.go:116] Trace[1502416122]: "Update" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:k3s/v1.18.9+k3s1 (linux/arm64) kubernetes/630bebf/leader-election,client:127.0.0.1 (started: 2020-12-02 20:45:03.429242772 +0000 UTC m=+85.562675310) (total time: 9.686487495s):
Trace[1502416122]: [9.686487495s] [9.684921448s] END
I1202 20:45:13.930719 8976 ttl_controller.go:130] Shutting down TTL controller
I1202 20:45:13.930902 8976 deployment_controller.go:165] Shutting down deployment controller
I1202 20:45:13.930962 8976 pvc_protection_controller.go:113] Shutting down PVC protection controller
I1202 20:45:13.930994 8976 daemon_controller.go:300] Shutting down daemon sets controller
I1202 20:45:13.931013 8976 expand_controller.go:331] Shutting down expand controller
I1202 20:45:13.931113 8976 pv_controller_base.go:311] Shutting down persistent volume controller
I1202 20:45:13.931177 8976 replica_set.go:194] Shutting down replicaset controller
I1202 20:45:13.931187 8976 gc_controller.go:100] Shutting down GC controller
I1202 20:45:13.931333 8976 node_lifecycle_controller.go:593] Shutting down node controller
I1202 20:45:13.931394 8976 job_controller.go:157] Shutting down job controller
I1202 20:45:13.931514 8976 clusterroleaggregation_controller.go:161] Shutting
```