kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0
3.58k stars 1.32k forks source link

kcp-controller stuck during worker cluster upgrade #8106

Closed huaqing1994 closed 1 year ago

huaqing1994 commented 1 year ago

What steps did you take and what happened: I've found a occasionally issue in our CI environment(3 times a month): We use CAPI to create and upgrade a worker cluster(3 master + 3 worker). I've found that the upgrade occasionally gets stuck after the first new control plane node is created successfully. After the problem occurs, kcp-controller no longer prints any logs about this cluster. The goroutine stack trace of kcp-controller shows that a certain reconcile has been blocked in ETCD MoveLeader().

goroutine 362 [select, 1723 minutes]:
google.golang.org/grpc/internal/transport.(*Stream).waitOnHeader(0xc000ec0240)
    google.golang.org/grpc@v1.47.0/internal/transport/transport.go:324 +0x7c
google.golang.org/grpc/internal/transport.(*Stream).RecvCompress(...)
    google.golang.org/grpc@v1.47.0/internal/transport/transport.go:339
google.golang.org/grpc.(*csAttempt).recvMsg(0xc00074a630, {0x1a27520?, 0xc001cb8c00}, 0x0?)
    google.golang.org/grpc@v1.47.0/stream.go:980 +0xbb
google.golang.org/grpc.(*clientStream).RecvMsg.func1(0x0?)
    google.golang.org/grpc@v1.47.0/stream.go:845 +0x25
google.golang.org/grpc.(*clientStream).withRetry(0xc000ec0000, 0xc000acc300, 0xc000acc2d0)
    google.golang.org/grpc@v1.47.0/stream.go:709 +0xd3
google.golang.org/grpc.(*clientStream).RecvMsg(0xc000ec0000, {0x1a27520?, 0xc001cb8c00?})
    google.golang.org/grpc@v1.47.0/stream.go:844 +0x11f
google.golang.org/grpc.invoke({0x1db6f80?, 0xc001cb8c60?}, {0x1b32a45?, 0xc000ccb740?}, {0x1a273e0, 0xc001cb8bd0}, {0x1a27520, 0xc001cb8c00}, 0xc00125c000?, {0xc0011c6f80, ...})
    google.golang.org/grpc@v1.47.0/call.go:73 +0xd7
go.etcd.io/etcd/client/v3.(*Client).unaryClientInterceptor.func1({0x1db6f80?, 0xc0005368d0?}, {0x1b32a45, 0x24}, {0x1a273e0, 0xc001cb8bd0}, {0x1a27520, 0xc001cb8c00}, 0xc0005b8380, 0x1bfacb0, ...)
    go.etcd.io/etcd/client/v3@v3.5.4/retry_interceptor.go:58 +0x445
google.golang.org/grpc.(*ClientConn).Invoke(0x60?, {0x1db6f80?, 0xc0005368d0?}, {0x1b32a45?, 0x6?}, {0x1a273e0?, 0xc001cb8bd0?}, {0x1a27520?, 0xc001cb8c00?}, {0xc0008b96e0, ...})
    google.golang.org/grpc@v1.47.0/call.go:35 +0x223
go.etcd.io/etcd/api/v3/etcdserverpb.(*maintenanceClient).MoveLeader(0xc000710298, {0x1db6f80, 0xc0005368d0}, 0xc001cb8bd0?, {0xc0008b96e0, 0x4, 0x6})
    go.etcd.io/etcd/api/v3@v3.5.4/etcdserverpb/rpc.pb.go:7391 +0xc9
go.etcd.io/etcd/client/v3.(*retryMaintenanceClient).MoveLeader(0xc0006d84b0, {0x1db6f80, 0xc0005368d0}, 0x32?, {0x2ba7760, 0x3, 0x3})
    go.etcd.io/etcd/client/v3@v3.5.4/retry.go:218 +0x127
go.etcd.io/etcd/client/v3.(*maintenance).MoveLeader(0xc0002dc4c0, {0x1db6f80, 0xc0005368d0}, 0x7dec8de804d9d3c1)
    go.etcd.io/etcd/client/v3@v3.5.4/maintenance.go:252 +0x6f
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/etcd.(*Client).MoveLeader(0xc000da0980?, {0x1db6f80?, 0xc0005368d0?}, 0xc000f99900?)
    sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/etcd/etcd.go:217 +0x37
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal.(*Workload).ForwardEtcdLeadership(0xc000ffe5d0, {0x1db6f80, 0xc0005368d0}, 0xc000481500, 0xc000481bd8)
    sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/workload_cluster_etcd.go:207 +0x591
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers.(*KubeadmControlPlaneReconciler).scaleDownControlPlane(0xc00018b600, {0x1db6f80, 0xc0005368d0}, 0xc001110000, 0xc0014bec00, 0xc000c5a6e0, 0x0?)
    sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers/scale.go:123 +0x268
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers.(*KubeadmControlPlaneReconciler).upgradeControlPlane(0xc00018b600, {0x1db6f80, 0xc0005368d0}, 0x2?, 0xc0014bec00, 0xc000c5a6e0, 0xc0014bec00?)
    sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers/upgrade.go:132 +0xbbc
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers.(*KubeadmControlPlaneReconciler).reconcile(0xc00018b600, {0x1db6f80, 0xc0005368d0}, 0xc001110000, 0xc0014bec00)
    sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers/controller.go:370 +0x1e85
sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers.(*KubeadmControlPlaneReconciler).Reconcile(0xc00018b600, {0x1db6f80, 0xc0005368d0}, {{{0xc00149bd10?, 0x10?}, {0xc00149bda0?, 0x40d927?}}})
    sigs.k8s.io/cluster-api/controlplane/kubeadm/internal/controllers/controller.go:216 +0x770
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1db6ed8?, {0x1db6f80?, 0xc0005368d0?}, {{{0xc00149bd10?, 0x1a22b60?}, {0xc00149bda0?, 0x4041f4?}}})
    sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:121 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0003ef220, {0x1db6ed8, 0xc000225b40}, {0x192e420?, 0xc0011ac1e0?})
    sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:320 +0x33c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0003ef220, {0x1db6ed8, 0xc000225b40})
    sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:273 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:234 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
    sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:230 +0x325

I found a related PR #7841 which fixes kcp-controller getting stuck. But I want to know the reason and mechanism of this problem, has anyone encountered it?

Environment:

/kind bug

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
jessehu commented 1 year ago

Hi @jelmersnoeck do you have any insight on the root cause for PR :seedling: Add configurable etcd call timeout by jelmersnoeck · Pull Request #7841 · kubernetes-sigs/cluster-api

jelmersnoeck commented 1 year ago

@jessehu we haven't root caused this yet, but saw the exact same behavior.

chrischdi commented 1 year ago

For more insight and for being able to take a further look into this issue it would be super helpful to have more information.

Some first ideas where we could take a look at is:

huaqing1994 commented 1 year ago

thx @chrischdi !

Here's more information about an issue:

KCP yaml

```yaml apiVersion: controlplane.cluster.x-k8s.io/v1beta1 kind: KubeadmControlPlane metadata: creationTimestamp: "2023-02-01T22:11:00Z" finalizers: - kubeadm.controlplane.cluster.x-k8s.io generation: 2 labels: cluster.x-k8s.io/cluster-name: sks-e2e-test-owk3ls-k8s-upgrade managedFields: - apiVersion: controlplane.cluster.x-k8s.io/v1beta1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:finalizers: .: {} v:"kubeadm.controlplane.cluster.x-k8s.io": {} f:labels: .: {} f:cluster.x-k8s.io/cluster-name: {} f:ownerReferences: .: {} k:{"uid":"697e015e-acfc-41ef-b29d-b87397eb0928"}: {} manager: manager operation: Update time: "2023-02-01T22:11:29Z" - apiVersion: controlplane.cluster.x-k8s.io/v1beta1 fieldsType: FieldsV1 fieldsV1: f:spec: .: {} f:kubeadmConfigSpec: .: {} f:clusterConfiguration: .: {} f:apiServer: .: {} f:extraArgs: .: {} f:admission-control-config-file: {} f:audit-log-maxage: {} f:audit-log-maxbackup: {} f:audit-log-maxsize: {} f:audit-log-path: {} f:audit-policy-file: {} f:enable-admission-plugins: {} f:profiling: {} f:request-timeout: {} f:tls-cipher-suites: {} f:extraVolumes: {} f:clusterName: {} f:controllerManager: .: {} f:extraArgs: .: {} f:allocate-node-cidrs: {} f:bind-address: {} f:profiling: {} f:terminated-pod-gc-threshold: {} f:dns: .: {} f:imageRepository: {} f:etcd: {} f:imageRepository: {} f:networking: {} f:scheduler: .: {} f:extraArgs: .: {} f:bind-address: {} f:profiling: {} f:files: {} f:initConfiguration: .: {} f:localAPIEndpoint: {} f:nodeRegistration: .: {} f:kubeletExtraArgs: .: {} f:event-qps: {} f:pod-infra-container-image: {} f:protect-kernel-defaults: {} f:tls-cipher-suites: {} f:name: {} f:joinConfiguration: .: {} f:discovery: {} f:nodeRegistration: .: {} f:kubeletExtraArgs: .: {} f:event-qps: {} f:pod-infra-container-image: {} f:protect-kernel-defaults: {} f:tls-cipher-suites: {} f:name: {} f:preKubeadmCommands: {} f:useExperimentalRetryJoin: {} f:users: {} f:machineTemplate: .: {} f:infrastructureRef: {} f:nodeDrainTimeout: {} f:replicas: {} f:rolloutStrategy: .: {} f:rollingUpdate: .: {} f:maxSurge: {} f:type: {} f:version: {} manager: sks-manager operation: Update time: "2023-02-01T22:22:40Z" - apiVersion: controlplane.cluster.x-k8s.io/v1beta1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:conditions: {} f:initialized: {} f:observedGeneration: {} f:ready: {} f:readyReplicas: {} f:replicas: {} f:selector: {} f:unavailableReplicas: {} f:updatedReplicas: {} f:version: {} manager: manager operation: Update subresource: status time: "2023-02-01T22:25:33Z" name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane namespace: sks-e2e-huhui-test-template-v1-22-191-vflqvz ownerReferences: - apiVersion: cluster.x-k8s.io/v1beta1 blockOwnerDeletion: true controller: true kind: Cluster name: sks-e2e-test-owk3ls-k8s-upgrade uid: 697e015e-acfc-41ef-b29d-b87397eb0928 resourceVersion: "6649" uid: 07081de2-ab3b-4aa0-b2d9-99f248d74810 spec: kubeadmConfigSpec: clusterConfiguration: apiServer: extraArgs: admission-control-config-file: /etc/kubernetes/admission.yaml audit-log-maxage: "30" audit-log-maxbackup: "10" audit-log-maxsize: "100" audit-log-path: /var/log/apiserver/audit.log audit-policy-file: /etc/kubernetes/auditpolicy.yaml enable-admission-plugins: AlwaysPullImages,EventRateLimit profiling: "false" request-timeout: 300s tls-cipher-suites: TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384 extraVolumes: - hostPath: /var/log/apiserver mountPath: /var/log/apiserver name: apiserver-log pathType: DirectoryOrCreate - hostPath: /etc/kubernetes/admission.yaml mountPath: /etc/kubernetes/admission.yaml name: admission-config pathType: FileOrCreate readOnly: true - hostPath: /etc/kubernetes/auditpolicy.yaml mountPath: /etc/kubernetes/auditpolicy.yaml name: audit-policy pathType: FileOrCreate readOnly: true clusterName: sks-e2e-test-owk3ls-k8s-upgrade controllerManager: extraArgs: allocate-node-cidrs: "false" bind-address: 0.0.0.0 profiling: "false" terminated-pod-gc-threshold: "10" dns: imageRepository: registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191/coredns etcd: {} imageRepository: registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191 networking: {} scheduler: extraArgs: bind-address: 0.0.0.0 profiling: "false" files: - content: | apiVersion: v1 kind: Pod metadata: creationTimestamp: null name: kube-vip namespace: kube-system spec: containers: - args: - manager env: - name: cp_enable value: "true" - name: vip_interface value: ens4 - name: address value: 10.255.24.5 - name: port value: "6443" - name: vip_arp value: "true" - name: vip_leaderelection value: "true" - name: vip_leaseduration value: "15" - name: vip_renewdeadline value: "10" - name: vip_retryperiod value: "2" image: registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191/kube-vip/kube-vip:v0.5.7 imagePullPolicy: IfNotPresent name: kube-vip resources: {} securityContext: capabilities: add: - NET_ADMIN - NET_RAW volumeMounts: - mountPath: /etc/kubernetes/admin.conf name: kubeconfig hostAliases: - hostnames: - kubernetes ip: 127.0.0.1 hostNetwork: true volumes: - hostPath: path: /etc/kubernetes/admin.conf type: FileOrCreate name: kubeconfig status: {} owner: root:root path: /etc/kubernetes/manifests/kube-vip.yaml - content: | apiVersion: apiserver.config.k8s.io/v1 kind: AdmissionConfiguration plugins: - name: EventRateLimit configuration: apiVersion: eventratelimit.admission.k8s.io/v1alpha1 kind: Configuration limits: - type: Server burst: 20000 qps: 5000 owner: root:root path: /etc/kubernetes/admission.yaml - content: | apiVersion: audit.k8s.io/v1 kind: Policy rules: - level: None userGroups: - system:nodes - level: None users: - system:kube-scheduler - system:volume-scheduler - system:kube-controller-manager - level: None nonResourceURLs: - /healthz* - /version - /swagger* - level: Metadata resources: - resources: ["secrets", "configmaps", "tokenreviews"] - level: Metadata omitStages: - RequestReceived resources: - resources: ["pods", "deployments"] owner: root:root path: /etc/kubernetes/auditpolicy.yaml format: cloud-config initConfiguration: localAPIEndpoint: {} nodeRegistration: kubeletExtraArgs: event-qps: "0" pod-infra-container-image: registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191/pause:3.6 protect-kernel-defaults: "true" tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256 name: '{{ ds.meta_data.hostname }}' joinConfiguration: discovery: {} nodeRegistration: kubeletExtraArgs: event-qps: "0" pod-infra-container-image: registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191/pause:3.6 protect-kernel-defaults: "true" tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256 name: '{{ ds.meta_data.hostname }}' preKubeadmCommands: - if [ -d /etc/sysconfig/network-scripts/ ]; then - echo DEFROUTE=no | tee -a /etc/sysconfig/network-scripts/ifcfg-* > /dev/null - echo PEERDNS=no | tee -a /etc/sysconfig/network-scripts/ifcfg-* > /dev/null - sed -i '/DEFROUTE\|PEERDNS/d' /etc/sysconfig/network-scripts/ifcfg-ens4 - nmcli connection reload && nmcli networking off && nmcli networking on - fi - '[ -f "/run/kubeadm/kubeadm.yaml" ] && printf "\n---\napiVersion: kubeproxy.config.k8s.io/v1alpha1\nkind: KubeProxyConfiguration\nmetricsBindAddress: 0.0.0.0:10249\n" >> /run/kubeadm/kubeadm.yaml' - hostname "{{ ds.meta_data.hostname }}" - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts - echo "127.0.0.1 localhost" >>/etc/hosts - echo "127.0.0.1 {{ ds.meta_data.hostname }}" >>/etc/hosts - echo "{{ ds.meta_data.hostname }}" >/etc/hostname - sed -i 's|".*/pause.*"|"registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191/pause:3.6"|' /etc/containerd/config.toml - systemctl restart containerd - while ! systemctl status containerd; do sleep 1; done useExperimentalRetryJoin: true users: - lockPassword: false name: smartx passwd: $6$fbz1r5cI$ISJj4xg/DSGLlsIHYWFyuhrSkAtX76AH7RTRTs7h/WFbhVazjq51zNQGsuWYu1JsF1tfYCLCbo9xD3Nr8UY5w0 sudo: ALL=(ALL) NOPASSWD:ALL machineTemplate: infrastructureRef: apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 kind: ElfMachineTemplate name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane-v1-23-14-c7w63y namespace: sks-e2e-huhui-test-template-v1-22-191-vflqvz metadata: {} nodeDrainTimeout: 5m0s replicas: 3 rolloutStrategy: rollingUpdate: maxSurge: 1 type: RollingUpdate version: v1.23.14 status: conditions: - lastTransitionTime: "2023-02-01T22:22:46Z" message: Rolling 3 replicas with outdated spec (1 replicas up to date) reason: RollingUpdateInProgress severity: Warning status: "False" type: Ready - lastTransitionTime: "2023-02-01T22:12:59Z" status: "True" type: Available - lastTransitionTime: "2023-02-01T22:11:31Z" status: "True" type: CertificatesAvailable - lastTransitionTime: "2023-02-01T22:25:33Z" status: "True" type: ControlPlaneComponentsHealthy - lastTransitionTime: "2023-02-01T22:25:33Z" status: "True" type: EtcdClusterHealthy - lastTransitionTime: "2023-02-01T22:17:30Z" status: "True" type: MachinesCreated - lastTransitionTime: "2023-02-01T22:25:09Z" status: "True" type: MachinesReady - lastTransitionTime: "2023-02-01T22:22:46Z" message: Rolling 3 replicas with outdated spec (1 replicas up to date) reason: RollingUpdateInProgress severity: Warning status: "False" type: MachinesSpecUpToDate - lastTransitionTime: "2023-02-01T22:22:43Z" message: Scaling down control plane to 3 replicas (actual 4) reason: ScalingDown severity: Warning status: "False" type: Resized initialized: true observedGeneration: 2 ready: true readyReplicas: 3 replicas: 4 selector: cluster.x-k8s.io/cluster-name=sks-e2e-test-owk3ls-k8s-upgrade,cluster.x-k8s.io/control-plane unavailableReplicas: 1 updatedReplicas: 1 version: v1.22.16 ```

4 Machines(3 low version + 1 high version)

low version Machine1: ```yaml apiVersion: cluster.x-k8s.io/v1beta1 kind: Machine metadata: annotations: controlplane.cluster.x-k8s.io/kubeadm-cluster-configuration: '{"etcd":{},"networking":{},"apiServer":{"extraArgs":{"admission-control-config-file":"/etc/kubernetes/admission.yaml","audit-log-maxage":"30","audit-log-maxbackup":"10","audit-log-maxsize":"100","audit-log-path":"/var/log/apiserver/audit.log","audit-policy-file":"/etc/kubernetes/auditpolicy.yaml","enable-admission-plugins":"AlwaysPullImages,EventRateLimit","profiling":"false","request-timeout":"300s","tls-cipher-suites":"TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_CHACHA20_POLY1305_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384"},"extraVolumes":[{"name":"apiserver-log","hostPath":"/var/log/apiserver","mountPath":"/var/log/apiserver","pathType":"DirectoryOrCreate"},{"name":"admission-config","hostPath":"/etc/kubernetes/admission.yaml","mountPath":"/etc/kubernetes/admission.yaml","readOnly":true,"pathType":"FileOrCreate"},{"name":"audit-policy","hostPath":"/etc/kubernetes/auditpolicy.yaml","mountPath":"/etc/kubernetes/auditpolicy.yaml","readOnly":true,"pathType":"FileOrCreate"}]},"controllerManager":{"extraArgs":{"allocate-node-cidrs":"false","bind-address":"0.0.0.0","profiling":"false","terminated-pod-gc-threshold":"10"}},"scheduler":{"extraArgs":{"bind-address":"0.0.0.0","profiling":"false"}},"dns":{"imageRepository":"registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191/coredns"},"imageRepository":"registry.smtx.io/kubesmart-e2e-test/huhui.test-template-v1.22/build191","clusterName":"sks-e2e-test-owk3ls-k8s-upgrade"}' creationTimestamp: "2023-02-01T22:11:32Z" finalizers: - machine.cluster.x-k8s.io generation: 3 labels: cluster.x-k8s.io/cluster-name: sks-e2e-test-owk3ls-k8s-upgrade cluster.x-k8s.io/control-plane: "" cluster.x-k8s.io/control-plane-name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane managedFields: - apiVersion: cluster.x-k8s.io/v1beta1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:annotations: .: {} f:controlplane.cluster.x-k8s.io/kubeadm-cluster-configuration: {} f:finalizers: .: {} v:"machine.cluster.x-k8s.io": {} f:labels: .: {} f:cluster.x-k8s.io/cluster-name: {} f:cluster.x-k8s.io/control-plane: {} f:cluster.x-k8s.io/control-plane-name: {} f:ownerReferences: .: {} k:{"uid":"07081de2-ab3b-4aa0-b2d9-99f248d74810"}: {} f:spec: .: {} f:bootstrap: .: {} f:configRef: {} f:dataSecretName: {} f:clusterName: {} f:infrastructureRef: {} f:nodeDrainTimeout: {} f:providerID: {} f:version: {} manager: manager operation: Update time: "2023-02-01T22:13:56Z" - apiVersion: cluster.x-k8s.io/v1beta1 fieldsType: FieldsV1 fieldsV1: f:status: .: {} f:addresses: {} f:bootstrapReady: {} f:conditions: {} f:infrastructureReady: {} f:lastUpdated: {} f:nodeInfo: .: {} f:architecture: {} f:bootID: {} f:containerRuntimeVersion: {} f:kernelVersion: {} f:kubeProxyVersion: {} f:kubeletVersion: {} f:machineID: {} f:operatingSystem: {} f:osImage: {} f:systemUUID: {} f:nodeRef: {} f:observedGeneration: {} f:phase: {} manager: manager operation: Update subresource: status time: "2023-02-01T22:29:05Z" name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane-9cg5m namespace: sks-e2e-huhui-test-template-v1-22-191-vflqvz ownerReferences: - apiVersion: controlplane.cluster.x-k8s.io/v1beta1 blockOwnerDeletion: true controller: true kind: KubeadmControlPlane name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane uid: 07081de2-ab3b-4aa0-b2d9-99f248d74810 resourceVersion: "7816" uid: b7c17f00-18cd-42e6-910b-cae9a4983f5d spec: bootstrap: configRef: apiVersion: bootstrap.cluster.x-k8s.io/v1beta1 kind: KubeadmConfig name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane-hx9ps namespace: sks-e2e-huhui-test-template-v1-22-191-vflqvz uid: 973f9b62-6e3b-45fb-9fab-93c18bbdd4a6 dataSecretName: sks-e2e-test-owk3ls-k8s-upgrade-controlplane-hx9ps clusterName: sks-e2e-test-owk3ls-k8s-upgrade infrastructureRef: apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 kind: ElfMachine name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane-qtrrf namespace: sks-e2e-huhui-test-template-v1-22-191-vflqvz uid: 054600bd-e693-4017-ae17-620280cf3abb nodeDeletionTimeout: 10s nodeDrainTimeout: 5m0s providerID: elf://6b4dfb57-e3fb-4d67-bf5c-c5d87d92eb50 version: v1.22.16 status: addresses: - address: 10.255.1.149 type: InternalIP - address: 240.255.0.1 type: InternalIP - address: 10.244.7.146 type: InternalIP - address: 100.64.254.254 type: InternalIP - address: 10.255.35.0 type: InternalIP - address: 100.64.254.254 type: InternalIP - address: 10.255.35.1 type: InternalIP - address: 100.64.254.254 type: InternalIP - address: 10.255.35.9 type: InternalIP - address: 100.64.254.254 type: InternalIP - address: 10.255.35.34 type: InternalIP bootstrapReady: true conditions: - lastTransitionTime: "2023-02-01T22:13:56Z" status: "True" type: Ready - lastTransitionTime: "2023-02-01T22:17:13Z" status: "True" type: APIServerPodHealthy - lastTransitionTime: "2023-02-01T22:11:32Z" status: "True" type: BootstrapReady - lastTransitionTime: "2023-02-01T22:13:57Z" status: "True" type: ControllerManagerPodHealthy - lastTransitionTime: "2023-02-01T22:13:57Z" status: "True" type: EtcdMemberHealthy - lastTransitionTime: "2023-02-01T22:13:57Z" status: "True" type: EtcdPodHealthy - lastTransitionTime: "2023-02-01T22:13:56Z" status: "True" type: InfrastructureReady - lastTransitionTime: "2023-02-01T22:13:56Z" status: "True" type: NodeHealthy - lastTransitionTime: "2023-02-01T22:13:57Z" status: "True" type: SchedulerPodHealthy infrastructureReady: true lastUpdated: "2023-02-01T22:13:56Z" nodeInfo: architecture: amd64 bootID: db973ed3-7a84-480c-a62f-4f0f4d3893e6 containerRuntimeVersion: containerd://1.6.4 kernelVersion: 4.18.0-372.9.1.el8.x86_64 kubeProxyVersion: v1.22.16 kubeletVersion: v1.22.16 machineID: 89e1bbf094cb41519341c3040fbef924 operatingSystem: linux osImage: Rocky Linux 8.6 (Green Obsidian) systemUUID: 89e1bbf0-94cb-4151-9341-c3040fbef924 nodeRef: apiVersion: v1 kind: Node name: sks-e2e-test-owk3ls-k8s-upgrade-controlplane-qtrrf uid: c9e33a7c-e9e3-4fee-9569-70906836fe6c observedGeneration: 3 phase: Running ``` low version Machine2: ```yaml ``` low version Machine3: ```yaml ``` high version Machine: ```yaml ```

etcd member

``` +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | https://10.255.1.149:2379 | 224b96420b3e1b24 | 3.5.0 | 35 MB | true | false | 3 | 260058 | 260058 | | | https://10.255.2.106:2379 | 7dec8de804d9d3c1 | 3.5.5 | 34 MB | false | false | 3 | 260058 | 260058 | | | https://10.255.1.177:2379 | 9aee0fa65b787fc9 | 3.5.0 | 34 MB | false | false | 3 | 260058 | 260058 | | | https://10.255.2.18:2379 | af37f0689bf2cfc4 | 3.5.0 | 34 MB | false | false | 3 | 260058 | 260058 | | +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ ```

the etcd leader log

``` {"level":"info","ts":"2023-02-01T22:25:47.374Z","caller":"traceutil/trace.go:171","msg":"trace[408559093] transaction","detail":"{read_only:false; response_revision:7618; number_of_response:1; }","duration":"119.081979ms","start":"2023-02-01T22:25:47.255Z","end":"2023-02-01T22:25:47.374Z","steps":["trace[408559093] 'process raft request' (duration: 84.44308ms)","trace[408559093] 'compare' (duration: 34.029498ms)"],"step_count":2} {"level":"info","ts":"2023-02-01T22:25:48.876Z","caller":"etcdserver/server.go:1398","msg":"leadership transfer starting","local-member-id":"224b96420b3e1b24","current-leader-member-id":"224b96420b3e1b24","transferee-member-id":"7dec8de804d9d3c1"} {"level":"info","ts":"2023-02-01T22:25:48.876Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"224b96420b3e1b24 [term 3] starts to transfer leadership to 7dec8de804d9d3c1"} {"level":"warn","ts":"2023-02-01T22:25:49.141Z","caller":"etcdserver/server.go:1130","msg":"failed to revoke lease","lease-id":"53c1860f15b1881b","error":"raft proposal dropped"} {"level":"warn","ts":"2023-02-01T22:25:49.141Z","caller":"etcdserver/server.go:1130","msg":"failed to revoke lease","lease-id":"7fc9860f0e0e45da","error":"raft proposal dropped"} {"level":"warn","ts":"2023-02-01T22:25:50.278Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:25:54.281Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:25:54.711Z","caller":"traceutil/trace.go:171","msg":"trace[2126905403] transaction","detail":"{read_only:false; number_of_response:0; response_revision:7680; }","duration":"103.675532ms","start":"2023-02-01T22:25:54.608Z","end":"2023-02-01T22:25:54.711Z","steps":["trace[2126905403] 'process raft request' (duration: 103.470103ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:25:58.286Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:25:58.531Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"126.824875ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/pods/kube-system/ecp-agent-s8mdq\" ","response":"range_response_count:1 size:6963"} {"level":"info","ts":"2023-02-01T22:25:58.531Z","caller":"traceutil/trace.go:171","msg":"trace[689488468] range","detail":"{range_begin:/registry/pods/kube-system/ecp-agent-s8mdq; range_end:; response_count:1; response_revision:7718; }","duration":"126.952099ms","start":"2023-02-01T22:25:58.404Z","end":"2023-02-01T22:25:58.531Z","steps":["trace[689488468] 'agreement among raft nodes before linearized reading' (duration: 126.759114ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:02.290Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:06.294Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:26:06.809Z","caller":"traceutil/trace.go:171","msg":"trace[1814538697] transaction","detail":"{read_only:false; response_revision:7790; number_of_response:1; }","duration":"147.964398ms","start":"2023-02-01T22:26:06.661Z","end":"2023-02-01T22:26:06.809Z","steps":["trace[1814538697] 'process raft request' (duration: 53.737143ms)","trace[1814538697] 'compare' (duration: 94.068255ms)"],"step_count":2} {"level":"info","ts":"2023-02-01T22:26:08.635Z","caller":"traceutil/trace.go:171","msg":"trace[1507755063] transaction","detail":"{read_only:false; response_revision:7796; number_of_response:1; }","duration":"153.940219ms","start":"2023-02-01T22:26:08.481Z","end":"2023-02-01T22:26:08.635Z","steps":["trace[1507755063] 'process raft request' (duration: 153.7693ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:10.298Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:14.306Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:18.312Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:26:20.410Z","caller":"traceutil/trace.go:171","msg":"trace[1090351540] linearizableReadLoop","detail":"{readStateIndex:9043; appliedIndex:9043; }","duration":"100.675717ms","start":"2023-02-01T22:26:20.309Z","end":"2023-02-01T22:26:20.410Z","steps":["trace[1090351540] 'read index received' (duration: 100.670472ms)","trace[1090351540] 'applied index is now lower than readState.Index' (duration: 3.33µs)"],"step_count":2} {"level":"info","ts":"2023-02-01T22:26:20.410Z","caller":"traceutil/trace.go:171","msg":"trace[1938834446] transaction","detail":"{read_only:false; response_revision:7883; number_of_response:1; }","duration":"104.937153ms","start":"2023-02-01T22:26:20.305Z","end":"2023-02-01T22:26:20.410Z","steps":["trace[1938834446] 'process raft request' (duration: 104.72741ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:20.410Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"101.021426ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/health\" ","response":"range_response_count:0 size:5"} {"level":"info","ts":"2023-02-01T22:26:20.411Z","caller":"traceutil/trace.go:171","msg":"trace[627460088] range","detail":"{range_begin:/registry/health; range_end:; response_count:0; response_revision:7883; }","duration":"101.361459ms","start":"2023-02-01T22:26:20.309Z","end":"2023-02-01T22:26:20.411Z","steps":["trace[627460088] 'agreement among raft nodes before linearized reading' (duration: 100.920503ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:20.651Z","caller":"traceutil/trace.go:171","msg":"trace[1878359730] transaction","detail":"{read_only:false; response_revision:7885; number_of_response:1; }","duration":"118.677494ms","start":"2023-02-01T22:26:20.532Z","end":"2023-02-01T22:26:20.651Z","steps":["trace[1878359730] 'process raft request' (duration: 118.340079ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:20.910Z","caller":"traceutil/trace.go:171","msg":"trace[727615520] transaction","detail":"{read_only:false; response_revision:7887; number_of_response:1; }","duration":"246.831763ms","start":"2023-02-01T22:26:20.663Z","end":"2023-02-01T22:26:20.910Z","steps":["trace[727615520] 'process raft request' (duration: 246.608132ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.010Z","caller":"traceutil/trace.go:171","msg":"trace[1365621629] transaction","detail":"{read_only:false; response_revision:7888; number_of_response:1; }","duration":"224.62174ms","start":"2023-02-01T22:26:20.786Z","end":"2023-02-01T22:26:21.010Z","steps":["trace[1365621629] 'process raft request' (duration: 224.409399ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.264Z","caller":"traceutil/trace.go:171","msg":"trace[927596763] linearizableReadLoop","detail":"{readStateIndex:9050; appliedIndex:9050; }","duration":"112.974198ms","start":"2023-02-01T22:26:21.151Z","end":"2023-02-01T22:26:21.264Z","steps":["trace[927596763] 'read index received' (duration: 112.966154ms)","trace[927596763] 'applied index is now lower than readState.Index' (duration: 6.806µs)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:21.264Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"113.210732ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/configmaps/sks-system-logging/92b05146.kubesmart.smtx.io\" ","response":"range_response_count:1 size:626"} {"level":"info","ts":"2023-02-01T22:26:21.264Z","caller":"traceutil/trace.go:171","msg":"trace[321391868] range","detail":"{range_begin:/registry/configmaps/sks-system-logging/92b05146.kubesmart.smtx.io; range_end:; response_count:1; response_revision:7889; }","duration":"113.319894ms","start":"2023-02-01T22:26:21.151Z","end":"2023-02-01T22:26:21.264Z","steps":["trace[321391868] 'agreement among raft nodes before linearized reading' (duration: 113.135439ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.264Z","caller":"traceutil/trace.go:171","msg":"trace[1170037460] transaction","detail":"{read_only:false; response_revision:7890; number_of_response:1; }","duration":"136.68946ms","start":"2023-02-01T22:26:21.127Z","end":"2023-02-01T22:26:21.264Z","steps":["trace[1170037460] 'process raft request' (duration: 136.488833ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.420Z","caller":"traceutil/trace.go:171","msg":"trace[903375563] transaction","detail":"{read_only:false; response_revision:7891; number_of_response:1; }","duration":"149.366209ms","start":"2023-02-01T22:26:21.271Z","end":"2023-02-01T22:26:21.420Z","steps":["trace[903375563] 'process raft request' (duration: 108.144398ms)","trace[903375563] 'compare' (duration: 41.082086ms)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:21.614Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"177.908165ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/kube-system/snapshot-controller-leader\" ","response":"range_response_count:1 size:500"} {"level":"info","ts":"2023-02-01T22:26:21.615Z","caller":"traceutil/trace.go:171","msg":"trace[414573502] range","detail":"{range_begin:/registry/leases/kube-system/snapshot-controller-leader; range_end:; response_count:1; response_revision:7891; }","duration":"178.036965ms","start":"2023-02-01T22:26:21.436Z","end":"2023-02-01T22:26:21.615Z","steps":["trace[414573502] 'agreement among raft nodes before linearized reading' (duration: 95.287081ms)","trace[414573502] 'range keys from in-memory index tree' (duration: 82.539481ms)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:21.615Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"120.832593ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/sks-system/external-resizer-com-smartx-zbs-csi-driver\" ","response":"range_response_count:1 size:531"} {"level":"info","ts":"2023-02-01T22:26:21.615Z","caller":"traceutil/trace.go:171","msg":"trace[534010467] range","detail":"{range_begin:/registry/leases/sks-system/external-resizer-com-smartx-zbs-csi-driver; range_end:; response_count:1; response_revision:7893; }","duration":"120.897173ms","start":"2023-02-01T22:26:21.494Z","end":"2023-02-01T22:26:21.615Z","steps":["trace[534010467] 'agreement among raft nodes before linearized reading' (duration: 120.783144ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.615Z","caller":"traceutil/trace.go:171","msg":"trace[947492628] transaction","detail":"{read_only:false; response_revision:7892; number_of_response:1; }","duration":"342.031603ms","start":"2023-02-01T22:26:21.273Z","end":"2023-02-01T22:26:21.615Z","steps":["trace[947492628] 'process raft request' (duration: 259.023409ms)","trace[947492628] 'compare' (duration: 82.41945ms)"],"step_count":2} {"level":"info","ts":"2023-02-01T22:26:21.615Z","caller":"traceutil/trace.go:171","msg":"trace[565397956] transaction","detail":"{read_only:false; response_revision:7893; number_of_response:1; }","duration":"341.157023ms","start":"2023-02-01T22:26:21.274Z","end":"2023-02-01T22:26:21.615Z","steps":["trace[565397956] 'process raft request' (duration: 340.737783ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:21.615Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2023-02-01T22:26:21.274Z","time spent":"341.225415ms","remote":"127.0.0.1:40136","response type":"/etcdserverpb.KV/Txn","request count":1,"request size":587,"response count":0,"response size":39,"request content":"compare: success:> failure: >"} {"level":"warn","ts":"2023-02-01T22:26:21.615Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2023-02-01T22:26:21.273Z","time spent":"342.107584ms","remote":"127.0.0.1:40136","response type":"/etcdserverpb.KV/Txn","request count":1,"request size":610,"response count":0,"response size":39,"request content":"compare: success:> failure: >"} {"level":"info","ts":"2023-02-01T22:26:21.844Z","caller":"traceutil/trace.go:171","msg":"trace[1970129712] linearizableReadLoop","detail":"{readStateIndex:9054; appliedIndex:9055; }","duration":"226.909152ms","start":"2023-02-01T22:26:21.617Z","end":"2023-02-01T22:26:21.844Z","steps":["trace[1970129712] 'read index received' (duration: 226.903778ms)","trace[1970129712] 'applied index is now lower than readState.Index' (duration: 4.294µs)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:21.844Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"227.274646ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/sks-system/external-attacher-leader-com-smartx-zbs-csi-driver\" ","response":"range_response_count:1 size:548"} {"level":"info","ts":"2023-02-01T22:26:21.844Z","caller":"traceutil/trace.go:171","msg":"trace[1279604414] range","detail":"{range_begin:/registry/leases/sks-system/external-attacher-leader-com-smartx-zbs-csi-driver; range_end:; response_count:1; response_revision:7894; }","duration":"227.336487ms","start":"2023-02-01T22:26:21.617Z","end":"2023-02-01T22:26:21.844Z","steps":["trace[1279604414] 'agreement among raft nodes before linearized reading' (duration: 227.210926ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.845Z","caller":"traceutil/trace.go:171","msg":"trace[1011214221] transaction","detail":"{read_only:false; response_revision:7895; number_of_response:1; }","duration":"223.331238ms","start":"2023-02-01T22:26:21.621Z","end":"2023-02-01T22:26:21.845Z","steps":["trace[1011214221] 'process raft request' (duration: 223.187149ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:21.895Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"275.861172ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/sks-system/csi-controllers\" ","response":"range_response_count:1 size:510"} {"level":"info","ts":"2023-02-01T22:26:21.895Z","caller":"traceutil/trace.go:171","msg":"trace[1033785113] range","detail":"{range_begin:/registry/leases/sks-system/csi-controllers; range_end:; response_count:1; response_revision:7895; }","duration":"275.984353ms","start":"2023-02-01T22:26:21.619Z","end":"2023-02-01T22:26:21.895Z","steps":["trace[1033785113] 'agreement among raft nodes before linearized reading' (duration: 275.779195ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:21.895Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"275.180234ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/sks-system-logging/92b05146.kubesmart.smtx.io\" ","response":"range_response_count:1 size:534"} {"level":"info","ts":"2023-02-01T22:26:21.895Z","caller":"traceutil/trace.go:171","msg":"trace[337184822] range","detail":"{range_begin:/registry/leases/sks-system-logging/92b05146.kubesmart.smtx.io; range_end:; response_count:1; response_revision:7895; }","duration":"275.276972ms","start":"2023-02-01T22:26:21.620Z","end":"2023-02-01T22:26:21.895Z","steps":["trace[337184822] 'agreement among raft nodes before linearized reading' (duration: 275.12428ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:21.895Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"192.229573ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/sks-system/external-resizer-com-smartx-zbs-csi-driver\" ","response":"range_response_count:1 size:531"} {"level":"info","ts":"2023-02-01T22:26:21.895Z","caller":"traceutil/trace.go:171","msg":"trace[1262968141] range","detail":"{range_begin:/registry/leases/sks-system/external-resizer-com-smartx-zbs-csi-driver; range_end:; response_count:1; response_revision:7895; }","duration":"192.411586ms","start":"2023-02-01T22:26:21.703Z","end":"2023-02-01T22:26:21.895Z","steps":["trace[1262968141] 'agreement among raft nodes before linearized reading' (duration: 192.129604ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:21.895Z","caller":"traceutil/trace.go:171","msg":"trace[776024769] transaction","detail":"{read_only:false; response_revision:7896; number_of_response:1; }","duration":"273.251206ms","start":"2023-02-01T22:26:21.622Z","end":"2023-02-01T22:26:21.895Z","steps":["trace[776024769] 'process raft request' (duration: 272.852779ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:22.048Z","caller":"traceutil/trace.go:171","msg":"trace[1668955994] transaction","detail":"{read_only:false; response_revision:7897; number_of_response:1; }","duration":"147.535936ms","start":"2023-02-01T22:26:21.900Z","end":"2023-02-01T22:26:22.048Z","steps":["trace[1668955994] 'process raft request' (duration: 147.354775ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:22.141Z","caller":"traceutil/trace.go:171","msg":"trace[1718906765] transaction","detail":"{read_only:false; response_revision:7898; number_of_response:1; }","duration":"240.430111ms","start":"2023-02-01T22:26:21.900Z","end":"2023-02-01T22:26:22.141Z","steps":["trace[1718906765] 'process raft request' (duration: 240.295128ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:22.147Z","caller":"traceutil/trace.go:171","msg":"trace[840798881] linearizableReadLoop","detail":"{readStateIndex:9058; appliedIndex:9060; }","duration":"180.270946ms","start":"2023-02-01T22:26:21.967Z","end":"2023-02-01T22:26:22.147Z","steps":["trace[840798881] 'read index received' (duration: 180.259419ms)","trace[840798881] 'applied index is now lower than readState.Index' (duration: 9.506µs)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:22.148Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"181.357948ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/sks-system-contour/sks-system-contour\" ","response":"range_response_count:1 size:520"} {"level":"info","ts":"2023-02-01T22:26:22.148Z","caller":"traceutil/trace.go:171","msg":"trace[301255454] range","detail":"{range_begin:/registry/leases/sks-system-contour/sks-system-contour; range_end:; response_count:1; response_revision:7898; }","duration":"181.465855ms","start":"2023-02-01T22:26:21.967Z","end":"2023-02-01T22:26:22.148Z","steps":["trace[301255454] 'agreement among raft nodes before linearized reading' (duration: 180.464223ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:22.185Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"181.013619ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/configmaps/sks-system/csi-controllers\" ","response":"range_response_count:1 size:603"} {"level":"warn","ts":"2023-02-01T22:26:22.185Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"131.061774ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/health\" ","response":"range_response_count:0 size:5"} {"level":"info","ts":"2023-02-01T22:26:22.185Z","caller":"traceutil/trace.go:171","msg":"trace[414470229] range","detail":"{range_begin:/registry/configmaps/sks-system/csi-controllers; range_end:; response_count:1; response_revision:7898; }","duration":"181.128354ms","start":"2023-02-01T22:26:22.004Z","end":"2023-02-01T22:26:22.185Z","steps":["trace[414470229] 'agreement among raft nodes before linearized reading' (duration: 180.939788ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:22.185Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"142.928662ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/projectcontour.io/httpproxies/\" range_end:\"/registry/projectcontour.io/httpproxies0\" count_only:true ","response":"range_response_count:0 size:5"} {"level":"info","ts":"2023-02-01T22:26:22.185Z","caller":"traceutil/trace.go:171","msg":"trace[1793247804] range","detail":"{range_begin:/registry/projectcontour.io/httpproxies/; range_end:/registry/projectcontour.io/httpproxies0; response_count:0; response_revision:7898; }","duration":"143.026351ms","start":"2023-02-01T22:26:22.042Z","end":"2023-02-01T22:26:22.185Z","steps":["trace[1793247804] 'agreement among raft nodes before linearized reading' (duration: 142.874858ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:22.185Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"162.060034ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/leases/kube-system/kube-controller-manager\" ","response":"range_response_count:1 size:551"} {"level":"info","ts":"2023-02-01T22:26:22.185Z","caller":"traceutil/trace.go:171","msg":"trace[1383091566] range","detail":"{range_begin:/registry/leases/kube-system/kube-controller-manager; range_end:; response_count:1; response_revision:7899; }","duration":"162.11236ms","start":"2023-02-01T22:26:22.023Z","end":"2023-02-01T22:26:22.185Z","steps":["trace[1383091566] 'agreement among raft nodes before linearized reading' (duration: 162.005199ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:22.185Z","caller":"traceutil/trace.go:171","msg":"trace[1949443283] range","detail":"{range_begin:/registry/health; range_end:; response_count:0; response_revision:7898; }","duration":"131.121967ms","start":"2023-02-01T22:26:22.054Z","end":"2023-02-01T22:26:22.185Z","steps":["trace[1949443283] 'agreement among raft nodes before linearized reading' (duration: 131.009147ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:22.315Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:26.319Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:26:29.029Z","caller":"traceutil/trace.go:171","msg":"trace[195245980] transaction","detail":"{read_only:false; response_revision:7954; number_of_response:1; }","duration":"102.285147ms","start":"2023-02-01T22:26:28.926Z","end":"2023-02-01T22:26:29.029Z","steps":["trace[195245980] 'process raft request' (duration: 102.146273ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:30.323Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:26:31.584Z","caller":"traceutil/trace.go:171","msg":"trace[1551620095] transaction","detail":"{read_only:false; response_revision:7972; number_of_response:1; }","duration":"113.033152ms","start":"2023-02-01T22:26:31.471Z","end":"2023-02-01T22:26:31.584Z","steps":["trace[1551620095] 'process raft request' (duration: 112.865248ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:34.110Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"109.378483ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/configmaps/sks-system/csi-controllers\" ","response":"range_response_count:1 size:603"} {"level":"info","ts":"2023-02-01T22:26:34.110Z","caller":"traceutil/trace.go:171","msg":"trace[2104292027] range","detail":"{range_begin:/registry/configmaps/sks-system/csi-controllers; range_end:; response_count:1; response_revision:7995; }","duration":"109.536362ms","start":"2023-02-01T22:26:34.001Z","end":"2023-02-01T22:26:34.110Z","steps":["trace[2104292027] 'agreement among raft nodes before linearized reading' (duration: 34.843565ms)","trace[2104292027] 'range keys from in-memory index tree' (duration: 74.457913ms)"],"step_count":2} {"level":"info","ts":"2023-02-01T22:26:34.111Z","caller":"traceutil/trace.go:171","msg":"trace[1409590447] transaction","detail":"{read_only:false; response_revision:7996; number_of_response:1; }","duration":"137.632674ms","start":"2023-02-01T22:26:33.973Z","end":"2023-02-01T22:26:34.111Z","steps":["trace[1409590447] 'process raft request' (duration: 62.736776ms)","trace[1409590447] 'compare' (duration: 74.244636ms)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:34.328Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:38.334Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:42.338Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:26:43.306Z","caller":"traceutil/trace.go:171","msg":"trace[563087535] transaction","detail":"{read_only:false; response_revision:8068; number_of_response:1; }","duration":"104.336903ms","start":"2023-02-01T22:26:43.202Z","end":"2023-02-01T22:26:43.306Z","steps":["trace[563087535] 'process raft request' (duration: 104.282121ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:43.345Z","caller":"traceutil/trace.go:171","msg":"trace[6227443] linearizableReadLoop","detail":"{readStateIndex:9253; appliedIndex:9256; }","duration":"131.91291ms","start":"2023-02-01T22:26:43.213Z","end":"2023-02-01T22:26:43.345Z","steps":["trace[6227443] 'read index received' (duration: 131.905746ms)","trace[6227443] 'applied index is now lower than readState.Index' (duration: 5.928µs)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:26:43.345Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"132.339044ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/kappctrl.k14s.io/apps/sks-system/metallb\" ","response":"range_response_count:1 size:5344"} {"level":"info","ts":"2023-02-01T22:26:43.345Z","caller":"traceutil/trace.go:171","msg":"trace[425867138] range","detail":"{range_begin:/registry/kappctrl.k14s.io/apps/sks-system/metallb; range_end:; response_count:1; response_revision:8068; }","duration":"132.439841ms","start":"2023-02-01T22:26:43.213Z","end":"2023-02-01T22:26:43.345Z","steps":["trace[425867138] 'agreement among raft nodes before linearized reading' (duration: 132.274997ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:43.346Z","caller":"traceutil/trace.go:171","msg":"trace[1122914730] transaction","detail":"{read_only:false; number_of_response:1; response_revision:8068; }","duration":"132.415722ms","start":"2023-02-01T22:26:43.213Z","end":"2023-02-01T22:26:43.346Z","steps":["trace[1122914730] 'process raft request' (duration: 132.241928ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:43.530Z","caller":"traceutil/trace.go:171","msg":"trace[1099853804] transaction","detail":"{read_only:false; response_revision:8070; number_of_response:1; }","duration":"171.185064ms","start":"2023-02-01T22:26:43.359Z","end":"2023-02-01T22:26:43.530Z","steps":["trace[1099853804] 'process raft request' (duration: 168.962713ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:26:43.544Z","caller":"traceutil/trace.go:171","msg":"trace[1751610053] transaction","detail":"{read_only:false; number_of_response:1; response_revision:8071; }","duration":"172.423114ms","start":"2023-02-01T22:26:43.372Z","end":"2023-02-01T22:26:43.544Z","steps":["trace[1751610053] 'process raft request' (duration: 172.350243ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:46.341Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:50.345Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:26:54.351Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:26:56.151Z","caller":"traceutil/trace.go:171","msg":"trace[1538674735] transaction","detail":"{read_only:false; response_revision:8163; number_of_response:1; }","duration":"103.988436ms","start":"2023-02-01T22:26:56.047Z","end":"2023-02-01T22:26:56.151Z","steps":["trace[1538674735] 'process raft request' (duration: 103.854629ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:26:58.354Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:02.358Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:06.361Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:10.365Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:14.368Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:18.372Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:22.378Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:26.386Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:30.390Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:34.394Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:38.400Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:27:39.387Z","caller":"mvcc/index.go:214","msg":"compact tree index","revision":6333} {"level":"info","ts":"2023-02-01T22:27:39.531Z","caller":"mvcc/kvstore_compaction.go:57","msg":"finished scheduled compaction","compact-revision":6333,"took":"141.425015ms"} {"level":"info","ts":"2023-02-01T22:27:40.785Z","caller":"traceutil/trace.go:171","msg":"trace[1051931597] transaction","detail":"{read_only:false; response_revision:8446; number_of_response:1; }","duration":"100.494929ms","start":"2023-02-01T22:27:40.684Z","end":"2023-02-01T22:27:40.785Z","steps":["trace[1051931597] 'process raft request' (duration: 100.346824ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:27:42.404Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:46.408Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:50.413Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:54.420Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:27:58.423Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:02.429Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:06.433Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:10.438Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:14.442Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:18.446Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:22.451Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:28:25.462Z","caller":"etcdserver/server.go:1368","msg":"triggering snapshot","local-member-id":"224b96420b3e1b24","local-member-applied-index":10001,"local-member-snapshot-index":0,"local-member-snapshot-count":10000} {"level":"info","ts":"2023-02-01T22:28:25.478Z","caller":"etcdserver/server.go:2363","msg":"saved snapshot","snapshot-index":10001} {"level":"info","ts":"2023-02-01T22:28:25.479Z","caller":"etcdserver/server.go:2393","msg":"compacted Raft logs","compact-index":5001} {"level":"warn","ts":"2023-02-01T22:28:26.460Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:28:30.146Z","caller":"traceutil/trace.go:171","msg":"trace[690390570] transaction","detail":"{read_only:false; response_revision:8743; number_of_response:1; }","duration":"107.843477ms","start":"2023-02-01T22:28:30.038Z","end":"2023-02-01T22:28:30.146Z","steps":["trace[690390570] 'process raft request' (duration: 107.686776ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:28:30.464Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:28:31.228Z","caller":"traceutil/trace.go:171","msg":"trace[1131114141] transaction","detail":"{read_only:false; response_revision:8746; number_of_response:1; }","duration":"189.164812ms","start":"2023-02-01T22:28:31.038Z","end":"2023-02-01T22:28:31.227Z","steps":["trace[1131114141] 'process raft request' (duration: 188.978461ms)"],"step_count":1} {"level":"info","ts":"2023-02-01T22:28:31.229Z","caller":"traceutil/trace.go:171","msg":"trace[1327734573] linearizableReadLoop","detail":"{readStateIndex:10033; appliedIndex:10034; }","duration":"174.484466ms","start":"2023-02-01T22:28:31.054Z","end":"2023-02-01T22:28:31.229Z","steps":["trace[1327734573] 'read index received' (duration: 174.478996ms)","trace[1327734573] 'applied index is now lower than readState.Index' (duration: 4.284µs)"],"step_count":2} {"level":"warn","ts":"2023-02-01T22:28:31.229Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"174.71098ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/health\" ","response":"range_response_count:0 size:5"} {"level":"info","ts":"2023-02-01T22:28:31.229Z","caller":"traceutil/trace.go:171","msg":"trace[1819157766] range","detail":"{range_begin:/registry/health; range_end:; response_count:0; response_revision:8746; }","duration":"174.861436ms","start":"2023-02-01T22:28:31.054Z","end":"2023-02-01T22:28:31.229Z","steps":["trace[1819157766] 'agreement among raft nodes before linearized reading' (duration: 174.66306ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:28:34.469Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:38.474Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:42.478Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:46.481Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:50.486Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:54.490Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:28:58.493Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:02.498Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:06.502Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:29:10.049Z","caller":"traceutil/trace.go:171","msg":"trace[1428716975] transaction","detail":"{read_only:false; response_revision:8981; number_of_response:1; }","duration":"121.694558ms","start":"2023-02-01T22:29:09.927Z","end":"2023-02-01T22:29:10.049Z","steps":["trace[1428716975] 'process raft request' (duration: 121.522937ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:29:10.505Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:14.511Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:18.514Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:22.519Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:26.523Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:30.527Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:34.532Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:38.537Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:42.540Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:46.546Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:50.550Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:54.553Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:29:58.558Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:02.566Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:06.610Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:10.617Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:14.624Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:18.628Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:22.637Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:26.641Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:30.648Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:34.652Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:38.656Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:42.662Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"info","ts":"2023-02-01T22:30:43.944Z","caller":"traceutil/trace.go:171","msg":"trace[51415947] transaction","detail":"{read_only:false; response_revision:9573; number_of_response:1; }","duration":"107.205899ms","start":"2023-02-01T22:30:43.836Z","end":"2023-02-01T22:30:43.944Z","steps":["trace[51415947] 'process raft request' (duration: 106.776182ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:30:46.669Z","caller":"etcdserver/cluster_util.go:186","msg":"leader found higher-versioned member","local-member-version":"3.5.0","remote-member-id":"7dec8de804d9d3c1","remote-member-version":"3.5.5"} {"level":"warn","ts":"2023-02-01T22:30:47.157Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"124.100931ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/validatingwebhookconfigurations/metallb-webhook-configuration\" ","response":"range_response_count:1 size:17427"} {"level":"info","ts":"2023-02-01T22:30:47.157Z","caller":"traceutil/trace.go:171","msg":"trace[694588298] range","detail":"{range_begin:/registry/validatingwebhookconfigurations/metallb-webhook-configuration; range_end:; response_count:1; response_revision:9600; }","duration":"124.318845ms","start":"2023-02-01T22:30:47.033Z","end":"2023-02-01T22:30:47.157Z","steps":["trace[694588298] 'agreement among raft nodes before linearized reading' (duration: 124.015173ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:30:47.158Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"103.334104ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/health\" ","response":"range_response_count:0 size:5"} {"level":"info","ts":"2023-02-01T22:30:47.158Z","caller":"traceutil/trace.go:171","msg":"trace[1554421155] range","detail":"{range_begin:/registry/health; range_end:; response_count:0; response_revision:9601; }","duration":"103.456476ms","start":"2023-02-01T22:30:47.054Z","end":"2023-02-01T22:30:47.158Z","steps":["trace[1554421155] 'agreement among raft nodes before linearized reading' (duration: 103.285625ms)"],"step_count":1} {"level":"warn","ts":"2023-02-01T22:30:47.158Z","caller":"etcdserver/util.go:166","msg":"apply request took too long","took":"120.132689ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/rolebindings/\" range_end:\"/registry/rolebindings0\" count_only:true ","response":"range_response_count:0 size:7"} {"level":"info","ts":"2023-02-01T22:30:47.158Z","caller":"traceutil/trace.go:171","msg":"trace[974329754] range","detail":"{range_begin:/registry/rolebindings/; range_end:/registry/rolebindings0; response_count:0; response_revision:9601; }","duration":"120.193985ms","start":"2023-02-01T22:30:47.037Z","end":"2023-02-01T22:30:47.158Z","steps":["trace[974329754] 'agreement among raft nodes before linearized reading' (duration: 120.082093ms)"],"step_count":1} ```

I see that the etcd leader has received the request of transfering leadership. But I don't see any relevant logs of executing about it. In fact, leadership has not been transfered. If the transfer fails this time, why does kcp-controller call MoveLeader() not return failure, but get stuck? I'm wondering about it.

chrischdi commented 1 year ago

For finding the real root-cause I think it would be required to go to the etcd level and try to debug it on that side (e.g. see if the leader could get moved manually or so, check if the disk speed is enough to run etcd or this could be a cause).

From CAPI perspective: I think #7841 could already improve on this particular issue. However, it would be interesting if it solves this issue by retrying the leader move (which would happen via #7841 if I understood it correctly, at least the KCP controller would surface a timeout error and retry).

jelmersnoeck commented 1 year ago

As an FYI, going back through some of these occurrences of this, we've seen etcd client calls block in multiple spots. The initial issue we ran into was on the etcd members call.

fabriziopandini commented 1 year ago

/triage needs-information @jelmersnoeck any update on this investigation?

jelmersnoeck commented 1 year ago

@fabriziopandini we didn't find anything pointing us to a concrete issue and haven't investigated this further. With the etcd client timeouts in place, this hasn't come up for us again.

I'm not sure if @huaqing1994 has found any further leads?

fabriziopandini commented 1 year ago

thanks for the feedback /close

we can eventually re-open or open a new issue if new evidence are collected

k8s-ci-robot commented 1 year ago

@fabriziopandini: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api/issues/8106#issuecomment-1491700209): >thanks for the feedback >/close > >we can eventually re-open or open a new issue if new evidence are collected Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.