Closed dmitry-mightydevops closed 1 year ago
/kind bug
What happened?
I have a node labeled etcd=true in us-west-2a, running etcd as a StatefulSet inside EKS 1.21. I have provisioned the AWS EBS CSI driver using the helm chart values listed below; however, I always get the PVC/PV in the us-west-2b availability zone:
AccessibleTopology:[segments:<key:"topology.ebs.csi.aws.com/zone" value:"us-west-2b" >
As a result, my etcd-0 pod is always in the Pending state.
➜ kdno -l etcd
Name:               ip-10-110-2-122.us-west-2.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=t3.medium
                    beta.kubernetes.io/os=linux
                    databases=true
                    efs=true
                    etcd=true
                    failure-domain.beta.kubernetes.io/region=us-west-2
                    failure-domain.beta.kubernetes.io/zone=us-west-2a
                    k8s.io/cloud-provider-aws=a5c5e390134d813e05190dc61b3f53b6
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-110-2-122.us-west-2.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=t3.medium
                    node.kubernetes.io/lifecycle=on-demand
                    topology.ebs.csi.aws.com/zone=us-west-2a
                    topology.kubernetes.io/region=us-west-2
                    topology.kubernetes.io/zone=us-west-2a
                    workload=databases
Annotations:        csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0b24bb6849f331fd7"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 02 Nov 2022 16:45:36 -0500
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-110-2-122.us-west-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Sun, 06 Nov 2022 16:54:56 -0600
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  ----            ------  -----------------                ------------------               ------                      -------
  MemoryPressure  False   Sun, 06 Nov 2022 16:50:34 -0600  Wed, 02 Nov 2022 16:45:36 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Sun, 06 Nov 2022 16:50:34 -0600  Wed, 02 Nov 2022 16:45:36 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  PIDPressure     False   Sun, 06 Nov 2022 16:50:34 -0600  Wed, 02 Nov 2022 16:45:36 -0500  KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready           True    Sun, 06 Nov 2022 16:50:34 -0600  Wed, 02 Nov 2022 16:46:16 -0500  KubeletReady                kubelet is posting ready status
Addresses:
  InternalIP:   10.110.2.122
  Hostname:     ip-10-110-2-122.us-west-2.compute.internal
  InternalDNS:  ip-10-110-2-122.us-west-2.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         2
  ephemeral-storage:           157274092Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      3965424Ki
  pods:                        17
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         1930m
  ephemeral-storage:           143870061124
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      3410416Ki
  pods:                        17
System Info:
  Machine ID:                 ec20c0dbb3d85aae2b130c3e87f7cd60
  System UUID:                ec20c0db-b3d8-5aae-2b13-0c3e87f7cd60
  Boot ID:                    0ea8d0b4-6483-41bc-aafe-51cf057a8ce3
  Kernel Version:             5.4.217-126.408.amzn2.x86_64
  OS Image:                   Amazon Linux 2
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.17
  Kubelet Version:            v1.21.14-eks-ba74326
  Kube-Proxy Version:         v1.21.14-eks-ba74326
ProviderID:                   aws:///us-west-2a/i-0b24bb6849f331fd7
Non-terminated Pods:          (9 in total)
  Namespace         Name                                CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------         ----                                ------------  ----------  ---------------  -------------  ---
  influxdb          influxdb-influxdb2-0                500m (25%)    1 (51%)     500Mi (15%)      1Gi (30%)      3d23h
  kube-system       aws-node-nxp7g                      25m (1%)      0 (0%)      0 (0%)           0 (0%)         3d2h
  kube-system       coredns-85d5b4454c-ltv5p            100m (5%)     0 (0%)      70Mi (2%)        170Mi (5%)     4d1h
  kube-system       ebs-csi-node-xfswb                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         13m
  kube-system       kube-proxy-mw4fh                    100m (5%)     0 (0%)      0 (0%)           0 (0%)         4d1h
  monitoring        promtail-hk4jj                      100m (5%)     512m (26%)  128Mi (3%)       512Mi (15%)    2d20h
  prometheus        prometheus-node-exporter-dfczh      0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d20h
  staging           staging-jupyter-v1-dd84f5bb8-xxmvm  0 (0%)        1 (51%)     0 (0%)           2Gi (61%)      2d
  teleport-cluster  teleport-775d4574d7-mtdh7           0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d1h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests     Limits
  --------                    --------     ------
  cpu                         825m (42%)   2512m (130%)
  memory                      698Mi (20%)  3754Mi (112%)
  ephemeral-storage           0 (0%)       0 (0%)
  hugepages-1Gi               0 (0%)       0 (0%)
  hugepages-2Mi               0 (0%)       0 (0%)
  attachable-volumes-aws-ebs  0            0
Events:                       <none>

kdpo etcd-0
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  4m26s (x2 over 4m32s)  default-scheduler  0/9 nodes are available: 9 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  7s (x5 over 4m23s)     default-scheduler  0/9 nodes are available: 1 node(s) had volume node affinity conflict, 8 node(s) didn't match Pod's node affinity/selector.

kd pvc data-etcd-0
Name:          data-etcd-0
Namespace:     etcd
StorageClass:  etcd
Status:        Bound
Volume:        pvc-99f699f7-83bd-474e-be72-402b2a2dc77c
Labels:        app.kubernetes.io/instance=etcd
               app.kubernetes.io/name=etcd
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      100Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       etcd-0
Events:
  Type    Reason                 Age                 From                                                                                      Message
  ----    ------                 ----                ----                                                                                      -------
  Normal  ExternalProvisioning   5m4s (x2 over 5m4s)  persistentvolume-controller                                                              waiting for a volume to be created, either by external provisioner "ebs.csi.aws.com" or manually created by system administrator
  Normal  Provisioning           5m4s                ebs.csi.aws.com_ebs-csi-controller-7b59d4c568-bx7rz_994a65f4-6811-41bd-839b-437511b6ee50  External provisioner is provisioning volume for claim "etcd/data-etcd-0"
  Normal  ProvisioningSucceeded  4m58s               ebs.csi.aws.com_ebs-csi-controller-7b59d4c568-bx7rz_994a65f4-6811-41bd-839b-437511b6ee50  Successfully provisioned volume pvc-99f699f7-83bd-474e-be72-402b2a2dc77c

➜ kd pv pvc-99f699f7-83bd-474e-be72-402b2a2dc77c
Name:              pvc-99f699f7-83bd-474e-be72-402b2a2dc77c
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: ebs.csi.aws.com
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      etcd
Status:            Bound
Claim:             etcd/data-etcd-0
Reclaim Policy:    Retain
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          100Gi
Node Affinity:
  Required Terms:
    Term 0:        topology.ebs.csi.aws.com/zone in [us-west-2b]
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            ebs.csi.aws.com
    FSType:            ext4
    VolumeHandle:      vol-0ea10f98f0d754053
    ReadOnly:          false
    VolumeAttributes:  storage.kubernetes.io/csiProvisionerIdentity=1667774479582-8081-ebs.csi.aws.com
Events:                <none>

➜ kd sc etcd
Name:                  etcd
IsDefaultClass:        No
Annotations:           kubectl.kubernetes.io/last-applied-configuration={"allowVolumeExpansion":true,"allowedTopologies":[{"matchLabelExpressions":[{"key":"topology.ebs.csi.aws.com/zone","values":["us-west-2a","us-west-2b","us-west-2c"]}]}],"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"},"labels":{"argocd.argoproj.io/instance":"aws-ebs-csi-driver"},"name":"etcd"},"parameters":{"csi.storage.k8s.io/fstype":"ext4","iopsPerGB":"50","tagSpecification_1":"environment=staging","type":"io1"},"provisioner":"ebs.csi.aws.com","reclaimPolicy":"Retain","volumeBindingMode":"Immediate"}
                       ,storageclass.kubernetes.io/is-default-class=false
Provisioner:           ebs.csi.aws.com
Parameters:            csi.storage.k8s.io/fstype=ext4,iopsPerGB=50,tagSpecification_1=environment=staging,type=io1
AllowVolumeExpansion:  True
MountOptions:          <none>
ReclaimPolicy:         Retain
VolumeBindingMode:     Immediate
AllowedTopologies:
  Term 0:              topology.ebs.csi.aws.com/zone in [us-west-2a, us-west-2b, us-west-2c]
Events:                <none>
info in ebs controller and pods logs
kube-system/ebs-csi-node-xfswb[ebs-plugin]: I1106 22:46:46.171364 1 node.go:454] NodeGetVolumeStats: called with args {VolumeId:vol-043af2d4d87d50674 VolumePath:/var/lib/kubelet/pods/8b4de6ba-3d9d-4da0-a732-27ed069a1534/volumes/kubernetes.io~csi/pvc-547b32a4-653c-4c6a-875e-60a47a0665c7/mount StagingTargetPath: XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-resizer]: I1106 22:46:58.037938 1 controller.go:295] Started PVC processing "etcd/data-etcd-0"
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-resizer]: I1106 22:46:58.037962 1 controller.go:318] PV bound to PVC "etcd/data-etcd-0" is not created yet
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:46:58.068985 1 controller.go:1337] provision "etcd/data-etcd-0" class "etcd": started
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:46:58.069592 1 controller.go:528] skip translation of storage class for plugin: ebs.csi.aws.com
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:46:58.070454 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"etcd", Name:"data-etcd-0", UID:"99f699f7-83bd-474e-be72-402b2a2dc77c", APIVersion:"v1", ResourceVersion:"407986343", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "etcd/data-etcd-0"
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:04.428929 1 controller.go:774] create volume rep: {CapacityBytes:107374182400 VolumeId:vol-0ea10f98f0d754053 VolumeContext:map[] ContentSource:<nil> AccessibleTopology:[segments:<key:"topology.ebs.csi.aws.com/zone" value:"us-west-2b" > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:04.429014 1 controller.go:858] successfully created PV pvc-99f699f7-83bd-474e-be72-402b2a2dc77c for PVC data-etcd-0 and csi volume name vol-0ea10f98f0d754053
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:04.429210 1 controller.go:1442] provision "etcd/data-etcd-0" class "etcd": volume "pvc-99f699f7-83bd-474e-be72-402b2a2dc77c" provisioned
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:04.429225 1 controller.go:1455] provision "etcd/data-etcd-0" class "etcd": succeeded
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:04.445959 1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"etcd", Name:"data-etcd-0", UID:"99f699f7-83bd-474e-be72-402b2a2dc77c", APIVersion:"v1", ResourceVersion:"407986343", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-99f699f7-83bd-474e-be72-402b2a2dc77c
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-resizer]: I1106 22:47:04.478422 1 controller.go:295] Started PVC processing "etcd/data-etcd-0"
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-resizer]: I1106 22:47:04.478465 1 controller.go:343] No need to resize PVC "etcd/data-etcd-0"
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:13.497894 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.CSINode total 15 items received
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-attacher]: I1106 22:47:14.703161 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.CSINode total 16 items received
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:23.487257 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.Node total 24 items received
kube-system/ebs-csi-node-5km2w[ebs-plugin]: I1106 22:47:31.957518 1 node.go:517] NodeGetCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-node-5km2w[ebs-plugin]: I1106 22:47:31.958606 1 node.go:454] NodeGetVolumeStats: called with args {VolumeId:vol-0736c0f1f93069570 VolumePath:/var/lib/kubelet/pods/d03d9ee9-25bc-4a14-a0cf-51ad85f895e3/volumes/kubernetes.io~csi/pvc-d8f4b13b-31bf-4fa1-9535-8a790d5fd1a9/mount StagingTargetPath: XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:40.481667 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.StorageClass total 7 items received
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-attacher]: I1106 22:47:41.709670 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.PersistentVolume total 12 items received
kube-system/ebs-csi-node-xfswb[ebs-plugin]: I1106 22:47:52.386885 1 node.go:517] NodeGetCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-node-xfswb[ebs-plugin]: I1106 22:47:52.417419 1 node.go:454] NodeGetVolumeStats: called with args {VolumeId:vol-043af2d4d87d50674 VolumePath:/var/lib/kubelet/pods/8b4de6ba-3d9d-4da0-a732-27ed069a1534/volumes/kubernetes.io~csi/pvc-547b32a4-653c-4c6a-875e-60a47a0665c7/mount StagingTargetPath: XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-controller-7b59d4c568-bx7rz[csi-provisioner]: I1106 22:47:58.574598 1 reflector.go:536] sigs.k8s.io/sig-storage-lib-external-provisioner/v8/controller/controller.go:845: Watch close - *v1.PersistentVolume total 12 items received
kube-system/ebs-csi-node-rnt7j[ebs-plugin]: I1106 22:48:09.972484 1 node.go:517] NodeGetCapabilities: called with args {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
kube-system/ebs-csi-node-rnt7j[ebs-plugin]: I1106 22:48:09.973767 1 node.go:454] NodeGetVolumeStats: called with args {VolumeId:vol-0f8c7548f584fa546 VolumePath:/var/lib/kubelet/pods/ae16c978-0206-4a1b-9561-6b016f8391ea/volumes/kubernetes.io~csi/pvc-253a60b3-82fa-452c-8805-f360faa730a4/mount StagingTargetPath: XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
What you expected to happen?
PVC io1 to be created in us-west-2a as that's where the etcd node is located.
helm chart
controller:
  extraCreateMetadata: "true"
  extraVolumeTags:
    cluster: project-eks
  logLevel: 2
  nodeSelector:
    ops: "true"
  replicaCount: 1
node:
  # tolerateAllTaints: true
  tolerations:
    - effect: NoSchedule
      operator: Exists
  logLevel: 4
storageClasses:
  - allowVolumeExpansion: true
    allowedTopologies:
      - matchLabelExpressions:
          - key: topology.ebs.csi.aws.com/zone
            values:
              - us-west-2a
              - us-west-2b
              - us-west-2c
    annotations:
      storageclass.kubernetes.io/is-default-class: "false"
    name: gp3
    parameters:
      csi.storage.k8s.io/fstype: ext4
      tagSpecification_1: environment=staging
      type: gp3
    provisioner: ebs.csi.aws.com
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
  - allowVolumeExpansion: true
    allowedTopologies:
      - matchLabelExpressions:
          - key: topology.ebs.csi.aws.com/zone
            values:
              - us-west-2a
              - us-west-2b
              - us-west-2c
    annotations:
      storageclass.kubernetes.io/is-default-class: "false"
    name: gp3-retain
    parameters:
      csi.storage.k8s.io/fstype: ext4
      tagSpecification_1: environment=staging
      type: gp3
    provisioner: ebs.csi.aws.com
    reclaimPolicy: Retain
    volumeBindingMode: WaitForFirstConsumer
  - allowVolumeExpansion: true
    allowedTopologies:
      - matchLabelExpressions:
          - key: topology.ebs.csi.aws.com/zone
            values:
              - us-west-2a
              - us-west-2b
              - us-west-2c
    annotations:
      storageclass.kubernetes.io/is-default-class: "false"
    name: etcd
    parameters:
      csi.storage.k8s.io/fstype: ext4
      iopsPerGB: "50"
      tagSpecification_1: environment=staging
      type: io1
    provisioner: ebs.csi.aws.com
    reclaimPolicy: Retain
    volumeBindingMode: Immediate
sidecars:
  provisioner:
    logLevel: 4
  attacher:
    logLevel: 4
  snapshotter:
    logLevel: 4
  resizer:
    logLevel: 4
  nodeDriverRegistrar:
    logLevel: 4
Environment
kubectl version
➜ k version
Client Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.6-eks-7d68063", GitCommit:"f24e667e49fb137336f7b064dba897beed639bad", GitTreeState:"clean", BuildDate:"2022-02-23T19:32:14Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.14-eks-fb459a0", GitCommit:"b07006b2e59857b13fe5057a956e86225f0e82b7", GitTreeState:"clean", BuildDate:"2022-10-24T20:32:54Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
- Driver version:
ebs-csi-controller-7b59d4c568-bx7rz  ebs-plugin             IfNotPresent  public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver  v1.12.1
ebs-csi-controller-7b59d4c568-bx7rz  csi-provisioner        IfNotPresent  k8s.gcr.io/sig-storage/csi-provisioner            v3.1.0
ebs-csi-controller-7b59d4c568-bx7rz  csi-attacher           IfNotPresent  k8s.gcr.io/sig-storage/csi-attacher               v3.4.0
ebs-csi-controller-7b59d4c568-bx7rz  csi-resizer            IfNotPresent  k8s.gcr.io/sig-storage/csi-resizer                v1.4.0
ebs-csi-controller-7b59d4c568-bx7rz  liveness-probe         IfNotPresent  k8s.gcr.io/sig-storage/livenessprobe              v2.6.0
ebs-csi-node-567pb                   ebs-plugin             IfNotPresent  public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver  v1.12.1
ebs-csi-node-567pb                   node-driver-registrar  IfNotPresent  k8s.gcr.io/sig-storage/csi-node-driver-registrar  v2.5.1
ebs-csi-node-567pb                   liveness-probe         IfNotPresent  k8s.gcr.io/sig-storage/livenessprobe              v2.6.0
OK, fixed by setting volumeBindingMode: WaitForFirstConsumer for the etcd storage class. I hadn't noticed it was set to volumeBindingMode: Immediate, unlike the gp3 and gp3-retain classes.
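For anyone hitting the same scheduling conflict, here is a minimal sketch of what the corrected etcd StorageClass might look like. Every field is copied from the last-applied configuration shown in this issue; only volumeBindingMode differs. With WaitForFirstConsumer, provisioning is delayed until a pod using the PVC is scheduled, so the volume is created in the zone of the node chosen for that pod (here us-west-2a) rather than in any zone permitted by allowedTopologies.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: etcd
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: ebs.csi.aws.com
parameters:
  csi.storage.k8s.io/fstype: ext4
  iopsPerGB: "50"
  tagSpecification_1: environment=staging
  type: io1
reclaimPolicy: Retain
allowVolumeExpansion: true
# Was "Immediate": the volume was provisioned before the pod was scheduled,
# so its zone was picked without regard to the pod's node selector.
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.ebs.csi.aws.com/zone
        values:
          - us-west-2a
          - us-west-2b
          - us-west-2c
```

Note that volumeBindingMode is immutable on an existing StorageClass, so the class has to be deleted and recreated. The change also does not move a volume that already exists: the PVC and PV bound in us-west-2b likely need to be removed before etcd-0 can get a volume in the right zone, and because reclaimPolicy is Retain, the underlying EBS volume would then need to be cleaned up manually.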