levanlongktmt closed this issue 2 months ago
I updated to Karpenter 0.35 and am using the AL2023 image; the problem still happens.
Some update: it seems to be caused by the Elasticsearch StatefulSet. The ES pod is very slow to terminate, so it's possible the driver pod gets killed before the ES pod, and then the volume is never released.
Update: when I set PRE_STOP_ADDITIONAL_WAIT_SECONDS to 5 seconds, the new pod can attach the PVC normally, so I think the long pod-stop delay is the reason the volume gets stuck.
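(For anyone trying to reproduce: a minimal sketch of where that variable lives, assuming this is an ECK-managed Elasticsearch — the operator's default preStop script reads PRE_STOP_ADDITIONAL_WAIT_SECONDS from the container environment. Names here are illustrative.)

```yaml
# Sketch, assuming an ECK Elasticsearch resource; resource/nodeSet names are examples.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: dev
spec:
  nodeSets:
    - name: default
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              env:
                # Extra seconds the default preStop script waits before exiting
                - name: PRE_STOP_ADDITIONAL_WAIT_SECONDS
                  value: "5"
```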
Nope, in another try the PV still got stuck 🤦
Hey there @levanlongktmt thank you for reporting this.
It appears your spot instances are being ungracefully terminated. Take a look at the documentation for guidance on navigating this issue: 6-Minute Delays in Attaching Volumes - What steps can be taken to mitigate this issue?
If you are still running into delays but have already configured the Kubelet and enabled interruption handling in Karpenter, please let us know. Thanks!
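(For anyone following along, a rough sketch of what "interruption handling in Karpenter" refers to here, as Helm values. The key layout varies by chart version — older charts use `settings.aws.interruptionQueueName` — and the queue name below is the one visible in this cluster's logs, not a default.)

```yaml
# Sketch of Karpenter Helm values enabling native spot-interruption handling
settings:
  interruptionQueue: Karpenter-kollekt-eks  # SQS queue receiving EC2 interruption events
```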
@torredil oops... my user data was missing the last line:
systemctl restart kubelet
Let me try with it.
Hey @torredil, I tried it but still no luck :(
I already added SQS to handle interruptions, so when I trigger an interruption in the AWS console, Karpenter handles it and launches a new instance very quickly. I also followed the logs of ebs-csi-node, but they show nothing related to the preStop hook. Here are all of its logs:
I0304 15:40:38.249081 1 driver.go:83] "Driver Information" Driver="ebs.csi.aws.com" Version="v1.28.0"
I0304 15:40:38.250134 1 node.go:93] "regionFromSession Node service" region=""
I0304 15:40:38.252657 1 metadata.go:85] "retrieving instance data from ec2 metadata"
I0304 15:40:38.257835 1 metadata.go:92] "ec2 metadata is available"
I0304 15:40:38.258488 1 metadata_ec2.go:25] "Retrieving EC2 instance identity metadata" regionFromSession=""
I0304 15:47:29.651847 1 mount_linux.go:243] Detected OS without systemd
After the graceful shutdown of the Elasticsearch pod, the ebs-csi-node pod was killed immediately, so the preStop hook was skipped.
I thought it might be a bug in AL2023, so I switched back to AL2, but still no luck.
I will try setting InhibitDelayMaxSec to 60 (45 + 15).
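(For context, a sketch of the two pieces involved in kubelet graceful node shutdown; the 45 + 15 split above maps onto the kubelet's own settings, which must fit within the systemd-logind inhibitor delay.)

```
# /etc/systemd/logind.conf.d/kubelet.conf — how long systemd-logind allows
# shutdown inhibitors (such as the kubelet) to delay a shutdown
[Login]
InhibitDelayMaxSec=60
```

```yaml
# KubeletConfiguration snippet: the kubelet's graceful-shutdown budget
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: 45s             # total time for all pods
shutdownGracePeriodCriticalPods: 15s # portion reserved for critical pods
```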
Still no luck; the volume is still stuck for 6 minutes and the preStop hook is not running 😭
@levanlongktmt Thanks for the followup info. Something that came to mind here is that it might be possible for newer versions of Karpenter to be incompatible with the pre-stop lifecycle hook due to this change: https://github.com/kubernetes-sigs/karpenter/pull/508.
You'll notice that the LCH won't run if the following taint is not present on the node during a termination event: node.kubernetes.io/unschedulable
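(For reference, that is the taint applied when a node is cordoned; on a drained node it would appear in the Node spec roughly like this:)

```yaml
# Node spec fragment: the unschedulable taint the LCH checks for
spec:
  unschedulable: true
  taints:
    - key: node.kubernetes.io/unschedulable
      effect: NoSchedule
```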
We'll take a closer look and report back with more details or a fix. Thank you!
@torredil here are the logs of Karpenter and the Elasticsearch pod:
Logs of Karpenter
{"level":"INFO","time":"2024-03-04T16:22:20.270Z","logger":"controller.interruption","message":"initiating delete from interruption message","commit":"2c8f2a5","queue":"Karpenter-kollekt-eks","messageKind":"SpotInterruptionKind","nodeclaim":"default-2bd27","action":"CordonAndDrain","node":"i-0502b920c79f40b5a.eu-central-1.compute.internal"}
{"level":"INFO","time":"2024-03-04T16:22:20.336Z","logger":"controller.node.termination","message":"tainted node","commit":"2c8f2a5","node":"i-0502b920c79f40b5a.eu-central-1.compute.internal"}
{"level":"INFO","time":"2024-03-04T16:22:21.364Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"24.462475ms"}
{"level":"INFO","time":"2024-03-04T16:22:21.364Z","logger":"controller.provisioner","message":"computed new nodeclaim(s) to fit pod(s)","commit":"2c8f2a5","nodeclaims":1,"pods":1}
{"level":"INFO","time":"2024-03-04T16:22:21.379Z","logger":"controller.provisioner","message":"created nodeclaim","commit":"2c8f2a5","nodepool":"default","nodeclaim":"default-pk44s","requests":{"cpu":"680m","memory":"5240Mi","pods":"5"},"instance-types":"a1.2xlarge, a1.4xlarge, a1.metal, a1.xlarge, c6g.12xlarge and 91 other(s)"}
{"level":"INFO","time":"2024-03-04T16:22:23.782Z","logger":"controller.nodeclaim.lifecycle","message":"launched nodeclaim","commit":"2c8f2a5","nodeclaim":"default-pk44s","provider-id":"aws:///eu-central-1a/i-09a52eea14ff0b6c9","instance-type":"r6gd.medium","zone":"eu-central-1a","capacity-type":"spot","allocatable":{"cpu":"940m","ephemeral-storage":"17Gi","memory":"7075Mi","pods":"8","vpc.amazonaws.com/pod-eni":"4"}}
{"level":"INFO","time":"2024-03-04T16:22:31.356Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"16.586479ms"}
{"level":"INFO","time":"2024-03-04T16:22:41.353Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"13.509969ms"}
{"level":"INFO","time":"2024-03-04T16:22:51.355Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"14.534034ms"}
{"level":"INFO","time":"2024-03-04T16:22:58.280Z","logger":"controller.nodeclaim.lifecycle","message":"registered nodeclaim","commit":"2c8f2a5","nodeclaim":"default-pk44s","provider-id":"aws:///eu-central-1a/i-09a52eea14ff0b6c9","node":"i-09a52eea14ff0b6c9.eu-central-1.compute.internal"}
{"level":"INFO","time":"2024-03-04T16:23:01.356Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"13.781092ms"}
{"level":"INFO","time":"2024-03-04T16:23:09.571Z","logger":"controller.node.termination","message":"deleted node","commit":"2c8f2a5","node":"i-0502b920c79f40b5a.eu-central-1.compute.internal"}
{"level":"INFO","time":"2024-03-04T16:23:09.944Z","logger":"controller.nodeclaim.termination","message":"deleted nodeclaim","commit":"2c8f2a5","nodeclaim":"default-2bd27","node":"i-0502b920c79f40b5a.eu-central-1.compute.internal","provider-id":"aws:///eu-central-1a/i-0502b920c79f40b5a"}
{"level":"INFO","time":"2024-03-04T16:23:10.455Z","logger":"controller.disruption","message":"pod \"datalayer/dev-es-default-0\" has a preferred Anti-Affinity which can prevent consolidation","commit":"2c8f2a5"}
{"level":"INFO","time":"2024-03-04T16:23:10.698Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"83.215948ms"}
{"level":"INFO","time":"2024-03-04T16:23:10.731Z","logger":"controller.nodeclaim.lifecycle","message":"initialized nodeclaim","commit":"2c8f2a5","nodeclaim":"default-pk44s","provider-id":"aws:///eu-central-1a/i-09a52eea14ff0b6c9","node":"i-09a52eea14ff0b6c9.eu-central-1.compute.internal","allocatable":{"cpu":"940m","ephemeral-storage":"18233774458","hugepages-1Gi":"0","hugepages-2Mi":"0","hugepages-32Mi":"0","hugepages-64Ki":"0","memory":"7552796Ki","pods":"8"}}
{"level":"INFO","time":"2024-03-04T16:23:12.487Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"2c8f2a5","pods":"datalayer/dev-es-default-0","duration":"12.009019ms"}
Logs of elasticsearch pod
{"@timestamp": "2024-03-04T16:22:20+00:00", "message": "retrieving node ID", "ecs.version": "1.2.0", "event.dataset": "elasticsearch.pre-stop-hook"}
{"@timestamp": "2024-03-04T16:22:21+00:00", "message": "initiating node shutdown", "ecs.version": "1.2.0", "event.dataset": "elasticsearch.pre-stop-hook"}
{"@timestamp":"2024-03-04T16:22:21.338Z", "log.level": "INFO", "message":"creating shutdown record {nodeId=[qElhr0BPQnufjz5_CdDGMg], type=[RESTART], reason=[pre-stop hook]}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[dev-es-default-0][masterService#updateTask][T#2]","log.logger":"org.elasticsearch.xpack.shutdown.TransportPutShutdownNodeAction","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:22:21.530Z", "log.level": "INFO", "message":"Aborting health node task due to node [{dev-es-default-0}{qElhr0BPQnufjz5_CdDGMg}] shutting down.", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[dev-es-default-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.health.node.selection.HealthNodeTaskExecutor","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:22:21.535Z", "log.level": "INFO", "message":"Starting node shutdown sequence for ML", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[dev-es-default-0][clusterApplierService#updateTask][T#1]","log.logger":"org.elasticsearch.xpack.ml.MlLifeCycleService","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp": "2024-03-04T16:22:21+00:00", "message": "waiting for node shutdown to complete", "ecs.version": "1.2.0", "event.dataset": "elasticsearch.pre-stop-hook"}
{"@timestamp": "2024-03-04T16:22:21+00:00", "message": "delaying termination for 44 seconds", "ecs.version": "1.2.0", "event.dataset": "elasticsearch.pre-stop-hook"}
{"@timestamp":"2024-03-04T16:23:05.782Z", "log.level": "INFO", "message":"stopping ...", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.node.Node","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:05.783Z", "log.level": "INFO", "message":"shutting down watcher thread", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[file-watcher[/usr/share/elasticsearch/config/operator/settings.json]]","log.logger":"org.elasticsearch.common.file.AbstractFileWatchingService","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:05.785Z", "log.level": "INFO", "message":"watcher service stopped", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.common.file.AbstractFileWatchingService","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:05.790Z", "log.level": "INFO", "message":"[controller/83] [Main.cc@176] ML controller exiting", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"ml-cpp-log-tail-thread","log.logger":"org.elasticsearch.xpack.ml.process.logging.CppLogMessageHandler","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:05.792Z", "log.level": "INFO", "message":"Native controller process has stopped - no new native processes can be started", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"ml-cpp-log-tail-thread","log.logger":"org.elasticsearch.xpack.ml.process.NativeController","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:05.793Z", "log.level": "INFO", "message":"stopping watch service, reason [shutdown initiated]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.xpack.watcher.WatcherService","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:05.794Z", "log.level": "INFO", "message":"watcher has stopped and shutdown", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[dev-es-default-0][generic][T#3]","log.logger":"org.elasticsearch.xpack.watcher.WatcherLifeCycleService","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"timestamp": "2024-03-04T16:23:06+00:00", "message": "readiness probe failed", "curl_rc": "7"}
{"@timestamp":"2024-03-04T16:23:06.552Z", "log.level": "INFO", "message":"stopped", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.node.Node","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:06.554Z", "log.level": "INFO", "message":"closing ...", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.node.Node","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:06.576Z", "log.level": "INFO", "message":"evicted [0] entries from cache after reloading database [/tmp/elasticsearch-3976228296045036888/geoip-databases/qElhr0BPQnufjz5_CdDGMg/GeoLite2-Country.mmdb]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.ingest.geoip.DatabaseReaderLazyLoader","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:06.577Z", "log.level": "INFO", "message":"evicted [0] entries from cache after reloading database [/tmp/elasticsearch-3976228296045036888/geoip-databases/qElhr0BPQnufjz5_CdDGMg/GeoLite2-ASN.mmdb]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.ingest.geoip.DatabaseReaderLazyLoader","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:06.578Z", "log.level": "INFO", "message":"evicted [25] entries from cache after reloading database [/tmp/elasticsearch-3976228296045036888/geoip-databases/qElhr0BPQnufjz5_CdDGMg/GeoLite2-City.mmdb]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.ingest.geoip.DatabaseReaderLazyLoader","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
{"@timestamp":"2024-03-04T16:23:06.586Z", "log.level": "INFO", "message":"closed", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch-shutdown","log.logger":"org.elasticsearch.node.Node","elasticsearch.cluster.uuid":"Z_NFF7KcT3OJ6ekUUoJ3Qg","elasticsearch.node.id":"qElhr0BPQnufjz5_CdDGMg","elasticsearch.node.name":"dev-es-default-0","elasticsearch.cluster.name":"dev"}
@torredil do you think this quick and dirty fix will work 😆?
@torredil it seems k8s did not call preStop, because if it had, I would at least see the log PreStop: executing PreStop lifecycle hook, but I didn't see it.
@torredil any good news on this 😀?
Hi, sorry about the wait - your issue is probably caused by what #1969 solves - Karpenter changed the taints they used when draining nodes and our LCH needs to be changed to account for it. That fix should be available in the next release of the EBS CSI Driver, expected to happen later this month.
Amazing, thanks so much @ConnorJC3 😍 I will test again when the new version is released.
@ConnorJC3 I was trying to understand from which version of Karpenter this changed. I was about to upgrade my CSI driver to pick up the fix, but I guess I will wait for the next release.
I see https://github.com/kubernetes-sigs/karpenter/pull/508 was merged in October :thinking:
@primeroz here is the list of Karpenter versions affected
v1.29.0, which contains the fix, has been released.
I just upgraded my dev cluster to 1.29.0 while using karpenter-aws 0.35.2, but I still have the same multi-attach problem as before when I run kubectl delete node (the taint is applied to the node).
26s Normal Evicted pod/c-pod-0 Evicted pod
Multi-Attach error for volume "pvc-9b50cf31-5a24-4783-8039-a362ad2a7c0d" Volume is already exclusively attached to one node and can't be attached to another
for 6+ minutes until it timed out.
@primeroz do you have the logs from the driver from when your node was disrupted? (Also, it would be useful to increase your log level while collecting these.)
@alexandermarston there are no events whatsoever in the ebs-csi pod running on that node when I delete the node.
I can see the preStop hook is in the pod spec... so I guess when we delete a node, rather than replacing it through a Karpenter disruption, the preStop on the ebs node is never run?
Note that in my case I am deleting the node to get it replaced by Karpenter.
+ ebs-csi-node-rj6qg › node-driver-registrar
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.381506 1 main.go:135] Version: v2.10.0
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.381558 1 main.go:136] Running node-driver-registrar in mode=
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.381565 1 main.go:157] Attempting to open a gRPC connection with: "/csi/csi.sock"
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.382280 1 main.go:164] Calling CSI driver to discover driver name
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.385954 1 main.go:173] CSI driver name: "ebs.csi.aws.com"
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.385996 1 node_register.go:55] Starting Registration Server at: /registration/ebs.csi.aws.com-reg.sock
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.386143 1 node_register.go:64] Registration Server started at: /registration/ebs.csi.aws.com-reg.sock
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.386255 1 node_register.go:88] Skipping HTTP server because endpoint is set to: ""
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.832022 1 main.go:90] Received GetInfo call: &InfoRequest{}
ebs-csi-node-rj6qg node-driver-registrar I0322 14:33:43.857979 1 main.go:101] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
ebs-csi-node-rj6qg liveness-probe I0322 14:33:44.130483 1 main.go:133] "Calling CSI driver to discover driver name"
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118022623.64,"caller":"driver/driver.go:84","msg":"Driver Information","v":0,"Driver":"ebs.csi.aws.com","Version":"v1.29.0"}
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118022623.6697,"caller":"driver/node.go:97","msg":"regionFromSession Node service","v":0,"region":""}
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118022623.6895,"caller":"cloud/metadata.go:85","msg":"retrieving instance data from ec2 metadata","v":0}
ebs-csi-node-rj6qg liveness-probe I0322 14:33:44.133455 1 main.go:141] "CSI driver name" driver="ebs.csi.aws.com"
ebs-csi-node-rj6qg liveness-probe I0322 14:33:44.133475 1 main.go:170] "ServeMux listening" address="0.0.0.0:9808"
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118022625.4724,"caller":"cloud/metadata.go:92","msg":"ec2 metadata is available","v":0}
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118022626.2078,"caller":"cloud/metadata_ec2.go:25","msg":"Retrieving EC2 instance identity metadata","v":0,"regionFromSession":""}
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118023644.7207,"caller":"driver/node.go:875","msg":"Unexpected failure when attempting to remove node taint(s)","err":"isAllocatableSet: driver not found on node ip-10-1-5-195.us-west-2.compute.internal"}
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118024155.2048,"caller":"driver/node.go:961","msg":"CSINode Allocatable value is set","v":0,"nodeName":"ip-10-1-5-195.us-west-2.compute.internal","count":25}
ebs-csi-node-rj6qg ebs-plugin {"ts":1711118326119.883,"caller":"mount-utils@v0.29.2/mount_linux.go:243","msg":"Detected OS without systemd","v":2}
### Delete node here
- ebs-csi-node-rj6qg › ebs-plugin
- ebs-csi-node-rj6qg › liveness-probe
+ ebs-csi-node-rj6qg › liveness-probe
- ebs-csi-node-rj6qg › node-driver-registrar
+ ebs-csi-node-rj6qg › ebs-plugin
+ ebs-csi-node-rj6qg › node-driver-registrar
- ebs-csi-node-rj6qg › ebs-plugin
- ebs-csi-node-rj6qg › node-driver-registrar
- ebs-csi-node-rj6qg › liveness-probe
@primeroz what exactly do you mean by "delete a node"? Assuming you mean something like deleting the node via the AWS Console - that sounds like ungraceful termination, which will not run pre-stop hooks as it results in an immediate and uninterruptible shutdown.
@ConnorJC3 I mean kubectl delete node XXX, which is something Karpenter supports and manages through a finalizer; it goes through the same steps as a Karpenter-managed disruption - see https://karpenter.sh/docs/getting-started/getting-started-with-karpenter/#8-delete-karpenter-nodes-manually
Anyway, I am now testing with a Karpenter disruption by adding an annotation to the nodepool, which triggers a full replacement of all nodes created by that nodepool since the spec has changed. The same multi-attach error happens.
Collecting logs on the next node termination.
OK, I've been able to test this by manually tainting a node and then running the preStop hook with:
k exec ebs-csi-node-l7zzf -- "/bin/aws-ebs-csi-driver" "pre-stop-hook"
The preStop hook is doing everything it should, from what I understand.
Again, from my limited understanding, I imagine the issue is that your service (which is using the volume) is taking longer than the terminationGracePeriodSeconds of the EBS CSI driver to shut down and release the volume. If your service takes longer than 30 seconds to terminate, the EBS driver will never have a chance to do its work.
You could either try increasing the terminationGracePeriod of the EBS CSI Driver or lowering the grace period of your service.
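(A sketch of the first option, assuming the driver is deployed via the aws-ebs-csi-driver Helm chart; verify the value name against your chart version:)

```yaml
# Helm values sketch for aws-ebs-csi-driver
node:
  terminationGracePeriodSeconds: 300  # raised from the Kubernetes default of 30
```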
> If your service takes longer than 30 seconds to terminate, then the EBS driver will never have a chance to do its work. You could either try increasing the terminationGracePeriod of the EBS CSI Driver or lower the grace period of your service.
This is entirely possible; I am indeed testing with a service that has a higher-than-normal grace period.
I will test this :pray:
@alexandermarston increasing terminationGracePeriodSeconds to 3700 on the ebs-csi-node DaemonSet did not help. I triggered an update of the nodes in the nodepool by updating an annotation, but it still fails with multi-attach errors.
Also interesting, to me at least: in the logs of the ebs-csi-node I never see klog.InfoS("PreStop: executing PreStop lifecycle hook"), so I guess that when Karpenter replaces a node it never triggers a termination of the DaemonSets running on it?
I know for sure that Karpenter does not drain the node in the same way kubectl drain does. I can see the non-DaemonSet pods being evicted one by one, but I see nothing for the ebs-csi-node pods.
I have been watching the ebs-csi-node pod and tailing events for it... nothing. It seems like once the non-DaemonSet pods are all evicted, the node just goes away, and so do all the DaemonSet pods on it :thinking:
You won't see the logs from the preStop hook handler, as they are not logged in the same way normal pod stdout is. You will be able to see if the preStop hook failed in the Kubernetes Events though.
Can you try deleting the node again, but manually execute the preStop hook and share the output?
k exec <EBS-CSI-POD-NAME> -- "/bin/aws-ebs-csi-driver" "pre-stop-hook"
> You won't see the logs from the preStop hook handler, as they are not logged in the same way normal pod stdout is.

To be specific, Kubernetes logs output from pre-stop hooks to the kubelet log - so you can look there to determine whether the hook is running (although this can be difficult on a terminated node if you aren't streaming the logs anywhere).
@alexandermarston's suggestion above is a good way to check that the hook is able to run and connect to the Kubernetes API server on your cluster.
@ConnorJC3 @alexandermarston thanks for the input. I thought all preStop logs would go to events; that's why I was watching them.
I will do both of these things. Given I have only 3 nodes with 3 pods, I will run
k exec <EBS-CSI-POD-NAME> -- "/bin/aws-ebs-csi-driver" "pre-stop-hook"
as soon as the StatefulSet pod for my workload on that node goes into the Terminating state, and report back here.
My cluster uses the EKS addon to set up the EBS driver, and the addon is still on v1.28 😂 @ConnorJC3 is it safe if I just delete the addon and then reinstall via the Helm chart?
k exec gave me mixed results.
The output looked like it had no errors and no problem talking to the apiserver:
➜ k exec -n kube-system ebs-csi-node-jzpf5 -- /bin/aws-ebs-csi-driver pre-stop-hook
Defaulted container "ebs-plugin" out of: ebs-plugin, node-driver-registrar, liveness-probe
I0322 16:11:40.620309 35 prestop.go:32] "PreStop: executing PreStop lifecycle hook"
I0322 16:11:40.631461 35 prestop.go:45] "PreStop: node is being drained, checking for remaining VolumeAttachments" node="ip-10-1-3-134.us-west-2.compute.internal"
I0322 16:11:40.649793 35 prestop.go:123] "isVolumeAttachmentEmpty: not ready to exit, found VolumeAttachment" attachment={"metadata":{"name":"csi-5ca3c6bf5a8925b2895507cbf7ffd7b4607324e09ef8a8ae2baf300a6eab3511","uid":"aee7ad0f-fa95-49
42-9093-3ff2ff64c927","resourceVersion":"800979653","creationTimestamp":"2024-03-22T16:04:23Z","annotations":{"csi.alpha.kubernetes.io/node-id":"i-0a4477e4eb2332f3f"},"finalizers":["external-attacher/ebs-csi-aws-com"],"managedFields":[{"man
ager":"csi-attacher","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-03-22T16:04:23Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:csi.alpha.kubernetes.io/node-id":{}},"f:finalizers":{".":{
},"v:\"external-attacher/ebs-csi-aws-com\"":{}}}}},{"manager":"kube-controller-manager","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-03-22T16:04:23Z","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{"f:attacher":{},"f:nod
eName":{},"f:source":{"f:persistentVolumeName":{}}}}},{"manager":"csi-attacher","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-03-22T16:04:25Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:attached":{},"f:attachmen
tMetadata":{".":{},"f:devicePath":{}}}},"subresource":"status"}]},"spec":{"attacher":"ebs.csi.aws.com","source":{"persistentVolumeName":"pvc-15457627-f532-41ca-b87b-a961ef7b745c"},"nodeName":"ip-10-1-3-134.us-west-2.compute.internal"},"stat
us":{"attached":true,"attachmentMetadata":{"devicePath":"/dev/xvdaa"}}} node="ip-10-1-3-134.us-west-2.compute.internal"
I0322 16:11:58.414556 35 prestop.go:123] "isVolumeAttachmentEmpty: not ready to exit, found VolumeAttachment" attachment={"metadata":{"name":"csi-5ca3c6bf5a8925b2895507cbf7ffd7b4607324e09ef8a8ae2baf300a6eab3511","uid":"aee7ad0f-fa95-49
42-9093-3ff2ff64c927","resourceVersion":"800988339","creationTimestamp":"2024-03-22T16:04:23Z","deletionTimestamp":"2024-03-22T16:11:58Z","deletionGracePeriodSeconds":0,"annotations":{"csi.alpha.kubernetes.io/node-id":"i-0a4477e4eb2332f3f"}
,"finalizers":["external-attacher/ebs-csi-aws-com"],"managedFields":[{"manager":"csi-attacher","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-03-22T16:04:23Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotati
ons":{".":{},"f:csi.alpha.kubernetes.io/node-id":{}},"f:finalizers":{".":{},"v:\"external-attacher/ebs-csi-aws-com\"":{}}}}},{"manager":"kube-controller-manager","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-03-22T16:04
:23Z","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{"f:attacher":{},"f:nodeName":{},"f:source":{"f:persistentVolumeName":{}}}}},{"manager":"csi-attacher","operation":"Update","apiVersion":"storage.k8s.io/v1","time":"2024-03-22T16:04:25Z","f
ieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:attached":{},"f:attachmentMetadata":{".":{},"f:devicePath":{}}}},"subresource":"status"}]},"spec":{"attacher":"ebs.csi.aws.com","source":{"persistentVolumeName":"pvc-15457627-f532-41ca-b87b-a
961ef7b745c"},"nodeName":"ip-10-1-3-134.us-west-2.compute.internal"},"status":{"attached":true,"attachmentMetadata":{"devicePath":"/dev/xvdaa"}}} node="ip-10-1-3-134.us-west-2.compute.internal"
In 1 test there was no multi-attach error, but in the next 2 I tried there was a multi-attach error.
Not sure if it makes any difference, but in the successful test the loop ran twice and the "not ready to exit" message was indeed shown twice. In the failed tests it ran only once before the command returned.
Watching kubelet logs on the AWS Console through Session Manager (this is the full log from the moment the node starts to get drained by Karpenter; I only removed the "Probe Failed" lines and some lines about a broken DaemonSet pod failing to start).
The multi-attach error happened during this test - Multi-Attach error for volume "pvc-15457627-f532-41ca-b87b-a961ef7b745c" Volume is already exclusively attached to one node and can't be attached to another - and the volumeattachment still pointed to the old node.
I don't see any log from the ebs preStop hook in the kubelet logs (I added the grepped logs for ebs at the bottom), and I never see the ebs pod go into Terminating.
[root@ip-10-1-1-116 ~]# journalctl -u kubelet -f
...
...
Mar 22 16:30:58 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:30:58.615431 1904 scope.go:115] "RemoveContainer" containerID="8f79c502f8bcc0b7aabbead15947e3f3777d2400dd5fe6fd869fd2d5597995ee"
Mar 22 16:31:06 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:06.752227 1904 kubelet.go:2224] "SyncLoop DELETE" source="api" pods="[ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0]"
Mar 22 16:31:11 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:11.616133 1904 scope.go:115] "RemoveContainer" containerID="8f79c502f8bcc0b7aabbead15947e3f3777d2400dd5fe6fd869fd2d5597995ee"
Mar 22 16:31:12 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:12.617083 1904 kubelet.go:2312] "SyncLoop (probe)" probe="readiness" status="" pod="ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0"
Mar 22 16:31:22 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:22.615312 1904 scope.go:115] "RemoveContainer" containerID="8f79c502f8bcc0b7aabbead15947e3f3777d2400dd5fe6fd869fd2d5597995ee"
Mar 22 16:31:25 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:25.363057 1904 kuberuntime_container.go:709] "Killing container with a grace period" pod="ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0" podUID=3be54537-a4eb-4821-b6b6-83b18151da79 containerName="scraper-container" containerID="containerd://308e34c6dde2613473579be9a50bb4d9e890ebe63bdf0a9b10ee0eeab35e04ee" gracePeriod=3600
Mar 22 16:31:25 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:25.499261 1904 generic.go:332] "Generic (PLEG): container finished" podID=3be54537-a4eb-4821-b6b6-83b18151da79 containerID="308e34c6dde2613473579be9a50bb4d9e890ebe63bdf0a9b10ee0eeab35e04ee" exitCode=2
Mar 22 16:31:25 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:25.499301 1904 kubelet.go:2240] "SyncLoop (PLEG): event for pod" pod="ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0" event=&{ID:3be54537-a4eb-4821-b6b6-83b18151da79 Type:ContainerDied Data:308e34c6dde2613473579be9a50bb4d9e890ebe63bdf0a9b10ee0eeab35e04ee}
Mar 22 16:31:37 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:37.615826 1904 scope.go:115] "RemoveContainer" containerID="8f79c502f8bcc0b7aabbead15947e3f3777d2400dd5fe6fd869fd2d5597995ee"
Mar 22 16:31:37 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:37.904856 1904 kuberuntime_container.go:709] "Killing container with a grace period" pod="ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0" podUID=3be54537-a4eb-4821-b6b6-83b18151da79 containerName="c-yellow-ig-25-server" containerID="containerd://fabf4f8b93ecdb567b7483574643152cff342d04cfee093e95b95e5c345366b6" gracePeriod=3600
Mar 22 16:31:38 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:38.521327 1904 generic.go:332] "Generic (PLEG): container finished" podID=3be54537-a4eb-4821-b6b6-83b18151da79 containerID="fabf4f8b93ecdb567b7483574643152cff342d04cfee093e95b95e5c345366b6" exitCode=0
Mar 22 16:31:38 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:38.521370 1904 kubelet.go:2240] "SyncLoop (PLEG): event for pod" pod="ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0" event=&{ID:3be54537-a4eb-4821-b6b6-83b18151da79 Type:ContainerDied Data:fabf4f8b93ecdb567b7483574643152cff342d04cfee093e95b95e5c345366b6}
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254006 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"clickhouse-server-users-d-volume\" (UniqueName: \"kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-users-d-volume\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254053 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"clickhouse-server-certificates-volume\" (UniqueName: \"kubernetes.io/secret/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-certificates-volume\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254134 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"ch-storage-volume\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254161 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"kube-api-access-gg5x8\" (UniqueName: \"kubernetes.io/projected/3be54537-a4eb-4821-b6b6-83b18151da79-kube-api-access-gg5x8\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254185 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"aws-iam-token\" (UniqueName: \"kubernetes.io/projected/3be54537-a4eb-4821-b6b6-83b18151da79-aws-iam-token\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254209 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"ch-cache-volume\" (UniqueName: \"kubernetes.io/host-path/3be54537-a4eb-4821-b6b6-83b18151da79-ch-cache-volume\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: W0322 16:31:39.254209 1904 empty_dir.go:525] Warning: Failed to clear quota on /var/lib/kubelet/pods/3be54537-a4eb-4821-b6b6-83b18151da79/volumes/kubernetes.io~configmap/clickhouse-server-users-d-volume: clearQuota called, but quotas disabled
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254232 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"clickhouse-server-scripts-volume\" (UniqueName: \"kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-scripts-volume\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254284 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/3be54537-a4eb-4821-b6b6-83b18151da79-ch-cache-volume" (OuterVolumeSpecName: "ch-cache-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "ch-cache-volume". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254352 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"clickhouse-server-config-d-volume\" (UniqueName: \"kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-config-d-volume\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\"(UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: W0322 16:31:39.254345 1904 empty_dir.go:525] Warning: Failed to clear quota on /var/lib/kubelet/pods/3be54537-a4eb-4821-b6b6-83b18151da79/volumes/kubernetes.io~configmap/clickhouse-server-scripts-volume: clearQuota called, but quotas disabled
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254396 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"podinfo\" (UniqueName: \"kubernetes.io/downward-api/3be54537-a4eb-4821-b6b6-83b18151da79-podinfo\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254404 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-users-d-volume" (OuterVolumeSpecName: "clickhouse-server-users-d-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "clickhouse-server-users-d-volume". PluginName "kubernetes.io/configmap", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: W0322 16:31:39.254447 1904 empty_dir.go:525] Warning: Failed to clear quota on /var/lib/kubelet/pods/3be54537-a4eb-4821-b6b6-83b18151da79/volumes/kubernetes.io~configmap/clickhouse-server-config-d-volume: clearQuota called, but quotas disabled
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254481 1904 reconciler_common.go:295] "Volume detached for volume \"clickhouse-server-users-d-volume\" (UniqueName: \"kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-users-d-volume\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254511 1904 reconciler_common.go:295] "Volume detached for volume \"ch-cache-volume\" (UniqueName: \"kubernetes.io/host-path/3be54537-a4eb-4821-b6b6-83b18151da79-ch-cache-volume\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254540 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-scripts-volume" (OuterVolumeSpecName: "clickhouse-server-scripts-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "clickhouse-server-scripts-volume". PluginName "kubernetes.io/configmap", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.255008 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-config-d-volume" (OuterVolumeSpecName: "clickhouse-server-config-d-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "clickhouse-server-config-d-volume". PluginName "kubernetes.io/configmap", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.255796 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/3be54537-a4eb-4821-b6b6-83b18151da79-kube-api-access-gg5x8" (OuterVolumeSpecName: "kube-api-access-gg5x8") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "kube-api-access-gg5x8". PluginName "kubernetes.io/projected", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.255805 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-certificates-volume" (OuterVolumeSpecName: "clickhouse-server-certificates-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "clickhouse-server-certificates-volume". PluginName "kubernetes.io/secret", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.255869 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/3be54537-a4eb-4821-b6b6-83b18151da79-aws-iam-token" (OuterVolumeSpecName: "aws-iam-token") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "aws-iam-token". PluginName "kubernetes.io/projected", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.255896 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/downward-api/3be54537-a4eb-4821-b6b6-83b18151da79-podinfo" (OuterVolumeSpecName: "podinfo") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "podinfo". PluginName "kubernetes.io/downward-api", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.258987 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a" (OuterVolumeSpecName: "ch-storage-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "pvc-15457627-f532-41ca-b87b-a961ef7b745c". PluginName "kubernetes.io/csi", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355390 1904 reconciler_common.go:288] "operationExecutor.UnmountDevice started for volume \"pvc-15457627-f532-41ca-b87b-a961ef7b745c\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355420 1904 reconciler_common.go:295] "Volume detached for volume \"kube-api-access-gg5x8\" (UniqueName: \"kubernetes.io/projected/3be54537-a4eb-4821-b6b6-83b18151da79-kube-api-access-gg5x8\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355435 1904 reconciler_common.go:295] "Volume detached for volume \"aws-iam-token\" (UniqueName: \"kubernetes.io/projected/3be54537-a4eb-4821-b6b6-83b18151da79-aws-iam-token\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355451 1904 reconciler_common.go:295] "Volume detached for volume \"clickhouse-server-scripts-volume\" (UniqueName: \"kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-scripts-volume\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355465 1904 reconciler_common.go:295] "Volume detached for volume \"clickhouse-server-config-d-volume\" (UniqueName: \"kubernetes.io/configmap/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-config-d-volume\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355478 1904 reconciler_common.go:295] "Volume detached for volume \"podinfo\" (UniqueName: \"kubernetes.io/downward-api/3be54537-a4eb-4821-b6b6-83b18151da79-podinfo\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355490 1904 reconciler_common.go:295] "Volume detached for volume \"clickhouse-server-certificates-volume\" (UniqueName: \"kubernetes.io/secret/3be54537-a4eb-4821-b6b6-83b18151da79-clickhouse-server-certificates-volume\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath\"\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.375856 1904 operation_generator.go:1018] UnmountDevice succeeded for volume "pvc-15457627-f532-41ca-b87b-a961ef7b745c" (UniqueName: "kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a") on node "ip-10-1-1-116.us-west-2.compute.internal"
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.456346 1904 reconciler_common.go:295] "Volume detached for volume \"pvc-15457627-f532-41ca-b87b-a961ef7b745c\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"csi-8dcd123839e58f5246dc005b28f5783c0c42a4f2d3fc6411bdcae42d51fb5ebd\""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.524049 1904 kubelet.go:2240] "SyncLoop (PLEG): event for pod" pod="ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0" event=&{ID:3be54537-a4eb-4821-b6b6-83b18151da79 Type:ContainerDied Data:1df7f09be93bcfbfcd6e55d5056f31120a16f89c65a1dfa2eec595b01c90472b}
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.524091 1904 scope.go:115] "RemoveContainer" containerID="fabf4f8b93ecdb567b7483574643152cff342d04cfee093e95b95e5c345366b6"
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.529616 1904 scope.go:115] "RemoveContainer" containerID="308e34c6dde2613473579be9a50bb4d9e890ebe63bdf0a9b10ee0eeab35e04ee"
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.559205 1904 kubelet.go:2224] "SyncLoop DELETE" source="api" pods="[ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0]"
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.573266 1904 kubelet.go:2218] "SyncLoop REMOVE" source="api" pods="[ns-yellow-ig-25/c-yellow-ig-25-server-43l8h6r-0]"
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.618419 1904 kubelet_volumes.go:160] "Cleaned up orphaned pod volumes dir" podUID=3be54537-a4eb-4821-b6b6-83b18151da79 path="/var/lib/kubelet/pods/3be54537-a4eb-4821-b6b6-83b18151da79/volumes"
λ \cat /dev/shm/logs.txt | grep -i ebs
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.254134 1904 reconciler_common.go:169] "operationExecutor.UnmountVolume started for volume \"ch-storage-volume\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a\") pod \"3be54537-a4eb-4821-b6b6-83b18151da79\" (UID: \"3be54537-a4eb-4821-b6b6-83b18151da79\") "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.258987 1904 operation_generator.go:904] UnmountVolume.TearDown succeeded for volume "kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a" (OuterVolumeSpecName: "ch-storage-volume") pod "3be54537-a4eb-4821-b6b6-83b18151da79" (UID: "3be54537-a4eb-4821-b6b6-83b18151da79"). InnerVolumeSpecName "pvc-15457627-f532-41ca-b87b-a961ef7b745c". PluginName "kubernetes.io/csi", VolumeGidValue ""
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.355390 1904 reconciler_common.go:288] "operationExecutor.UnmountDevice started for volume \"pvc-15457627-f532-41ca-b87b-a961ef7b745c\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" "
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.375856 1904 operation_generator.go:1018] UnmountDevice succeeded for volume "pvc-15457627-f532-41ca-b87b-a961ef7b745c" (UniqueName: "kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a") on node "ip-10-1-1-116.us-west-2.compute.internal"
Mar 22 16:31:39 ip-10-1-1-116.us-west-2.compute.internal kubelet[1904]: I0322 16:31:39.456346 1904 reconciler_common.go:295] "Volume detached for volume \"pvc-15457627-f532-41ca-b87b-a961ef7b745c\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0a500e7456c1b7b6a\") on node \"ip-10-1-1-116.us-west-2.compute.internal\" DevicePath \"csi-8dcd123839e58f5246dc005b28f5783c0c42a4f2d3fc6411bdcae42d51fb5ebd\""
my cluster is using the EKS addon to set up the ebs driver and the addon is still on v1.28 😂 @ConnorJC3 is it safe if I just delete the addon then reinstall via helm chart?
It should be safe, with one caveat: In the time between uninstalling the addon version and installing the Helm version, you won't be able to perform any volume-related operations (volumes will get stuck attempting to create/delete/attach/detach/etc). Normal operation should resume once the new version is installed.
Make sure that if you have any custom configuration that you wish to keep that you port it over to the Helm version.
Also note that we are currently working on releasing v1.29.0 as an EKS addon, expected to be released in all commercial regions by end of next week (2024-03-29).
@ConnorJC3 @alexandermarston my main concern right now is if i am the only one still facing the issue (because maybe some custom config on my side is still causing it ) or if you are able to reproduce it
Waiting to know from @levanlongktmt 🙏
At least from what i am doing it should be enough to reproduce it by:

* EKS cluster ( i am on 1.26.12 🤔 ) with `karpenter 0.35.2` and `ebs 1.29.0` ( I install through helm chart from this repo )
* Nodepool with nothing special ( i run on graviton but i doubt that has anything to do with it )
* A statefulset with PVC across multiple nodes in the nodepool
* Update an annotation on the nodepool like:
  ➜ kubectl get nodepool testpool -o yaml | yq '.spec.template.metadata.annotations'
  epoch-restart: "3"
* Watch for the events
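For clarity, the drift trigger in the last step is just any change to the NodePool pod template; for example a fragment like this (the annotation name is arbitrary, bumping its value is what causes the drift):

```yaml
# NodePool fragment: changing this annotation's value makes existing nodes
# drift from the template, so Karpenter replaces them
spec:
  template:
    metadata:
      annotations:
        epoch-restart: "3"
```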
Just to make sure - have you enabled graceful termination for the kubelet on your nodes? Assuming you are using the EKS optimized AMIs that setting isn't enabled by default (EKS is working on it here: https://github.com/awslabs/amazon-eks-ami/pull/1544), we have instructions in our FAQ to configure the EKS AMI to do so (see step 2 in this section: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/docs/faq.md#what-steps-can-be-taken-to-mitigate-this-issue)
Just to make sure - have you enabled graceful termination for the kubelet on your nodes?
Glad you made sure, because no, I did not know about that :pray: I will try it first thing on Monday and report back, but I would say that explains it.
thanks for your patience !
I decided to just keep using the add-on so I don't need to check and update helm manually; Mar 29 is fine with me. I will test again and post the results when the new version of the add-on is released (our cluster is in eu-central-1).
@ConnorJC3 @alexandermarston following up from last week, I enabled Graceful Shutdown this morning and did some more testing.
TLDR: It seems to me that, at least in my case, by the time the prestop hook runs inside the ebs-csi-node pod, the node has already been removed from the etcd store, and because of that the prestop hook fails with a node not found error.
This makes sense since the shutdown manager is triggered by the node being shut down on the cloud side, and the node object is removed right after that. I can see the node vanish from _kubectl get node_ as soon as the shutdown is triggered.
Test setup:
* Trigger a node replacement using karpenter drift management by updating an annotation on the nodepool spec.template.metadata
* The kubelet logs are fetched through AWS Session Manager; they stream until the instance is actually terminated
* The node has graceful shutdown enabled as described in the FAQ
[root@ip-10-1-7-208 /]# journalctl -u kubelet | grep -i shutdown
Mar 25 11:13:51 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:13:51.507578 1754 nodeshutdown_manager_linux.go:137] "Creating node shutdown manager" shutdownGracePeriodRequested="45s" shutdownGracePeriodCriticalPods="15s" shutdownGracePeriodByPodPriority=[{Priority:0 ShutdownGracePeriodSeconds:30} {Priority:2000000000 ShutdownGracePeriodSeconds:15}]
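Those logged values correspond to a kubelet configuration along these lines (a sketch of just the relevant fields; see the FAQ linked earlier for the full setup steps):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# 45s total shutdown grace, with the final 15s reserved for critical pods,
# matching shutdownGracePeriodRequested="45s" and
# shutdownGracePeriodCriticalPods="15s" in the log line above
shutdownGracePeriod: "45s"
shutdownGracePeriodCriticalPods: "15s"
```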
1. When a node is replaced by karpenter drift management, the pod is evicted successfully, and the volume is unmounted and detached
0s Normal Evicted pod/c-yellow-ig-25-server-pvcv2mf-0 Evicted pod
Mar 25 11:24:39 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:39.817968 1754 kuberuntime_container.go:709] "Killing container with a grace period" pod="ns-yellow-ig-25/c-yellow-ig-25-server-pvcv2mf-0" podUID=b75386ed-40e0-4614-a552-6ddeb8b195da containerName="c-yellow-ig-25-server" containerID="containerd://09a8f4198773fc345c35bb035003ef266bdd6bb89b2ae4e2354951423cf498c6" gracePeriod=3600
Mar 25 11:24:40 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:40.933344 1754 generic.go:332] "Generic (PLEG): container finished" podID=b75386ed-40e0-4614-a552-6ddeb8b195da containerID="a5209c778ba56bba7c9770e97e7b9a763f634481e9cca0bc6f4b183c11e00dc2" exitCode=2
Mar 25 11:24:40 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:40.933371 1754 generic.go:332] "Generic (PLEG): container finished" podID=b75386ed-40e0-4614-a552-6ddeb8b195da containerID="09a8f4198773fc345c35bb035003ef266bdd6bb89b2ae4e2354951423cf498c6" exitCode=0
Mar 25 11:24:40 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:40.933452 1754 scope.go:115] "RemoveContainer" containerID="09a8f4198773fc345c35bb035003ef266bdd6bb89b2ae4e2354951423cf498c6"
Mar 25 11:24:40 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:40.939224 1754 scope.go:115] "RemoveContainer" containerID="a5209c778ba56bba7c9770e97e7b9a763f634481e9cca0bc6f4b183c11e00dc2"
Mar 25 11:24:40 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:40.944934 1754 scope.go:115] "RemoveContainer" containerID="09a8f4198773fc345c35bb035003ef266bdd6bb89b2ae4e2354951423cf498c6"
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.086647 1754 reconciler_common.go:295] "Volume detached for volume \"clickhouse-server-config-d-volume\" (UniqueName: \"kubernetes.io/configmap/b75386ed-40e0-4614-a552-6ddeb8b195da-clickhouse-server-config-d-volume\") on node \"ip-10-1-7-208.us-west-2.compute.internal\" DevicePath \"\""
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.115622 1754 operation_generator.go:1018] UnmountDevice succeeded for volume "pvc-9b50cf31-5a24-4783-8039-a362ad2a7c0d" (UniqueName: "kubernetes.io/csi/ebs.csi.aws.com^vol-09d0cbc269a5e3c66") on node "ip-10-1-7-208.us-west-2.compute.internal"
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.186926 1754 reconciler_common.go:295] "Volume detached for volume \"pvc-9b50cf31-5a24-4783-8039-a362ad2a7c0d\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-09d0cbc269a5e3c66\") on node \"ip-10-1-7-208.us-west-2.compute.internal\" DevicePath \"csi-ab4d8239111079de83d0e307498a5076d263aa3c8e12e1272bc34ab67a424568\""
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.253838 1754 kubelet.go:2224] "SyncLoop DELETE" source="api" pods="[ns-yellow-ig-25/c-yellow-ig-25-server-pvcv2mf-0]"
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.261334 1754 kubelet.go:2218] "SyncLoop REMOVE" source="api" pods="[ns-yellow-ig-25/c-yellow-ig-25-server-pvcv2mf-0]"
2. The node shutdown manager takes over, I assume through karpenter deleting the cloud instance. The node disappears from kubectl get node at the same time the shutdown event is detected
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.967523 1754 nodeshutdown_manager_linux.go:265] "Shutdown manager detected new shutdown event, isNodeShuttingDownNow" event=true
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.967543 1754 nodeshutdown_manager_linux.go:273] "Shutdown manager detected new shutdown event" event="shutdown"
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:41.967567 1754 nodeshutdown_manager_linux.go:322] "Shutdown manager processing shutdown event"
3. Here the kubelet is already failing to get the node from the apiserver
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:41.968804 1754 kubelet_node_status.go:545] "Error updating node status, will retry" err="error getting node \"ip-10-1-7-208.us-west-2.compute.internal\": nodes \"ip-10-1-7-208.us-west-2.compute.internal\" not found"
Mar 25 11:24:41 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:41.971332 1754 kubelet_node_status.go:545] "Error updating node status, will retry" err="error getting node \"ip-10-1-7-208.us-west-2.compute.internal\": nodes \"ip-10-1-7-208.us-west-2.compute.internal\" not found"
4. All the daemonset pods start to get terminated, including the ebs-csi-node pod
Mar 25 11:24:42 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:42.973727 1754 nodeshutdown_manager_linux.go:375] "Shutdown manager killing pod with gracePeriod" pod="kube-system/ebs-csi-node-rl95f" gracePeriod=15
Mar 25 11:24:42 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:42.973742 1754 nodeshutdown_manager_linux.go:375] "Shutdown manager killing pod with gracePeriod" pod="kube-system/cilium-6cd2t" gracePeriod=1
Mar 25 11:24:42 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:42.974764 1754 kuberuntime_container.go:709] "Killing container with a grace period" pod="kube-system/ebs-csi-node-rl95f" podUID=0ad30749-c716-4c7d-846a-a786b08d13b7 containerName="node-driver-registrar" containerID="containerd://58bd98c6de45d5774f10967525996bf73f4932e6ad48b60569bde824c5a8d31a" gracePeriod=15
### Cilium is killed
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:43.007024 1754 kuberuntime_container.go:709] "Killing container with a grace period" pod="kube-system/cilium-6cd2t" podUID=cb194ca6-d9d3-48b0-9862-2d3cdaadd1f7 containerName="cilium-agent" containerID="containerd://a13f2b2f4ca37ebfa60119534aefd6fc4ad4094c4c38b05fb51fa3788d7f0ae7" gracePeriod=1
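The gracePeriod=15 applied to ebs-csi-node above follows from the per-priority split the shutdown manager logged at startup ({Priority:0 ShutdownGracePeriodSeconds:30} {Priority:2000000000 ShutdownGracePeriodSeconds:15}). A minimal sketch of that lookup (illustrative, not the kubelet's actual code):

```python
# A pod gets the grace period of the highest priority threshold it meets;
# values match the shutdown manager config logged at kubelet startup.
SHUTDOWN_GRACE_BY_PRIORITY = [
    (0, 30),              # ordinary pods: 30s
    (2_000_000_000, 15),  # system-critical pods (e.g. ebs-csi-node): 15s
]

def shutdown_grace_period(pod_priority):
    grace = 0
    for threshold, seconds in sorted(SHUTDOWN_GRACE_BY_PRIORITY):
        if pod_priority >= threshold:
            grace = seconds
    return grace
```

Because ebs-csi-node runs at system-node-critical priority, it falls into the 15s bucket, which is the gracePeriod=15 in the kill log above.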
5. The EBS prestop hook fails; this is because by the time the prestop hook runs, the node object has already been removed from the etcd store, and the fetchNode call fails.
Increasing the cilium cni terminationGracePeriod did not help; it defaults to 1, but it makes sense that on termination it does not flush the eBPF configuration, or it would break nodes during cilium upgrades and such.
Also, the error is about node xxxx not found, not about problems connecting to the apiserver.
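The failure mode is easy to model: the hook fetches the Node object first, so once the node has been deleted from the API server the hook exits with an error before it ever waits for volume cleanup. A simplified sketch (not the driver's actual code):

```python
class NodeNotFound(Exception):
    """Stands in for the Kubernetes API 404 on GET /nodes/{name}."""

def pre_stop_hook(fetch_node, node_name, wait_for_volume_cleanup):
    # Mirrors the observed order of operations: the node is fetched first, so
    # a deleted Node object fails the whole hook before any cleanup wait.
    try:
        node = fetch_node(node_name)
    except NodeNotFound as e:
        raise RuntimeError(f"fetchNode: failed to retrieve node information: {e}")
    wait_for_volume_cleanup(node)
```

With the node object already gone (as in the logs below), the hook raises immediately, which is the `exited with 1` the kubelet reports.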
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:43.038607 1754 handlers.go:78] "Exec lifecycle hook for Container in Pod failed" err=<
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: command '/bin/aws-ebs-csi-driver pre-stop-hook' exited with 1: I0325 11:24:43.020810 29 prestop.go:32] "PreStop: executing PreStop lifecycle hook"
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:43.035020 29 options.go:89] "failed to execute PreStop lifecycle hook" err="fetchNode: failed to retrieve node information: nodes \"ip-10-1-7-208.us-west-2.compute.internal\" not found"
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: > execCommand=[/bin/aws-ebs-csi-driver pre-stop-hook] containerName="ebs-plugin" pod="kube-system/ebs-csi-node-rl95f" message=<
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: I0325 11:24:43.020810 29 prestop.go:32] "PreStop: executing PreStop lifecycle hook"
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:43.035020 29 options.go:89] "failed to execute PreStop lifecycle hook" err="fetchNode: failed to retrieve node information: nodes \"ip-10-1-7-208.us-west-2.compute.internal\" not found"
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:43.038642 1754 kuberuntime_container.go:596] "PreStop hook failed" err=<
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: command '/bin/aws-ebs-csi-driver pre-stop-hook' exited with 1: I0325 11:24:43.020810 29 prestop.go:32] "PreStop: executing PreStop lifecycle hook"
Mar 25 11:24:43 ip-10-1-7-208.us-west-2.compute.internal kubelet[1754]: E0325 11:24:43.035020 29 options.go:89] "failed to execute PreStop lifecycle hook" err="fetchNode: failed to retrieve node information: nodes \"ip-10-1-7-208.us-west-2.compute.internal\" not found"
And I get the multi-attach error with the 6m timeout:
0s Warning FailedAttachVolume pod/c-yellow-ig-25-server-pvcv2mf-0 Multi-Attach error for volume "pvc-9b50cf31-5a24-4783-8039-a362ad2a7c0d" Volume is already exclusively attached to one node and can't be attached to another
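The ~6 minute delay matches the attach/detach controller's safety valve: when it cannot confirm the volume was unmounted (here, because the node object is gone), it only force-detaches after a fixed maxWaitForUnmountDuration of 6 minutes upstream. A sketch of that decision:

```python
MAX_WAIT_FOR_UNMOUNT_SECONDS = 6 * 60  # upstream maxWaitForUnmountDuration

def may_detach(seconds_since_detach_requested, unmount_confirmed):
    # A confirmed clean unmount allows immediate detach; otherwise the
    # controller waits out the full 6-minute timeout before forcing it.
    return (unmount_confirmed
            or seconds_since_detach_requested >= MAX_WAIT_FOR_UNMOUNT_SECONDS)
```

This is why a successful prestop hook (which confirms cleanup) avoids the delay, while an ungraceful shutdown eats the full 6 minutes.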
Update: For reference, I matched the logs from the kubelet (another node replacement event) against AWS Logs Insights and I can see the same story.
By the time the ebs-csi-node pod termination starts through the kubelet shutdown manager, the node is already gone from etcd, and so the prestop hook fails.
It seems to me that what should be happening is that these evictions, including the system-critical daemonsets, should trigger the ebs-csi-node prestop hook. At least in my tests I never see an eviction of any system-critical daemonset pod in any log, not in the kubelet, not in CloudWatch.
- Disruption starts on ip-10-1-9-107.us-west-2.compute.internal: karpenter.sh/disruption=disrupting
- ip-10-1-11-3.us-west-2.compute.internal is created and available (nodeclaim mixed-memory-ssd-arm64-wjkmf)
- Label node.kubernetes.io/exclude-from-external-load-balancers=karpenter is added to node ip-10-1-9-107.us-west-2.compute.internal
- Eviction for pod c-yellow-ig-25-server-hbzgsvd-0 (nodeclaim mixed-memory-ssd-arm64-wjkmf)
- The nodeclaim keeps getting a delete every second, as part of the finalizer?
- system:node:ip-10-1-9-107.us-west-2.compute.internal deletes pod c-yellow-ig-25-server-hbzgsvd-0
- c-yellow-ig-25-server-hbzgsvd-0 is scheduled on target ip-10-1-11-3.us-west-2.compute.internal (nodeclaim mixed-memory-ssd-arm64-wjkmf)
- Successful: system:serviceaccount:kube-system:attachdetach-controller patches node ip-10-1-9-107 with {"status":{"volumesAttached":null}}
- Last event: system:serviceaccount:kube-system:generic-garbage-collector deletes csinodes and ciliumnodes for ip-10-1-9-107, and gets a failure back: node not found
- Attempt to remove the finalizer from the node again?
- Kubelet: "Shutdown manager detected new shutdown event, isNodeShuttingDownNow" event=true
- Kubelet: "Error updating node status, will retry" err="error getting node \"ip-10-1-9-107.us-west-2.compute.internal\": nodes \"ip-10-1-9-107.us-west-2.compute.internal\" not found", then err="update node status exceeds retry count"
- Kubelet: "Shutdown manager killing pod with gracePeriod" pod="kube-system/ebs-csi-node-7bk9k" gracePeriod=15
- system:serviceaccount:kube-system:ebs-csi-node-sa gets Node ip-10-1-9-107: Failure, nodes "ip-10-1-9-107.us-west-2.compute.internal" not found
- "PreStop: executing PreStop lifecycle hook"
- "Exec lifecycle hook for Container in Pod failed" err=<command '/bin/aws-ebs-csi-driver pre-stop-hook' exited with 1>, err="fetchNode: failed to retrieve node information: nodes \"ip-10-1-9-107.us-west-2.compute.internal\" not found"
- "Shutdown manager pod killing time out" gracePeriod=15 priority=2000000000
- "Shutdown manager completed processing shutdown event, node will shutdown shortly"
@levanlongktmt Honestly, at this point I will just wait and see what your experience is when you get to upgrade the addon.
@primeroz just some information: I didn't use kubectl delete node to kill a node. All my nodes are spot instances, so I kill a node by going to the spot request and choosing "initiate interruption" to simulate AWS killing the EC2 instance.
@levanlongktmt thanks. I am not using kubectl delete node either anymore.
I am changing the specs of node pools to let Karpenter replace nodes via the drift manager, similar to what an AMI upgrade would do.
Even though it seems to me that deleting a node or a nodeclaim goes through the same codepath in Karpenter, I might try to do what you are doing just to check if there is any difference in behaviour.
To try to exclude the many variables my dev environment might be injecting, I built a simple POC: https://github.com/primeroz/eks-karpenter-ebs-poc
This is a barebones EKS cluster (1.26, to match my current version) with the EBS CSI driver and Karpenter, and nothing more.
2 simple statefulsets, same Multi-Attach error.
I am planning to test more tomorrow, maybe upgrade EKS cluster versions, try the spot instance use case, and so on.
Using the POC above I did a full run of tests on EKS 1.26 to 1.29, and all of them fail with the same problem: https://github.com/primeroz/eks-karpenter-ebs-poc/blob/main/RESULTS.md
All types of Karpenter node replacement failed with the Multi-Attach error and the 6-minute wait.
@ConnorJC3 @alexandermarston This should be easy to reproduce with the code above, which only needs terraform and helm. Happy to do any further tests if there is anything else you think I should do, or any other feature I should enable in order to make this work.
:pray:
Some update from my side: I uninstalled the addon and used helm to install v1.29, and still face the Multi-Attach error. Karpenter version is v0.35.2.
Here are the NodeClass and NodePool:
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: experiment
spec:
  amiFamily: AL2023
  role: "xxxxxx"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-cluster-name"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-cluster-name"
  userData: |
    #!/bin/bash
    echo -e "InhibitDelayMaxSec=45\n" >> /etc/systemd/logind.conf
    systemctl restart systemd-logind
    echo "$(jq ".shutdownGracePeriod=\"45s\"" /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
    echo "$(jq ".shutdownGracePeriodCriticalPods=\"15s\"" /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
    systemctl restart kubelet
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: elasticsearch
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        name: experiment
  limits:
    cpu: 20
    memory: 100Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h
I also just triggered a spot instance interruption and got the Multi-Attach error, randomly.
userData: |
  #!/bin/bash
  echo -e "InhibitDelayMaxSec=45\n" >> /etc/systemd/logind.conf
  systemctl restart systemd-logind
  echo "$(jq ".shutdownGracePeriod=\"45s\"" /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
  echo "$(jq ".shutdownGracePeriodCriticalPods=\"15s\"" /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
  systemctl restart kubelet
You don't really need the restart of kubelet there, since this userData runs before the eks-bootstrap.sh script that configures other bits of kubelet and then starts it (I actually don't think kubelet is even started at that point).
Having the same issue with EKS 1.28, EBS CSI Driver 1.29.0 and Karpenter 0.35.2.
For what it's worth, I am getting more and more convinced this is a Karpenter problem that needs to be addressed there, in particular the logic where a node gets removed as soon as the delete signal is sent from Karpenter; this is something the Karpenter dev team is thinking of changing.
I have the same issue as well with EKS 1.27, EBS CSI Driver 1.29.1 and Karpenter 0.32.7.
Another way this could potentially be fixed would be for the EBS driver's prestop hook to check the error it receives when it tries to get the node. Currently, if it gets any error when trying to get the node, it just exits. It should check whether the error is that the node doesn't exist and, in that case, proceed with the volume attachment cleanup.
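As a rough sketch of that suggestion (Go, with a stand-in fetchNode and a plain sentinel error instead of the real Kubernetes API error; in the actual driver the check would presumably use apierrors.IsNotFound from k8s.io/apimachinery), the idea is to treat "node not found" as "node already deleted, proceed with cleanup" rather than as a fatal error:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound mimics a Kubernetes NotFound API error. In the real driver this
// would be detected with apierrors.IsNotFound(err), not a sentinel.
var errNotFound = errors.New(`nodes "ip-10-1-9-107" not found`)

// fetchNode is a placeholder for the driver's node lookup; here it simulates
// the node object having already been deleted by Karpenter.
func fetchNode(name string) error {
	return errNotFound
}

// preStop decides whether the hook should continue with volume cleanup.
// A NotFound error means the Node API object is gone but the kubelet is still
// shutting down, so cleanup should proceed instead of exiting with an error.
func preStop(nodeName string) (proceed bool, err error) {
	if err := fetchNode(nodeName); err != nil {
		if errors.Is(err, errNotFound) {
			return true, nil // node already deleted: still run cleanup
		}
		return false, fmt.Errorf("fetchNode: %w", err)
	}
	return true, nil
}

func main() {
	proceed, err := preStop("ip-10-1-9-107")
	fmt.Println(proceed, err)
}
```

This is only an illustration of the proposed behaviour, not the driver's actual code; whether cleanup can safely run without the Node object (e.g. to discover attached volumes) is exactly what the maintainers would need to decide.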
Update here: the change merged in https://github.com/kubernetes-sigs/karpenter/pull/1164 makes Karpenter wait for the instance termination to fully complete before it removes the node. This should ideally eliminate this issue entirely. We're targeting v0.37.0 for release of this change (a couple of versions out, hopefully in the next few weeks or so).
Linking the relevant Karpenter issue for this one: https://github.com/kubernetes-sigs/karpenter/issues/944. Feel free to add some comments/discussion over there as well.
/kind bug
What happened? As discussed in https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1665, @torredil said it was fixed in v1.27 (https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1665#issuecomment-1898607940), but we still have the problem with v1.28.
What you expected to happen?
How to reproduce it (as minimally and precisely as possible)?
Multi-Attach error for volume "pvc-xxxxx-xxxxx-xxx" Volume is already exclusively attached to one node and can't be attached to another
ebs-csi-controller
Client Version: v1.29.1 Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3 Server Version: v1.29.0-eks-c417bb3