Closed Mieszko96 closed 3 days ago
I ran into this as well. It looks like the EBS driver may not be encountering this because of some delays added: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/1949.
Specifically, I see eks:node-manager tainting the node after the driver has started:
I0326 02:14:25.282728 1 round_trippers.go:463] GET https://172.20.0.1:443/api/v1/nodes/ip-10-0-39-238.us-west-2.compute.internal
I0326 02:14:25.282736 1 round_trippers.go:469] Request Headers:
I0326 02:14:25.282742 1 round_trippers.go:473] Accept: application/json, */*
I0326 02:14:25.282747 1 round_trippers.go:473] User-Agent: aws-efs-csi-driver/v0.0.0 (linux/amd64) kubernetes/$Format
I0326 02:14:25.282753 1 round_trippers.go:473] Authorization: Bearer <masked>
I0326 02:14:25.298894 1 round_trippers.go:574] Response Status: 200 OK in 16 milliseconds
I0326 02:14:25.299687 1 node.go:486] "No taints to remove on node, skipping taint removal"
I0326 02:14:25.299702 1 driver.go:137] Listening for connections on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
I0326 02:14:28.015603 1 node.go:311] NodeGetInfo: called with args
$ jq -r '.[0]["@message"] | .user.username, .objectRef, .requestObject, .requestReceivedTimestamp' logs-insights-results.json
eks:node-manager
{
  "resource": "nodes",
  "name": "ip-10-0-39-238.us-west-2.compute.internal",
  "apiVersion": "v1"
}
{
  "spec": {
    "taints": [
      {
        "effect": "NoExecute",
        "key": "efs.csi.aws.com/agent-not-ready",
        "value": "true"
      },
      {
        "effect": "NoExecute",
        "key": "ebs.csi.aws.com/agent-not-ready",
        "value": "true"
      }
    ]
  }
}
2024-03-26T02:14:31.296272Z
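To make the ordering concrete, here is a small sketch that computes the gap between the driver's "No taints to remove" log line and the audit-log timestamp of the eks:node-manager request. It assumes the klog timestamp is from 2024 (klog omits the year) and that both clocks are in UTC; the helper names are hypothetical and only for illustration.

```python
from datetime import datetime, timedelta


def parse_klog_time(stamp: str, year: int) -> datetime:
    """Parse a klog 'MMDD HH:MM:SS.ffffff' stamp; the year is an assumption."""
    return datetime.strptime(f"{year} {stamp}", "%Y %m%d %H:%M:%S.%f")


def parse_audit_time(stamp: str) -> datetime:
    """Parse the requestReceivedTimestamp value from the audit log entry."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")


# Driver log: "No taints to remove on node, skipping taint removal"
driver_check = parse_klog_time("0326 02:14:25.299687", year=2024)
# Audit log: eks:node-manager request carrying the agent-not-ready taints
taint_request = parse_audit_time("2024-03-26T02:14:31.296272Z")

gap = taint_request - driver_check
print(gap.total_seconds())  # seconds between the driver's check and the taint request
```

Under those assumptions the taint request arrives roughly six seconds after the driver has already decided there was nothing to remove.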
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
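After a period of inactivity, lifecycle/stale is applied
After a period of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After a period of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close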
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
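After a period of inactivity, lifecycle/stale is applied
After a period of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After a period of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close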
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
We released a new version (2.0.6) to GitHub that includes the fix and the related PR. The ECD for the add-ons is 07/31.
/kind bug
What happened? I saw that a new feature was requested in this ticket: https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/1069
My infrastructure uses Karpenter as the autoscaling tool, and we have a problem installing apps that use EFS on a brand-new node. I added a startupTaint to prevent this, but from my point of view it looks like the driver is removing this taint too fast.
I described the steps I used to test this in more detail in the Karpenter ticket: https://github.com/aws/karpenter-provider-aws/issues/5691
However, it seems to be a problem on the EFS side.
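As an illustration of the timing problem, here is a minimal sketch of a removal loop that checks for the startup taint several times over a short window instead of only once at startup. This is not the driver's actual implementation: `get_taints`, `set_taints`, the attempt count, and the interval are all hypothetical stand-ins for whatever reads and patches the node object. Only the taint key is taken from this thread.

```python
import time
from typing import Callable, Dict, List

Taint = Dict[str, str]

# Taint key as it appears in this thread; everything else here is hypothetical.
AGENT_NOT_READY_KEY = "efs.csi.aws.com/agent-not-ready"


def without_taint(taints: List[Taint], key: str) -> List[Taint]:
    """Return a copy of the taint list with every entry matching key removed."""
    return [t for t in taints if t.get("key") != key]


def remove_taint_with_grace(
    get_taints: Callable[[], List[Taint]],
    set_taints: Callable[[List[Taint]], None],
    key: str = AGENT_NOT_READY_KEY,
    attempts: int = 5,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the node's taints several times so a late-applied taint is still removed.

    Returns the number of times the taint was found and removed.
    """
    removals = 0
    for attempt in range(attempts):
        taints = get_taints()
        remaining = without_taint(taints, key)
        if len(remaining) != len(taints):
            # The taint is present now, even if it was absent on an earlier check.
            set_taints(remaining)
            removals += 1
        if attempt < attempts - 1:
            sleep(interval_s)
    return removals
```

A single check at startup misses a taint that is applied a few seconds later; polling over a window (or watching the node object) is one way to cover that gap, at the cost of a few extra API calls.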
What you expected to happen?
How to reproduce it (as minimally and precisely as possible)?
Anything else we need to know?:
Environment
Please also attach debug logs to help us better diagnose