aws-quickstart / cdk-eks-blueprints

AWS Quick Start Team
Apache License 2.0
460 stars 206 forks source link

[Karpenter] Make sure to preserve the SA during chart deletion #1012

Closed muckelba closed 5 months ago

muckelba commented 5 months ago

Problem

Karpenter fails to shut down its nodes properly during cluster deletion due to a permission error caused by the premature deletion of the service account:

panic: failed to setup manager: failed to determine if *v1.Lease is namespaced: failed to get restmapping: failed to get server groups: the server has asked for the client to provide credentials

goroutine 1 [running]:
github.com/samber/lo.must({0x26921c0, 0xc0009765a0}, {0xc000b17a18, 0x1, 0x1})
    github.com/samber/lo@v1.39.0/errors.go:51 +0x1bb
github.com/samber/lo.Must[...](...)
    github.com/samber/lo@v1.39.0/errors.go:65
sigs.k8s.io/karpenter/pkg/operator.NewOperator()
    sigs.k8s.io/karpenter@v0.36.0/pkg/operator/operator.go:183 +0x1418
main.main()
    github.com/aws/karpenter-provider-aws/cmd/controller/main.go:33 +0x25

Solution:

Updated the dependency statement to ensure the correct deletion order:

This prevents permission errors by ensuring the service account is deleted after Karpenter components.

Testing

I wasn't able to properly test this because everytime i npm linked my local version of this repo, i got this error:

Error: There is already a Construct with name 'HasEcrPublic' in Cluster [redacted]

The stacktrace points to the first new eks.KubernetesManifest() call in my code. Is using npm link the recommended method to test local changes?

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.