kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0

panic: runtime error: invalid memory address or nil pointer dereference #3888

Open pauldtill opened 5 days ago

pauldtill commented 5 days ago

Describe the bug
Pods are in a crash loop with the following error after upgrading to Helm chart version 1.9.0 (also reproduced with 1.9.1):

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7fc9c82df093]

goroutine 109 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.2/pkg/internal/controller/controller.go:111 +0x1e5
panic({0x7fc9c8be88c0?, 0x7fc9ca4132d0?})
    /usr/local/go/src/runtime/panic.go:770 +0x132
sigs.k8s.io/aws-load-balancer-controller/pkg/deploy/shield.(*defaultProtectionManager).GetProtection(0xc000510eb0, {0x7fc9c9059038, 0xc000d1fe30}, {0xc0019a3ea0, 0x67})
    /workspace/pkg/deploy/shield/protection_manager.go:128 +0x173
sigs.k8s.io/aws-load-balancer-controller/pkg/deploy/shield.(*protectionSynthesizer).synthesizeProtectionsOnLB(0xc001045b00, {0x7fc9c9059038, 0xc000d1fe30}, {0xc0019a3ea0, 0x67}, {0xc00182ebd8?, 0x0?, 0x0?})
    /workspace/pkg/deploy/shield/protection_synthesizer.go:63 +0x8e
sigs.k8s.io/aws-load-balancer-controller/pkg/deploy/shield.(*protectionSynthesizer).Synthesize(0xc001045b00, {0x7fc9c9059038, 0xc000d1fe30})
    /workspace/pkg/deploy/shield/protection_synthesizer.go:46 +0x18d
sigs.k8s.io/aws-load-balancer-controller/pkg/deploy.(*defaultStackDeployer).Deploy(0xc0003f98c0, {0x7fc9c9059038, 0xc000d1fe30}, {0x7fc9c905b400, 0xc00155f080})
    /workspace/pkg/deploy/stack_deployer.go:113 +0x110c
sigs.k8s.io/aws-load-balancer-controller/controllers/ingress.(*groupReconciler).buildAndDeployModel(0xc0001a6240, {0x7fc9c9059038, 0xc000d1fe30}, {{{0xc003ab4680, 0x1b}, {0xc003a9e670, 0xb}}, {0xc000e83a28, 0x1, 0x1}, ...})
    /workspace/controllers/ingress/group_controller.go:172 +0x2ef
sigs.k8s.io/aws-load-balancer-controller/controllers/ingress.(*groupReconciler).reconcile(0xc0001a6240, {0x7fc9c9059038, 0xc000d1fe30}, {{{0xc003ab4680?, 0xc001c07ce0?}, {0xc003a9e670?, 0x0?}}})
    /workspace/controllers/ingress/group_controller.go:132 +0x270
sigs.k8s.io/aws-load-balancer-controller/controllers/ingress.(*groupReconciler).Reconcile(0xc0001a6240, {0x7fc9c9059038?, 0xc000d1fe30?}, {{{0xc003ab4680?, 0x0?}, {0xc003a9e670?, 0xc001c07d10?}}})
    /workspace/controllers/ingress/group_controller.go:118 +0x2c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x7fc9c905d400?, {0x7fc9c9059038?, 0xc000d1fe30?}, {{{0xc003ab4680?, 0xb?}, {0xc003a9e670?, 0x0?}}})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.2/pkg/internal/controller/controller.go:114 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001cf550, {0x7fc9c9059070, 0xc000510be0}, {0x7fc9c8d80220, 0xc00059a480})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.2/pkg/internal/controller/controller.go:311 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001cf550, {0x7fc9c9059070, 0xc000510be0})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.2/pkg/internal/controller/controller.go:261 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.2/pkg/internal/controller/controller.go:222 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 176
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.2/pkg/internal/controller/controller.go:218 +0x486
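
For anyone triaging from the trace alone: the topmost non-runtime frame is shield.(*defaultProtectionManager).GetProtection (protection_manager.go:128), so the crash is a nil dereference inside the Shield protection manager during reconciliation. Below is a minimal, self-contained Go sketch of that general failure mode, assuming (hypothetically) that a dependency field is left nil during construction; every type and field name here is an illustrative stand-in, not the controller's actual code:

```go
package main

import "fmt"

// shieldClient is a hypothetical stand-in for an AWS Shield API client.
type shieldClient struct{ region string }

// DescribeProtection reads a field from its receiver, so calling it
// through a nil *shieldClient dereferences nil and raises the same
// "invalid memory address or nil pointer dereference" SIGSEGV.
func (c *shieldClient) DescribeProtection(arn string) string {
	return c.region + ":" + arn
}

// protectionManager mirrors the shape of a manager whose dependency
// may be skipped during wiring (hypothetical, for illustration only).
type protectionManager struct {
	client *shieldClient // stays nil unless explicitly initialized
}

func (m *protectionManager) GetProtection(arn string) string {
	return m.client.DescribeProtection(arn)
}

func main() {
	m := &protectionManager{} // client never wired up
	fmt.Println(m.GetProtection("arn:aws:elasticloadbalancing:..."))
	// panic: runtime error: invalid memory address or nil pointer dereference
}
```

Since the panic escapes the reconcile goroutine, the process exits and the pod restarts, which matches the crash loop described above.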

Steps to reproduce

EKS version: 1.30
ALB ingress controller Helm chart: 1.9.0 (and 1.9.1)

Expected outcome
The pods should start and remain stable. The need to create a new load balancer appears to be what triggers the panic.

Environment

Additional Context: I cannot provide the pod log at the moment, as it contains some information we shouldn't share, but I can provide extracts from it as needed.

I should also note that Helm chart version 1.8.4 works without issue in the same clusters.

yocean-tseng commented 5 days ago

@pauldtill Have you checked the release notes for the Helm chart upgrade from 1.8.4 to 1.9.0?

pauldtill commented 5 days ago

@yocean-tseng Ouch, sorry, and thanks for pointing that out. Complacency on my part: no previous upgrade ever required manual steps, so I never thought to check!

pauldtill commented 5 days ago

OK, I was a little hasty (again), @yocean-tseng. We have updated to 1.9.1 and applied the CRD update from the release notes, but the pods are still failing with the same panic.

There were a couple of warnings from the kubectl apply command, but they seemed "safe" to me:

kubectl apply -k github.com/aws/eks-charts/stable/aws-load-balancer-controller/crds?ref=master
Warning: resource customresourcedefinitions/ingressclassparams.elbv2.k8s.aws is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
customresourcedefinition.apiextensions.k8s.io/ingressclassparams.elbv2.k8s.aws configured
Warning: resource customresourcedefinitions/targetgroupbindings.elbv2.k8s.aws is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
customresourcedefinition.apiextensions.k8s.io/targetgroupbindings.elbv2.k8s.aws configured

shraddhabang commented 2 days ago

Hey @pauldtill, thanks for bringing this to our attention. I have raised a PR for this issue, and we will release a fixed image soon.
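
For anyone watching this issue: a fix for this class of panic generally either initializes the missing dependency or guards the call site, so reconciliation fails with an error (and is retried) instead of crashing the process. Below is a purely illustrative sketch of such a guard, reusing the same hypothetical types as the sketch above, not a quote of the actual PR:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins, as in the earlier sketch.
type shieldClient struct{ region string }

func (c *shieldClient) DescribeProtection(arn string) string {
	return c.region + ":" + arn
}

type protectionManager struct {
	client *shieldClient
}

// GetProtection with a guard: an uninitialized client surfaces as a
// reconcile error instead of a nil dereference that kills the pod.
func (m *protectionManager) GetProtection(arn string) (string, error) {
	if m.client == nil {
		return "", errors.New("shield protection manager: client not initialized")
	}
	return m.client.DescribeProtection(arn), nil
}

func main() {
	m := &protectionManager{} // client never wired up
	if _, err := m.GetProtection("arn:aws:..."); err != nil {
		fmt.Println("handled:", err) // error is reported, no panic
	}
}
```

With a guard like this, the reconcile loop records the error and retries with backoff rather than crash-looping the whole controller pod.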