pulumi / pulumi-kubernetes-operator

A Kubernetes Operator that automates the deployment of Pulumi Stacks
Apache License 2.0

Panic in reconcile loop and "Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: the server could not find the requested resource" #344

Closed lukehoban closed 1 year ago

lukehoban commented 1 year ago

I was running the example in https://github.com/pulumi/pulumi-kubernetes-operator/pull/339, and when I updated the Stack resource to point to a new stack name (since I had used an invalid one initially), I saw this in the logs.

E1019 04:36:47.002143       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: the server could not find the requested resource

Nothing else happened though, so I destroyed the app stack (the pulumi.com/v1/stack, Secret and source.toolkit.fluxcd.io/v1beta2/GitRepository resources). That led to this:

E1019 04:37:19.908322       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: the server could not find the requested resource
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x1659330]

goroutine 1098 [running]:
github.com/pulumi/pulumi-kubernetes-operator/pkg/controller/stack.(*ReconcileStack).Reconcile(0xc0009324b0, {0x1d8f7a0, 0xc0004c6de0}, {{{0xc0006b36b0?, 0x198d120?}, {0xc0006b36a0?, 0xc0006d9840?}}})
    /home/runner/work/pulumi-kubernetes-operator/pulumi-kubernetes-operator/pkg/controller/stack/stack_controller.go:462 +0x2770
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000950000, {0x1d8f6f8, 0xc000895800}, {0x18eaca0?, 0xc00055c3e0?})
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.0/pkg/internal/controller/controller.go:298 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000950000, {0x1d8f6f8, 0xc000895800})
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.0/pkg/internal/controller/controller.go:253 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.0/pkg/internal/controller/controller.go:214 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.0/pkg/internal/controller/controller.go:210 +0x307

squaremo commented 1 year ago

/home/runner/work/pulumi-kubernetes-operator/pulumi-kubernetes-operator/pkg/controller/stack/stack_controller.go:462

This line comes after the point where the reconciler should have bailed out for not finding the resource; no wonder it's an NPE. I'll start with a test case to reproduce.

squaremo commented 1 year ago

Thankfully, referring to an unknown group/kind doesn't cause an NPE (I just added a test case for that).

This is the line referred to in the crash log:

if instance.Status.LastUpdate != nil {
   if instance.Status.LastUpdate.LastSuccessfulCommit == currentCommit && !stack.ContinueResyncOnCommitMatch

~I'm struggling to see how that line leads to an NPE given the condition immediately before it. stack is a shared.StackSpec rather than a pointer. (L462 in the branch with the example is a closing brace).~

Ah! The field ContinueResyncOnCommitMatch is from the embedded *GitSource, and that is nil if none of its fields are set.
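
To illustrate the mechanism, here is a minimal sketch of how a field promoted from an embedded pointer can panic when that pointer is nil. The `GitSource` and `StackSpec` types below are trimmed stand-ins rather than the operator's real definitions, and `readDirect` and `continueResync` are hypothetical helpers. The nil guard is only one possible way to avoid the panic; this thread does not show what #346 actually changed.

```go
package main

import "fmt"

// GitSource is a trimmed stand-in for the embedded struct named above.
type GitSource struct {
	ContinueResyncOnCommitMatch bool
}

// StackSpec is a trimmed stand-in. Embedding *GitSource promotes its fields,
// so spec.ContinueResyncOnCommitMatch compiles even when the pointer is nil.
type StackSpec struct {
	*GitSource
	Stack string
}

// readDirect (hypothetical) reads the promoted field with no guard and
// reports whether doing so panicked.
func readDirect(s StackSpec) (val bool, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	// Shorthand for s.GitSource.ContinueResyncOnCommitMatch, which
	// dereferences the embedded pointer.
	val = s.ContinueResyncOnCommitMatch
	return val, false
}

// continueResync (hypothetical) checks the embedded pointer before reading.
func continueResync(s StackSpec) bool {
	return s.GitSource != nil && s.ContinueResyncOnCommitMatch
}

func main() {
	// No Git fields are set, so the embedded *GitSource stays nil.
	spec := StackSpec{Stack: "dev"}

	_, panicked := readDirect(spec)
	fmt.Println("direct read panicked:", panicked)
	fmt.Println("guarded read:", continueResync(spec))
}
```

The compiler accepts the unguarded read because field promotion is resolved statically; the nil dereference only surfaces at run time, which would explain why the `LastUpdate != nil` check on the line before gives no protection.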

squaremo commented 1 year ago

Fixed by #346.

pulumi-bot commented 1 year ago

Cannot close issue without required labels: resolution/