Open johnthompson-ybor opened 4 days ago
Are you sure the controller is on 2.12.4? Not sure how this line can throw a nil pointer exception:
I thought the same thing, but I just confirmed and that's what version I'm on.
Just to clarify, the argocd version command prints two versions: the first is the CLI and the second is the server. We need the second one. Sorry if you already checked that and it's also 2.12.4.
Though maybe we have some memory corruption.
Can you try with v2.13.1, please?
Describe the bug
Argocd version: v2.12.4+27d1e64
If you install a CRD with a conversion webhook on a cluster and that webhook is down, every application on the cluster goes to an Unknown or error state:
Failed to load target state: failed to get cluster version for cluster "": failed to get cluster info for """: error synchronizing cache state : failed to sync cluster ": failed to load initial state of resource BucketServerSideEncryptionConfiguration.s3.aws.upbound.io: conversion webhook for s3.aws.upbound.io/v1beta1, Kind=BucketServerSideEncryptionConfiguration failed: Post "https://provider-aws-s3.crossplane-system.svc:9443/convert?timeout=30s": no endpoints available for service "provider-aws-s3"
If I have SSA (server-side apply) on, the UI just gets stuck in "refreshing" and there's a nil pointer dereference in the logs:
time="2024-11-18T14:19:18Z" level=error msg="Recovered from panic: runtime error: invalid memory address or nil pointer dereference
goroutine 294 [running]: runtime/debug.Stack() /usr/local/go/src/runtime/debug/stack.go:24 +0x5e
github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem.func1() /go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1480 +0x54
panic({0x382cd20?, 0x7756330?}) /usr/local/go/src/runtime/panic.go:770 +0x132
github.com/argoproj/argo-cd/v2/controller.(*appStateManager).CompareAppState(0xc00055cd20, 0xc0dae6a408, 0xc0a7114488, {0xc0a792d6c0, 0x1, 0x1}, {0xc0a7920700, 0x1, 0x1}, 0x0, ...) /go/src/github.com/argoproj/argo-cd/controller/state.go:864 +0x5ff9
github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem(0xc0004dec40) /go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1590 +0x1188
github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run.func3() /go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:830 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?) /go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:226 +0x33
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000636b00, {0x5555d00, 0xc001cec2a0}, 0x1, 0xc000081f80) /go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:227 +0xaf
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000636b00, 0x3b9aca00, 0x0, 0x1, 0xc000081f80) /go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...) /go/pkg/mod/k8s.io/apimachinery@v0.29.6/pkg/util/wait/backoff.go:161
created by github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run in goroutine 112 /go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:829 +0x865
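The first frames of the trace show a deferred function inside processAppRefreshQueueItem calling runtime/debug.Stack() after the panic, which is the usual Go recover-and-log pattern. Below is a minimal sketch of that pattern, not Argo CD's code: appState, compare, and processItem are hypothetical names, and the nil argument merely stands in for whatever was nil at state.go:864, which this thread does not identify.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// appState is a hypothetical stand-in for the value that was nil in the controller.
type appState struct {
	revision string
}

// compare dereferences its argument without a nil check, so a nil input panics.
func compare(s *appState) string {
	return s.revision
}

// processItem mimics a queue worker: a deferred function recovers from any
// panic raised while handling the item and logs it with a stack trace
// instead of letting the whole process crash.
func processItem(s *appState) (result string, recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
			fmt.Printf("Recovered from panic: %v\n%s\n", r, debug.Stack())
		}
	}()
	return compare(s), false
}

func main() {
	_, recovered := processItem(nil)
	fmt.Println("recovered:", recovered)
}
```

In this sketch the worker survives, but the item's result is never produced, so anything waiting on that result sees no progress.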
To Reproduce
Install a CRD whose conversion webhook points to an unavailable endpoint.
Expected behavior
I'm not sure what the expected behavior should be. At the very least, I don't think there should be a nil pointer panic when this happens with SSA on.
It would be nice to be able to exclude those resources on an app-by-app basis, or to skip any resources that aren't part of the application. As it stands, if I need to run a new sync to fix the webhook, I can't really do it.
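To make the request concrete, here is a rough sketch of the "skip what fails" idea: load each resource kind independently and record failures instead of aborting the whole cluster load. This is an illustration only, not how Argo CD's cluster cache works; loadFunc, loadTolerant, and fakeLoad are hypothetical names, and the error text is shortened from the one above.

```go
package main

import (
	"errors"
	"fmt"
)

// loadFunc lists the live objects of one resource kind (hypothetical signature).
type loadFunc func(kind string) ([]string, error)

// loadTolerant calls load for every kind. Kinds that fail are recorded in
// failed and skipped, so one broken kind does not abort the whole load.
func loadTolerant(kinds []string, load loadFunc) (objects map[string][]string, failed map[string]error) {
	objects = make(map[string][]string)
	failed = make(map[string]error)
	for _, kind := range kinds {
		objs, err := load(kind)
		if err != nil {
			failed[kind] = err
			continue
		}
		objects[kind] = objs
	}
	return objects, failed
}

// fakeLoad simulates a cluster where one kind's conversion webhook is down.
func fakeLoad(kind string) ([]string, error) {
	if kind == "BucketServerSideEncryptionConfiguration" {
		return nil, errors.New("conversion webhook failed: no endpoints available")
	}
	return []string{kind + "/example"}, nil
}

func main() {
	kinds := []string{"Deployment", "BucketServerSideEncryptionConfiguration", "Service"}
	objects, failed := loadTolerant(kinds, fakeLoad)
	fmt.Println("loaded kinds:", len(objects))
	for kind, err := range failed {
		fmt.Printf("skipped %s: %v\n", kind, err)
	}
}
```

Whether skipping is safe is a separate question: an application that actually manages the failing kind would still need to surface the error rather than treat those resources as missing.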