CiscoDevNet / appdynamics-charts

Helm charts for AppDynamics
https://appdynamics.github.io/appdynamics-charts/
Apache License 2.0

Cluster agent crashes when CronJob runs #3

Closed. waarg closed this issue 5 years ago.

waarg commented 5 years ago

When a CronJob runs, the cluster agent crashes. It looks like when the CronJob spins up a new container, the agent tries to add the resulting Job, but crashes because it has already added it?

I have tested this with various CronJobs, and the crash is reliably reproducible.

Here's the stack trace:

time="2019-08-29T13:10:05Z" level=debug msg="Added Job: xxxxxxxx\n"
E0829 13:10:05.677312 1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 129 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x12e7d20, 0x22c6250)
/go/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65 +0x82
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:47 +0x82
panic(0x12e7d20, 0x22c6250)
/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/appdynamics/cluster-agent/workers.(*JobsWorker).processObject(0xc0002e84b0, 0xc001758d80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/usr/local/go/src/github.com/appdynamics/cluster-agent/workers/jobs.go:216 +0x4e6
github.com/appdynamics/cluster-agent/workers.(*JobsWorker).onNewJob(0xc0002e84b0, 0x1436ce0, 0xc001758d80)
/usr/local/go/src/github.com/appdynamics/cluster-agent/workers/jobs.go:92 +0x142
github.com/appdynamics/cluster-agent/workers.(*JobsWorker).onNewJob-fm(0x1436ce0, 0xc001758d80)
/usr/local/go/src/github.com/appdynamics/cluster-agent/workers/jobs.go:70 +0x3e
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc00057d080, 0xc00057d0b0, 0xc00057d0a0, 0x1436ce0, 0xc001758d80)
/go/src/k8s.io/client-go/tools/cache/controller.go:196 +0x49
k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0xc000269e00, 0xc0004d65f0)
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:608 +0x21d
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc0003c1e18, 0x42d752, 0xc0004d6620)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:284 +0x51
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:602 +0x79
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000319f68)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003c1f68, 0xdf8475800, 0x0, 0x12c9301, 0xc000660720)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
k8s.io/apimachinery/pkg/util/wait.Until(0xc000319f68, 0xdf8475800, 0xc000660720)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
k8s.io/client-go/tools/cache.(*processorListener).run(0xc00033ae00)
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:600 +0x8d
k8s.io/client-go/tools/cache.(*processorListener).run-fm()
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:444 +0x2a
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc000480760, 0xc0005e2550)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x11668e6]
goroutine 129 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:54 +0x108
panic(0x12e7d20, 0x22c6250)
/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/appdynamics/cluster-agent/workers.(*JobsWorker).processObject(0xc0002e84b0, 0xc001758d80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/usr/local/go/src/github.com/appdynamics/cluster-agent/workers/jobs.go:216 +0x4e6
github.com/appdynamics/cluster-agent/workers.(*JobsWorker).onNewJob(0xc0002e84b0, 0x1436ce0, 0xc001758d80)
/usr/local/go/src/github.com/appdynamics/cluster-agent/workers/jobs.go:92 +0x142
github.com/appdynamics/cluster-agent/workers.(*JobsWorker).onNewJob-fm(0x1436ce0, 0xc001758d80)
/usr/local/go/src/github.com/appdynamics/cluster-agent/workers/jobs.go:70 +0x3e
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc00057d080, 0xc00057d0b0, 0xc00057d0a0, 0x1436ce0, 0xc001758d80)
/go/src/k8s.io/client-go/tools/cache/controller.go:196 +0x49
k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0xc000269e00, 0xc0004d65f0)
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:608 +0x21d
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc0003c1e18, 0x42d752, 0xc0004d6620)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:284 +0x51
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:602 +0x79
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000319f68)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003c1f68, 0xdf8475800, 0x0, 0x12c9301, 0xc000660720)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
k8s.io/apimachinery/pkg/util/wait.Until(0xc000319f68, 0xdf8475800, 0xc000660720)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
k8s.io/client-go/tools/cache.(*processorListener).run(0xc00033ae00)
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:600 +0x8d
k8s.io/client-go/tools/cache.(*processorListener).run-fm()
/go/src/k8s.io/client-go/tools/cache/shared_informer.go:444 +0x2a
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc000480760, 0xc0005e2550)
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62
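To illustrate the suspected failure mode (this is a made-up sketch, not the agent's actual code; `jobsWorker`, `seen`, and `lookup` are illustrative names): the trace points at `processObject` in `jobs.go:216`, called from the informer's `OnAdd` handler, which is consistent with a lookup returning nil for a Job that was already recorded and that nil being dereferenced. A guard along these lines would avoid the panic:

```go
package main

import "fmt"

// jobRecord stands in for whatever per-Job state the agent tracks.
type jobRecord struct {
	name string
}

// jobsWorker mimics a worker that caches Jobs it has already seen.
type jobsWorker struct {
	seen map[string]*jobRecord
}

// lookup returns the new record, or nil when the Job was already
// added -- the case a CronJob's repeated Job creations can trigger.
func (w *jobsWorker) lookup(name string) *jobRecord {
	if _, dup := w.seen[name]; dup {
		return nil // duplicate add: nothing new to record
	}
	r := &jobRecord{name: name}
	w.seen[name] = r
	return r
}

// onNewJobGuarded checks for nil before dereferencing, which is the
// guard whose absence would explain the SIGSEGV in the trace.
func (w *jobsWorker) onNewJobGuarded(name string) string {
	r := w.lookup(name)
	if r == nil {
		return "skipped duplicate " + name
	}
	return "added " + r.name
}

func main() {
	w := &jobsWorker{seen: map[string]*jobRecord{}}
	fmt.Println(w.onNewJobGuarded("cron-123")) // added cron-123
	fmt.Println(w.onNewJobGuarded("cron-123")) // skipped duplicate cron-123
}
```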
sashaPM commented 5 years ago

@waarg The CronJob issue has been addressed in the latest version of the cluster agent, 0.3.0. Thanks for reporting it. Make sure to set deployment.image.pullPolicy to Always when redeploying the chart.
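For anyone landing here later, the redeploy might look like the following; the release name, repo alias, and chart name are assumptions, adjust them to match your install:

```shell
# Refresh the chart repo, then upgrade the existing release with the
# pullPolicy override so the 0.3.0 image is actually pulled.
helm repo update
helm upgrade my-cluster-agent appdynamics-charts/cluster-agent \
  --set deployment.image.pullPolicy=Always
```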

waarg commented 5 years ago

Got it. Working now, thanks.