ansible / receptor

Project Receptor is a flexible multi-service relayer with remote execution and orchestration capabilities linking controllers with executors across a mesh of nodes.

panic: runtime error: invalid memory address or nil pointer dereference #1186

Open JoelKle opened 2 days ago

JoelKle commented 2 days ago

We run AWX version 24.6.1 using the awx-operator in a k3s cluster.

Inside the awx-task pod the awx-ee container crashed with the following error message:

# kubectl -n awx logs awx-task-77b54f47c6-rbwbq -c awx-ee --previous
DEBUG 2024/10/23 01:10:38 Client connected to control service @
DEBUG 2024/10/23 01:10:38 Control service closed
DEBUG 2024/10/23 01:10:38 Client disconnected from control service @
DEBUG 2024/10/23 01:10:45 Client connected to control service @
DEBUG 2024/10/23 01:10:45 Control service closed
DEBUG 2024/10/23 01:10:45 Client disconnected from control service @
DEBUG 2024/10/23 01:10:47 Sending service advertisement: &{awx-task-77b54f47c6-rbwbq control 2024-10-23 01:10:47.771711122 +0000 UTC m=+1322181.989097998 1 map[type:Control Service] [{local false} {kubernetes-runtime-auth false} {kubernetes-incluster-auth false}]}
INFO 2024/10/23 01:10:49 Detected Error: EOF for pod awx-schwarz/automation-job-975007-sg52t. Will retry 5 more times.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x16222f5]

goroutine 533140 [running]:
github.com/ansible/receptor/pkg/workceptor.(*KubeUnit).kubeLoggingWithReconnect(0xc000410300, 0x44645d?, 0xc0004ac510, 0xc001016250, 0xc001016260)
    /source/pkg/workceptor/kubernetes.go:388 +0xb55
created by github.com/ansible/receptor/pkg/workceptor.(*KubeUnit).runWorkUsingLogger in goroutine 533086
    /source/pkg/workceptor/kubernetes.go:830 +0xba7

I have not seen any memory or CPU overload on the node running this pod.

receptor version:

$ receptor --version
1.4.8+d7fe592

Is this a bug in receptor? Let me know if you need more information or logs.

Thx Joel

nmeilick commented 2 days ago

The cause seems to be that the return value of ParseTime() is used without checking it for nil.