emqx / emqx-operator

A Kubernetes Operator for EMQX
https://www.emqx.com
Apache License 2.0

emqx-operator-controller-manager panic: runtime error: invalid memory address or nil pointer dereference #1053

Open zhengbucuo opened 1 month ago

zhengbucuo commented 1 month ago

Describe the bug: emqx-operator-controller-manager panics with "runtime error: invalid memory address or nil pointer dereference". The controller manager logs follow.

I0518 03:46:05.103821       1 leaderelection.go:250] attempting to acquire leader lease emqx-operator-system/19fd6fcc.emqx.io...
{"level":"info","ts":"2024-05-18T03:46:05Z","logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":"2024-05-18T03:46:05Z","logger":"controller-runtime.webhook","msg":"Serving webhook server","host":"","port":9443}
{"level":"info","ts":"2024-05-18T03:46:05Z","logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
I0518 03:46:36.328989       1 leaderelection.go:260] successfully acquired lease emqx-operator-system/19fd6fcc.emqx.io
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting EventSource","controller":"emqxbroker","controllerGroup":"apps.emqx.io","controllerKind":"EmqxBroker","source":"kind source: *v1beta4.EmqxBroker"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting Controller","controller":"emqxbroker","controllerGroup":"apps.emqx.io","controllerKind":"EmqxBroker"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting EventSource","controller":"emqxplugin","controllerGroup":"apps.emqx.io","controllerKind":"EmqxPlugin","source":"kind source: *v1beta4.EmqxPlugin"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting Controller","controller":"emqxplugin","controllerGroup":"apps.emqx.io","controllerKind":"EmqxPlugin"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting EventSource","controller":"emqxenterprise","controllerGroup":"apps.emqx.io","controllerKind":"EmqxEnterprise","source":"kind source: *v1beta4.EmqxEnterprise"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting Controller","controller":"emqxenterprise","controllerGroup":"apps.emqx.io","controllerKind":"EmqxEnterprise"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting EventSource","controller":"rebalance","controllerGroup":"apps.emqx.io","controllerKind":"Rebalance","source":"kind source: *v2beta1.Rebalance"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting Controller","controller":"rebalance","controllerGroup":"apps.emqx.io","controllerKind":"Rebalance"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting EventSource","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX","source":"kind source: *v2beta1.EMQX"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting Controller","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX"}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting workers","controller":"emqxbroker","controllerGroup":"apps.emqx.io","controllerKind":"EmqxBroker","worker count":1}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting workers","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX","worker count":1}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting workers","controller":"rebalance","controllerGroup":"apps.emqx.io","controllerKind":"Rebalance","worker count":1}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting workers","controller":"emqxplugin","controllerGroup":"apps.emqx.io","controllerKind":"EmqxPlugin","worker count":1}
{"level":"info","ts":"2024-05-18T03:46:36Z","msg":"Starting workers","controller":"emqxenterprise","controllerGroup":"apps.emqx.io","controllerKind":"EmqxEnterprise","worker count":1}
{"level":"info","ts":"2024-05-18T03:46:37Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX","EMQX":{"name":"emqx","namespace":"emqx-operator-system"},"namespace":"emqx-operator-system","name":"emqx","reconcileID":"da3df036-b355-45d3-9f11-823ed8fd876c"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x68 pc=0x1752680]

goroutine 206 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x192fec0?, 0x2bac780?})
    /usr/local/go/src/runtime/panic.go:914 +0x21f
github.com/emqx/emqx-operator/controllers/apps/v2beta1.(*updateStatus).reconcile(0xc000480118, {0x1e3de08, 0xc0002004e0}, {{0xc000158900?, 0xc000472000?}, 0xc0002004e0?}, 0xc000472000, {0x1e3eaa0, 0xc0019c8580})
    /workspace/controllers/apps/v2beta1/update_emqx_status.go:122 +0x860
github.com/emqx/emqx-operator/controllers/apps/v2beta1.(*EMQXReconciler).Reconcile(0xc00041c030, {0x1e3de08, 0xc0002004e0}, {{{0xc00019b620?, 0x5?}, {0xc0006699bc?, 0xc00073dd08?}}})
    /workspace/controllers/apps/v2beta1/emqx_controller.go:137 +0x7d7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1e40e98?, {0x1e3de08?, 0xc0002004e0?}, {{{0xc00019b620?, 0xb?}, {0xc0006699bc?, 0x0?}}})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0000e77c0, {0x1e3de40, 0xc0005765a0}, {0x19df880?, 0xc000464fe0?})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:316 +0x3cc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000e77c0, {0x1e3de40, 0xc0005765a0})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:266 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 136
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:223 +0x565
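
To illustrate the class of failure in the trace above: the signal line reports addr=0x68, a small offset from address zero, which is typical of reading a field through a nil struct pointer. The code at update_emqx_status.go:122 is not shown in this thread, so the following is only a minimal sketch under assumptions. The types `nodeStatus` and `clusterStatus` and the functions `unguarded` and `guarded` are hypothetical stand-ins, not the operator's real code. It contrasts an unchecked dereference with a guard that returns an error a reconciler could use to requeue.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeStatus and clusterStatus are hypothetical stand-ins,
// not the operator's real status types.
type nodeStatus struct {
	ReadyReplicas int32
}

type clusterStatus struct {
	Core *nodeStatus // may be nil before the status has been populated
}

var errStatusNotReady = errors.New("core status not populated yet")

// unguarded dereferences Core without a check; a nil Core raises the same
// kind of runtime panic that appears in the trace.
func unguarded(s *clusterStatus) int32 {
	return s.Core.ReadyReplicas
}

// guarded returns an error instead of panicking, so the caller can
// retry later rather than crash the process.
func guarded(s *clusterStatus) (int32, error) {
	if s == nil || s.Core == nil {
		return 0, errStatusNotReady
	}
	return s.Core.ReadyReplicas, nil
}

// panics runs f and reports whether it panicked, along with the message.
func panics(f func()) (msg string, ok bool) {
	defer func() {
		if r := recover(); r != nil {
			msg, ok = fmt.Sprint(r), true
		}
	}()
	f()
	return "", false
}

func main() {
	empty := &clusterStatus{} // Core is nil

	msg, ok := panics(func() { unguarded(empty) })
	fmt.Println("unguarded panicked:", ok, msg)

	_, err := guarded(empty)
	fmt.Println("guarded error:", err)
}
```

Whether a guard like this is the right fix depends on which pointer is actually nil at line 122, which only the operator source and the requested debug logs can confirm.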

To Reproduce: Updating the configuration of the kind: EMQX resource reproduces the panic, and afterwards the cluster cannot be restored to replicas: 3.

Expected behavior: The operator runs normally and the kind: EMQX YAML can be updated.

Anything else we need to know?:

zhengbucuo commented 1 month ago
Rory-Z commented 1 month ago

I'm sorry, I can't reproduce this issue. Could you please offer more information?

  1. Set development = true in the Helm values and share the EMQX operator debug log.
  2. The EMQX pod log from when the EMQX operator crashes.
  3. The EMQX custom resource status from when the EMQX operator crashes. You can get it like this: kubectl get emqx $name -o yaml | yq '.status'
  4. I saw that updating the EMQX configuration caused the crash. Please describe in detail which configuration you updated.
zhengbucuo commented 1 month ago

The current situation is that emqx-operator-controller-manager is constantly crashing (CrashLoopBackOff). Ever since the error occurred, emqx-operator-controller-manager has been unable to function properly.

2024-05-21T10:36:57.107828505+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.builder  Registering a validating webhook    {"GVK": "apps.emqx.io/v2beta1, Kind=Rebalance", "path": "/validate-apps-emqx-io-v2beta1-rebalance"}

2024-05-21T10:36:57.107832209+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.webhook  Registering webhook {"path": "/validate-apps-emqx-io-v2beta1-rebalance"}

2024-05-21T10:36:57.107835665+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.builder  Conversion webhook enabled  {"GVK": "apps.emqx.io/v2beta1, Kind=Rebalance"}

2024-05-21T10:36:57.107838828+08:00 2024-05-21T02:36:57Z    INFO    setup   starting manager

2024-05-21T10:36:57.107844860+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.metrics  Starting metrics server

2024-05-21T10:36:57.107858319+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.metrics  Serving metrics server  {"bindAddress": ":8080", "secure": false}

2024-05-21T10:36:57.107862678+08:00 2024-05-21T02:36:57Z    INFO    starting server {"kind": "health probe", "addr": "[::]:8081"}

2024-05-21T10:36:57.107865545+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.webhook  Starting webhook server

2024-05-21T10:36:57.108317563+08:00 I0521 02:36:57.108094       1 leaderelection.go:250] attempting to acquire leader lease emqx-operator-system/19fd6fcc.emqx.io...

2024-05-21T10:36:57.108330599+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.certwatcher  Updated current TLS certificate

2024-05-21T10:36:57.108333536+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.webhook  Serving webhook server  {"host": "", "port": 9443}

2024-05-21T10:36:57.108739587+08:00 2024-05-21T02:36:57Z    INFO    controller-runtime.certwatcher  Starting certificate watcher

2024-05-21T10:37:29.263855206+08:00 I0521 02:37:29.263202       1 leaderelection.go:260] successfully acquired lease emqx-operator-system/19fd6fcc.emqx.io

2024-05-21T10:37:29.263896398+08:00 2024-05-21T02:37:29Z    DEBUG   events  emqx-operator-controller-manager-75d7f9b6bc-pzqmq_76564f34-2ad3-4b1e-ba5f-e8ad051887b4 became leader    {"type": "Normal", "object": {"kind":"Lease","namespace":"emqx-operator-system","name":"19fd6fcc.emqx.io","uid":"af090fed-d209-45de-b829-8b79e758e93b","apiVersion":"coordination.k8s.io/v1","resourceVersion":"18811767"}, "reason": "LeaderElection"}

2024-05-21T10:37:29.263906749+08:00 2024-05-21T02:37:29Z    INFO    Starting EventSource    {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "source": "kind source: *v1beta4.EmqxBroker"}

2024-05-21T10:37:29.263910856+08:00 2024-05-21T02:37:29Z    INFO    Starting Controller {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker"}

2024-05-21T10:37:29.263914675+08:00 2024-05-21T02:37:29Z    INFO    Starting EventSource    {"controller": "emqxplugin", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxPlugin", "source": "kind source: *v1beta4.EmqxPlugin"}

2024-05-21T10:37:29.263918153+08:00 2024-05-21T02:37:29Z    INFO    Starting Controller {"controller": "emqxplugin", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxPlugin"}

2024-05-21T10:37:29.263921431+08:00 2024-05-21T02:37:29Z    INFO    Starting EventSource    {"controller": "rebalance", "controllerGroup": "apps.emqx.io", "controllerKind": "Rebalance", "source": "kind source: *v2beta1.Rebalance"}

2024-05-21T10:37:29.263924823+08:00 2024-05-21T02:37:29Z    INFO    Starting Controller {"controller": "rebalance", "controllerGroup": "apps.emqx.io", "controllerKind": "Rebalance"}

2024-05-21T10:37:29.263931092+08:00 2024-05-21T02:37:29Z    INFO    Starting EventSource    {"controller": "emqx", "controllerGroup": "apps.emqx.io", "controllerKind": "EMQX", "source": "kind source: *v2beta1.EMQX"}

2024-05-21T10:37:29.263934420+08:00 2024-05-21T02:37:29Z    INFO    Starting Controller {"controller": "emqx", "controllerGroup": "apps.emqx.io", "controllerKind": "EMQX"}

2024-05-21T10:37:29.263938557+08:00 2024-05-21T02:37:29Z    INFO    Starting EventSource    {"controller": "emqxenterprise", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxEnterprise", "source": "kind source: *v1beta4.EmqxEnterprise"}

2024-05-21T10:37:29.263941999+08:00 2024-05-21T02:37:29Z    INFO    Starting Controller {"controller": "emqxenterprise", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxEnterprise"}

2024-05-21T10:37:29.365909536+08:00 2024-05-21T02:37:29Z    INFO    Starting workers    {"controller": "emqxplugin", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxPlugin", "worker count": 1}

2024-05-21T10:37:29.366957684+08:00 2024-05-21T02:37:29Z    INFO    Starting workers    {"controller": "emqx", "controllerGroup": "apps.emqx.io", "controllerKind": "EMQX", "worker count": 1}

2024-05-21T10:37:29.367001397+08:00 2024-05-21T02:37:29Z    INFO    Starting workers    {"controller": "emqxenterprise", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxEnterprise", "worker count": 1}

2024-05-21T10:37:29.367007682+08:00 2024-05-21T02:37:29Z    INFO    Starting workers    {"controller": "emqxbroker", "controllerGroup": "apps.emqx.io", "controllerKind": "EmqxBroker", "worker count": 1}

2024-05-21T10:37:29.367010643+08:00 2024-05-21T02:37:29Z    INFO    Starting workers    {"controller": "rebalance", "controllerGroup": "apps.emqx.io", "controllerKind": "Rebalance", "worker count": 1}

2024-05-21T10:37:30.151637271+08:00 2024-05-21T02:37:30Z    INFO    Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference    {"controller": "emqx", "controllerGroup": "apps.emqx.io", "controllerKind": "EMQX", "EMQX": {"name":"emqx","namespace":"emqx-operator-system"}, "namespace": "emqx-operator-system", "name": "emqx", "reconcileID": "6d6ca434-7b12-446a-8d03-839f00543a49"}

2024-05-21T10:37:30.154049720+08:00 panic: runtime error: invalid memory address or nil pointer dereference [recovered]

2024-05-21T10:37:30.154080671+08:00     panic: runtime error: invalid memory address or nil pointer dereference

2024-05-21T10:37:30.154086250+08:00 [signal SIGSEGV: segmentation violation code=0x1 addr=0x68 pc=0x1752680]

2024-05-21T10:37:30.154089931+08:00 

2024-05-21T10:37:30.154094333+08:00 goroutine 208 [running]:

2024-05-21T10:37:30.154098328+08:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()

2024-05-21T10:37:30.154102997+08:00     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:116 +0x1e5

2024-05-21T10:37:30.154107312+08:00 panic({0x192fec0?, 0x2bac780?})

2024-05-21T10:37:30.154112002+08:00     /usr/local/go/src/runtime/panic.go:914 +0x21f

2024-05-21T10:37:30.154124451+08:00 github.com/emqx/emqx-operator/controllers/apps/v2beta1.(*updateStatus).reconcile(0xc000124080, {0x1e3de08, 0xc0004807b0}, {{0xc0001db7a0?, 0xc000024000?}, 0xc0004807b0?}, 0xc000024000, {0x1e3eaa0, 0xc0007c0800})

2024-05-21T10:37:30.154128421+08:00     /workspace/controllers/apps/v2beta1/update_emqx_status.go:122 +0x860

2024-05-21T10:37:30.154132555+08:00 github.com/emqx/emqx-operator/controllers/apps/v2beta1.(*EMQXReconciler).Reconcile(0xc000586210, {0x1e3de08, 0xc0004807b0}, {{{0xc0003475d8?, 0x5?}, {0xc0005223f4?, 0xc000317d08?}}})

2024-05-21T10:37:30.154136707+08:00     /workspace/controllers/apps/v2beta1/emqx_controller.go:137 +0x7d7

2024-05-21T10:37:30.154145494+08:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1e40e98?, {0x1e3de08?, 0xc0004807b0?}, {{{0xc0003475d8?, 0xb?}, {0xc0005223f4?, 0x0?}}})

2024-05-21T10:37:30.154153199+08:00     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:119 +0xb7

2024-05-21T10:37:30.154173998+08:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000583400, {0x1e3de40, 0xc000475b30}, {0x19df880?, 0xc00031c200?})

2024-05-21T10:37:30.154181387+08:00     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:316 +0x3cc

2024-05-21T10:37:30.154191876+08:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000583400, {0x1e3de40, 0xc000475b30})

2024-05-21T10:37:30.154199911+08:00     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:266 +0x1af

2024-05-21T10:37:30.154206782+08:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()

2024-05-21T10:37:30.154214154+08:00     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:227 +0x79

2024-05-21T10:37:30.154221062+08:00 created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 133

2024-05-21T10:37:30.154245428+08:00     /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:223 +0x565
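
The crash loop in these logs is consistent with how a reconcile wrapper can treat panics: the "Observed a panic in reconciler" line is logged, and then the panic is raised again, which terminates the process. After the restart the same EMQX object is reconciled and the panic repeats. The sketch below is a simplified standalone model of that behavior, not controller-runtime's actual code. The names `runReconcile`, `failingReconcile`, and the `recoverPanic` flag are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a hypothetical nil pointer standing in for unpopulated state.
var status *struct{ Ready int }

// failingReconcile is a hypothetical reconcile step that reads a field
// through a nil pointer and therefore panics.
func failingReconcile() error {
	if status.Ready > 0 { // nil pointer dereference happens here
		return nil
	}
	return errors.New("not ready")
}

// runReconcile models a wrapper that observes a panic and either converts
// it into an error (recoverPanic=true) or logs it and raises it again.
func runReconcile(f func() error, recoverPanic bool) (err error) {
	defer func() {
		if r := recover(); r != nil {
			if recoverPanic {
				err = fmt.Errorf("panic: %v [recovered]", r)
				return
			}
			fmt.Printf("Observed a panic in reconciler: %v\n", r)
			panic(r) // unrecovered: the process exits and the container restarts
		}
	}()
	return f()
}

func main() {
	// With recovery enabled, the panic surfaces as an ordinary error and
	// the process keeps running.
	err := runReconcile(failingReconcile, true)
	fmt.Println("reconcile error:", err)
}
```

Turning the panic into an error only keeps the manager process alive; the nil dereference in the status update would still need to be fixed for the EMQX resource to reconcile.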