fluent / fluent-operator

Operate Fluent Bit and Fluentd in the Kubernetes way - Previously known as FluentBit Operator
Apache License 2.0

bug: Fluent Operator pod crashes while processing FluentBitConfig namespaced resource #1231

Closed lpratas closed 4 months ago

lpratas commented 4 months ago

Describe the issue

The operator pod crashes with the error "Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference" when it processes a namespaced FluentBitConfig resource.

Full error:

2024-07-01T18:44:45Z    INFO    Starting Controller {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd"}
2024-07-01T18:44:46Z    INFO    Starting workers    {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "worker count": 1}
2024-07-01T18:44:46Z    INFO    Starting workers    {"controller": "collector", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "Collector", "worker count": 1}
2024-07-01T18:44:46Z    INFO    Starting workers    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "worker count": 1}
2024-07-01T18:44:46Z    INFO    Starting workers    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "worker count": 1}
2024-07-01T18:44:46Z    INFO    Starting workers    {"controller": "fluentd", "controllerGroup": "fluentd.fluent.io", "controllerKind": "Fluentd", "worker count": 1}
2024-07-01T18:44:46Z    INFO    Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference    {"controller": "fluentbit", "controllerGroup": "fluentbit.fluent.io", "controllerKind": "FluentBit", "FluentBit": {"name":"fluent-bit","namespace":"fluent"}, "namespace": "fluent", "name": "fluent-bit", "reconcileID": "518f0b8c-37a4-4fe1-aa51-16b6191376fe"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xb0 pc=0x103d728]

goroutine 564 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:119 +0x1a4
panic({0x11aaa80?, 0x21d7430?})
    /usr/local/go/src/runtime/panic.go:914 +0x218
github.com/fluent/fluent-operator/v2/controllers.(*FluentBitConfigReconciler).generateRewriteTagConfig(_, {{{0x1087423, 0xf}, {0x4000a9e4c0, 0x1c}}, {{0x4000120e10, 0x13}, {0x0, 0x0}, {0x4000828230, ...}, ...}, ...}, ...)
    /workspace/controllers/fluentbitconfig_controller.go:402 +0x2b8
github.com/fluent/fluent-operator/v2/controllers.(*FluentBitConfigReconciler).processNamespacedFluentBitCfgs(_, {_, _}, {{{0x10748df, 0x9}, {0x4000a9e3e0, 0x1c}}, {{0x4000783160, 0xa}, {0x0, ...}, ...}, ...}, ...)
    /workspace/controllers/fluentbitconfig_controller.go:271 +0xbc0
github.com/fluent/fluent-operator/v2/controllers.(*FluentBitConfigReconciler).Reconcile(0x400055eb40, {0x160dff8, 0x4000285110}, {{{0x400078316a, 0x6}, {0x4000783160, 0xa}}})
    /workspace/controllers/fluentbitconfig_controller.go:150 +0x734
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x160dff8?, {0x160dff8?, 0x4000285110?}, {{{0x400078316a?, 0x1102620?}, {0x4000783160?, 0x0?}}})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122 +0x8c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0x400037b2c0, {0x160e030, 0x40002efea0}, {0x1221f60?, 0x400012b720?})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323 +0x2a0
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0x400037b2c0, {0x160e030, 0x40002efea0})
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274 +0x198
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235 +0x74
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 136
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:231 +0x43c

To Reproduce

  1. kind create cluster
  2. helm install fluent-operator fluent/fluent-operator -n fluent --create-namespace --version 2.9
  3. Create FluentBitConfig resource in the default namespace
    cat <<EOF | kubectl create -f -
    apiVersion: fluentbit.fluent.io/v1alpha2
    kind: FluentBitConfig
    metadata:
      labels:
        app.kubernetes.io/name: fluent-bit
      name: fluentbitconfig
      namespace: default
    spec:
      filterSelector:
        matchLabels:
          fluentbit.fluent.io/enabled: "true"
      outputSelector:
        matchLabels:
          fluentbit.fluent.io/enabled: "true"
    EOF
  4. To stop fluent-operator from crashing, delete the FluentBitConfig: kubectl delete FluentBitConfig fluentbitconfig -n default

Expected behavior

Your Environment

- Fluent Operator version: 2.8 and 2.9
- Container Runtime: containerd
- Operating system: Linux
- Kernel version:

How did you install fluent operator?

helm install fluent-operator fluent/fluent-operator -n fluent --create-namespace --version 2.9

Additional context

No response

lpratas commented 4 months ago

Update: Specifying emitterName under spec.service (even as an empty string) seems to avoid the issue, like this:

cat <<EOF | kubectl create -f -
apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBitConfig
metadata:
  labels:
    app.kubernetes.io/name: fluent-bit
  name: fluentbitconfig
  namespace: default
spec:
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  service:
    emitterName: ""
EOF
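
One plausible reading of the workaround, which is an assumption and not confirmed anywhere in this thread, is that the optional service section is a pointer in the Go types and is dereferenced without a nil check when generating the rewrite-tag config. The sketch below illustrates that general failure mode and a guard for it. The types and the functions unsafeEmitterName, safeEmitterName, and panics are hypothetical stand-ins, not the operator's actual code.

```go
package main

import "fmt"

// Service is a hypothetical stand-in for the optional `service` section.
type Service struct {
	EmitterName string
}

// Spec is a hypothetical stand-in for the FluentBitConfig spec.
type Spec struct {
	// Service stays nil when `service:` is omitted from the manifest.
	Service *Service
}

// FluentBitConfig is a hypothetical stand-in for the namespaced resource.
type FluentBitConfig struct {
	Name string
	Spec Spec
}

// unsafeEmitterName dereferences Spec.Service without a nil check,
// so it panics when the section is missing.
func unsafeEmitterName(cfg FluentBitConfig) string {
	return cfg.Spec.Service.EmitterName
}

// safeEmitterName guards against a nil Service and an empty name,
// returning the caller-supplied fallback in both cases.
func safeEmitterName(cfg FluentBitConfig, fallback string) string {
	if cfg.Spec.Service == nil || cfg.Spec.Service.EmitterName == "" {
		return fallback
	}
	return cfg.Spec.Service.EmitterName
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if r := recover(); r != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	// Mirrors the manifest from the reproduction steps: no service section.
	withoutService := FluentBitConfig{Name: "fluentbitconfig"}
	// Mirrors the workaround: service present, emitterName empty.
	withService := FluentBitConfig{
		Name: "fluentbitconfig",
		Spec: Spec{Service: &Service{EmitterName: ""}},
	}

	fmt.Println("panics without service:", panics(func() { unsafeEmitterName(withoutService) }))
	fmt.Println("panics with service:", panics(func() { unsafeEmitterName(withService) }))
	fmt.Println("guarded lookup:", safeEmitterName(withoutService, "fallback-emitter"))
}
```

If the operator's types behave like this sketch, adding service with emitterName: "" makes the pointer non-nil, which would explain why the panic disappears. A nil guard like the one in safeEmitterName would remove the need for the workaround.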