open-telemetry / opentelemetry-operator

Kubernetes Operator for OpenTelemetry Collector
Apache License 2.0

Operator Panic and CrashLoop for invalid prometheus exporter endpoint Collector #2628

Closed Starefossen closed 6 months ago

Starefossen commented 7 months ago

Component(s)

No response

What happened?

Description

The operator exits with the following panic and enters CrashLoopBackOff after an OpenTelemetryCollector (see below) is applied, and it keeps crashing until that OpenTelemetryCollector is deleted from the cluster. It is not possible to edit the OpenTelemetryCollector because the webhook is failing.

Steps to Reproduce

Apply the following OpenTelemetryCollector resource:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: opentelemetry-collector-management-internet
  namespace: my-system
spec:
  config: |
    receivers:
      otlp:
        protocols:
          http:
            endpoint: "http://localhost:4318/"
    processors:
      batch: {}
      memory_limiter:
        check_interval: 5s
        limit_mib: 4000
        spike_limit_mib: 500
      attributes:
        actions:
          - key: source
            value: internet
            action: insert
    exporters:
      prometheus:
        endpoint: prometheus
      otlp:
        endpoint: http://tempo:4317
        tls:
          insecure: true
      loki:
        endpoint: http://loki:3100/loki/api/v1/push
        tls:
          insecure: true
    service:
      pipelines:
        metrics:
          receivers: [ otlp ]
          processors: [ batch, memory_limiter ]
          exporters: [ prometheus ]
        traces:
          receivers: [ otlp ]
          processors: [ batch, memory_limiter ]
          exporters: [ otlp ]
        logs:
          receivers: [ otlp ]
          processors: [ batch, memory_limiter ]
          exporters: [ loki ]
  deploymentUpdateStrategy: {}
  ingress:
    route: {}
  managementState: managed
  mode: deployment
  observability:
    metrics: {}
  podDisruptionBudget:
    maxUnavailable: 1
  podSecurityContext:
    fsGroup: 65532
    runAsGroup: 65532
    runAsNonRoot: true
    runAsUser: 65532
    seccompProfile:
      type: RuntimeDefault
  ports:
  - name: otlphttp
    port: 4318
    protocol: TCP
    targetPort: 4318
  replicas: 1
  resources: {}
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
      - ALL
    readOnlyRootFilesystem: true
    runAsNonRoot: true
    runAsUser: 65532
    seccompProfile:
      type: RuntimeDefault
  targetAllocator:
    allocationStrategy: consistent-hashing
    filterStrategy: relabel-config
    observability:
      metrics: {}
    prometheusCR:
      scrapeInterval: 30s
    resources: {}
  updateStrategy: {}
  upgradeStrategy: automatic

Expected Result

Operator should not panic and instead create the OpenTelemetry Collector.

Actual Result

{"level":"info","ts":"2024-02-15T16:12:21Z","msg":"Starting the OpenTelemetry Operator","opentelemetry-operator":"0.93.0","opentelemetry-collector":"otel/opentelemetry-collector-contrib:0.93.0","opentelemetry-targetallocator":"ghcr.io/open-telemetry/opentelemetry-operator/target-allocator:0.93.0","operator-opamp-bridge":"ghcr.io/open-telemetry/opentelemetry-operator/operator-opamp-bridge:0.93.0","auto-instrumentation-java":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:1.32.0","auto-instrumentation-nodejs":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.46.0","auto-instrumentation-python":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.43b0","auto-instrumentation-dotnet":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:1.2.0","auto-instrumentation-go":"ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.10.1-alpha","auto-instrumentation-apache-httpd":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4","auto-instrumentation-nginx":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4","feature-gates":"operator.autoinstrumentation.apache-httpd,operator.autoinstrumentation.dotnet,-operator.autoinstrumentation.go,operator.autoinstrumentation.java,-operator.autoinstrumentation.multi-instrumentation,-operator.autoinstrumentation.nginx,operator.autoinstrumentation.nodejs,operator.autoinstrumentation.python,operator.collector.rewritetargetallocator,-operator.observability.prometheus","build-date":"2024-02-02T17:52:35Z","go-version":"go1.21.6","go-arch":"amd64","go-os":"linux","labels-filter":[]}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"setup","msg":"the env var WATCH_NAMESPACE isn't set, watching all namespaces"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpenTelemetryCollector","path":"/mutate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpenTelemetryCollector","path":"/validate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-opentelemetrycollector"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=Instrumentation","path":"/mutate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=Instrumentation","path":"/validate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-instrumentation"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-v1-pod"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a mutating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpAMPBridge","path":"/mutate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/mutate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.builder","msg":"Registering a validating webhook","GVK":"opentelemetry.io/v1alpha1, Kind=OpAMPBridge","path":"/validate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-opentelemetry-io-v1alpha1-opampbridge"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-02-15T16:12:21Z","msg":"starting server","kind":"health probe","addr":"[::]:8081"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":"0.0.0.0:8080","secure":false}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Starting webhook server"}
I0215 16:12:21.427966       1 leaderelection.go:250] attempting to acquire leader lease my-system/9f7554c3.opentelemetry.io...
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.webhook","msg":"Serving webhook server","host":"","port":9443}
{"level":"info","ts":"2024-02-15T16:12:21Z","logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
I0215 16:15:12.024007       1 leaderelection.go:260] successfully acquired lease my-system/9f7554c3.opentelemetry.io
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"collector-upgrade","msg":"looking for managed instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1alpha1.OpenTelemetryCollector"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.ServiceAccount"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Service"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.Deployment"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.DaemonSet"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.StatefulSet"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v2.HorizontalPodAutoscaler"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","source":"kind source: *v1.PodDisruptionBudget"}
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"instrumentation-upgrade","msg":"looking for managed Instrumentation instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting Controller","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1alpha1.OpAMPBridge"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.ConfigMap"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.ServiceAccount"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.Service"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting EventSource","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","source":"kind source: *v1.Deployment"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting Controller","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge"}
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"collector-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","logger":"instrumentation-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting workers","controller":"opampbridge","controllerGroup":"opentelemetry.io","controllerKind":"OpAMPBridge","worker count":1}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Starting workers","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","worker count":1}
{"level":"error","ts":"2024-02-15T16:15:12Z","logger":"controllers.OpenTelemetryCollector","msg":"couldn't parse the endpoint's port","endpoint":"prometheus","error":"port should not be empty","stacktrace":"github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/parser/exporter.singlePortFromConfigEndpoint\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/parser/exporter/exporter.go:73\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/parser/exporter.(*PrometheusExporterParser).Ports\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/parser/exporter/exporter_prometheus.go:63\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToComponentPorts\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:107\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToPorts\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:132\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.getConfigContainerPorts\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:172\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Container\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:46\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Deployment\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/deployment.go:56\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build.FactoryWithoutError[...].func1\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/builder.go:31\ngithub.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/collector.go:71\ngithub.com/open-telemetry/opentelemetry-operator/controllers.BuildCollector\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/common.go:54\ngithub.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/opentelemetrycollector_controller.go:124\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":"2024-02-15T16:15:12Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"opentelemetrycollector","controllerGroup":"opentelemetry.io","controllerKind":"OpenTelemetryCollector","OpenTelemetryCollector":{"name":"opentelemetry-collector-management-internet","namespace":"my-system"},"namespace":"my-system","name":"opentelemetry-collector-management-internet","reconcileID":"283b9ac7-3665-4299-a9d8-67ef904ced7b"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1751325]

goroutine 412 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
        /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x2c27860?, 0x52f65a0?})
        /opt/hostedtoolcache/go/1.21.6/x64/src/runtime/panic.go:914 +0x21f
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/parser/exporter.(*PrometheusExporterParser).Ports(0x3999788?)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/parser/exporter/exporter_prometheus.go:63 +0x45
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToComponentPorts({{0x3999788?, 0xc000642ae0?}, 0xc0012d91a0?}, 0x1, 0xc0009f5400?)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:107 +0x708
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector/adapters.ConfigToPorts({{0x3999788?, 0xc000642ae0?}, 0xc001582bb8?}, 0x267ac17?)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/adapters/config_to_ports.go:132 +0x90
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.getConfigContainerPorts({{0x3999788?, 0xc000642ae0?}, 0x3?}, {0xc0004afc00, 0x374})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:172 +0xa5
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Container({{0x3968240, 0xc00088cb90}, {{0x3999788, 0xc0007bd2c0}, 0x0}, {0xc0007b01e0, 0x45}, {0xc0007b0230, 0x4a}, {0xc0007b0320, ...}, ...}, ...)
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/container.go:46 +0xb6
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Deployment({{0x39a5d00, 0xc00040f5f0}, {0x39924e0, 0xc0006d1980}, 0xc0001c6230, {{0x3999788, 0xc000642ae0}, 0x0}, {{{0x27bed39, 0x16}, ...}, ...}, ...})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/deployment.go:56 +0x2a6
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build.FactoryWithoutError[...].func1()
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/builder.go:31 +0x44
github.com/open-telemetry/opentelemetry-operator/internal/manifests/collector.Build({{0x39a5d00, 0xc00040f5f0}, {0x39924e0, 0xc0006d1980}, 0xc0001c6230, {{0x3999788, 0xc000642ae0}, 0x0}, {{{0x27bed39, 0x16}, ...}, ...}, ...})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/internal/manifests/collector/collector.go:71 +0xa3d
github.com/open-telemetry/opentelemetry-operator/controllers.BuildCollector({{0x39a5d00, 0xc00040f5f0}, {0x39924e0, 0xc0006d1980}, 0xc0001c6230, {{0x3999788, 0xc000642ae0}, 0x0}, {{{0x27bed39, 0x16}, ...}, ...}, ...})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/common.go:54 +0x10a
github.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile(0xc00067ba20, {0x3994780, 0xc0012e8390}, {{{0xc000c56810, 0xb}, {0xc000db2cc0, 0x2b}}})
        /home/runner/work/opentelemetry-operator/opentelemetry-operator/controllers/opentelemetrycollector_controller.go:124 +0x44b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x3999788?, {0x3994780?, 0xc0012e8390?}, {{{0xc000c56810?, 0xb?}, {0xc000db2cc0?, 0x0?}}})
        /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000aaefa0, {0x39947b8, 0xc0005caeb0}, {0x2e0da00?, 0xc0006d5700?})
        /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:316 +0x3cc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000aaefa0, {0x39947b8, 0xc0005caeb0})
        /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:266 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
        /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 199
        /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:223 +0x565

Kubernetes Version

v1.28.3-gke.1286000

Operator version

0.93.0

Collector version

n/a

Environment information

Environment

Kubernetes: GKE

Log output

No response

Additional context

No response

pavolloffay commented 7 months ago

    exporters:
      prometheus:
        endpoint: prometheus

The endpoint in the CR is not valid: it does not include a port.
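
To illustrate why a bare hostname such as the one above cannot yield a port, here is a minimal Go sketch built on the standard library's net.SplitHostPort. It is not the operator's actual parsing code; portFromEndpoint is a hypothetical helper, and 0.0.0.0:8889 is only an example of an address in host:port form.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
)

// portFromEndpoint is a hypothetical helper (not the operator's code) that
// extracts the numeric port from a "host:port" endpoint string.
func portFromEndpoint(endpoint string) (int32, error) {
	// SplitHostPort fails when the address contains no ":port" part.
	_, portStr, err := net.SplitHostPort(endpoint)
	if err != nil {
		return 0, fmt.Errorf("endpoint %q: %w", endpoint, err)
	}
	// "host:" splits successfully but leaves the port empty.
	if portStr == "" {
		return 0, errors.New("port should not be empty")
	}
	port, err := strconv.ParseInt(portStr, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("endpoint %q: invalid port: %w", endpoint, err)
	}
	return int32(port), nil
}

func main() {
	for _, ep := range []string{"prometheus", "0.0.0.0:8889"} {
		port, err := portFromEndpoint(ep)
		if err != nil {
			fmt.Printf("%s: error: %v\n", ep, err)
			continue
		}
		fmt.Printf("%s: port %d\n", ep, port)
	}
}
```

Returning an error instead of a nil pointer forces the caller to handle the invalid endpoint explicitly, which is one way to rule out the dereference seen in the stack trace.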

pavolloffay commented 7 months ago

We should also fix the operator to avoid the panic.

dexter0195 commented 7 months ago

Hey! I would love to help with this. I was having a look at those files:

The problem seems to be that singlePortFromConfigEndpoint returns nil when it cannot parse the URL, and the caller then dereferences that nil value. What behaviour would you recommend when the configuration is not correct?

CLIN42 commented 7 months ago

@dexter0195 To avoid the panic, it makes sense to check here whether the return value of singlePortFromConfigEndpoint is nil, and to append it only if it is not:

    prometheusPort := singlePortFromConfigEndpoint(o.logger, o.name, o.config)
    if prometheusPort != nil {
        ports = append(ports, *prometheusPort)
    }
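
The same guard can be tried in isolation with the self-contained sketch below. The servicePort type and the collectPorts function are hypothetical stand-ins invented for illustration; they are not the operator's types, and the snippet only demonstrates skipping nil results rather than dereferencing them.

```go
package main

import "fmt"

// servicePort is a hypothetical stand-in for the port type that the
// real parser returns a pointer to.
type servicePort struct {
	Name string
	Port int32
}

// collectPorts copies every successfully parsed port into the result.
// A nil entry models a parser that could not determine a port; it is
// skipped instead of being dereferenced, which would panic.
func collectPorts(parsed []*servicePort) []servicePort {
	ports := []servicePort{}
	for _, p := range parsed {
		if p != nil {
			ports = append(ports, *p)
		}
	}
	return ports
}

func main() {
	parsed := []*servicePort{
		nil, // e.g. an endpoint without a port
		&servicePort{Name: "prometheus", Port: 8889},
	}
	fmt.Println(collectPorts(parsed))
}
```

With this guard an invalid endpoint simply contributes no port, so the reconciler keeps running and the error is visible only in the log.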

jaronoff97 commented 6 months ago

This has been closed by #2653. Thanks for reporting!