signalfx / splunk-otel-collector-chart

Splunk OpenTelemetry Collector for Kubernetes
Apache License 2.0

AKS - splunk-otel-collector-agent failed to start on AKS cluster #746

Status: Closed (aqilbeig closed 4 months ago)

aqilbeig commented 1 year ago

What happened?

Description

The splunk-otel-collector-agent failed to start on an AKS cluster.

Steps to Reproduce

Expected Result

Actual Result

Chart version

v0.4.0

Environment information

Environment

Cloud: AKS
k8s version: 1.24.9
OS: Ubuntu 18 LTS

Chart configuration

No response

Log output

2023/04/19 09:07:39 settings.go:331: Set config to [/conf/relay.yaml]
2023/04/19 09:07:39 settings.go:384: Set ballast to 165 MiB
2023/04/19 09:07:39 settings.go:400: Set memory limit to 450 MiB
2023-04-19T09:07:39.865Z    info    service/telemetry.go:90 Setting up own telemetry...
2023-04-19T09:07:39.866Z    info    service/telemetry.go:116    Serving Prometheus metrics  {"address": "0.0.0.0:8889", "level": "Basic"}
2023-04-19T09:07:39.867Z    info    kube/client.go:101  k8s filtering   {"kind": "processor", "name": "k8sattributes", "pipeline": "logs", "labelSelector": "", "fieldSelector": "spec.nodeName=aks-system3-16418380-vmss00003p"}
2023-04-19T09:07:39.867Z    info    memorylimiterprocessor@v0.75.0/memorylimiter.go:113 Memory limiter configured   {"kind": "processor", "name": "memory_limiter", "pipeline": "logs", "limit_mib": 450, "spike_limit_mib": 90, "check_interval": 2}
2023-04-19T09:07:39.868Z    info    service/service.go:129  Starting otelcol... {"Version": "v0.75.0", "NumCPU": 16}
2023-04-19T09:07:39.868Z    info    extensions/extensions.go:41 Starting extensions...
2023-04-19T09:07:39.868Z    info    extensions/extensions.go:44 Extension is starting...    {"kind": "extension", "name": "health_check"}
2023-04-19T09:07:39.868Z    info    healthcheckextension@v0.75.0/healthcheckextension.go:45 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2023-04-19T09:07:39.868Z    info    service/service.go:155  Starting shutdown...
2023-04-19T09:07:39.868Z    info    healthcheck/handler.go:129  Health Check state change   {"kind": "extension", "name": "health_check", "status": "unavailable"}
2023-04-19T09:07:39.868Z    info    extensions/extensions.go:55 Stopping extensions...
2023-04-19T09:07:39.868Z    info    zpagesextension@v0.75.0/zpagesextension.go:109  Unregistered zPages span processor on tracer provider   {"kind": "extension", "name": "zpages"}
2023-04-19T09:07:39.868Z    info    service/service.go:169  Shutdown complete.
Error: failed to start extensions: failed to bind to address 0.0.0.0:13133: listen tcp 0.0.0.0:13133: bind: address already in use; failed to shutdown pipelines: no existing monitoring routine is running
2023/04/19 09:07:39 main.go:103: application run finished with error: failed to start extensions: failed to bind to address 0.0.0.0:13133: listen tcp 0.0.0.0:13133: bind: address already in use; failed to shutdown pipelines: no existing monitoring routine is running

No response

Additional context

No response

jvoravong commented 1 year ago

Hey @aqilbeig, are you by chance installing multiple instances of the chart in your cluster? If so, check out https://github.com/signalfx/splunk-otel-collector-chart/issues/572.
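
For context, the "bind: address already in use" error in the log above is what the operating system returns when a second listener tries to bind a TCP port that another socket already holds. The sketch below is illustrative only and is not part of the chart or the collector: `can_bind` is a hypothetical helper, and it assumes it runs in the same network namespace as the process that holds the port (for example, on the affected node).

```python
import errno
import socket


def can_bind(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently bind to host:port.

    Hypothetical diagnostic helper; it only probes the port and
    releases it again before returning.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            # EADDRINUSE is the errno behind "address already in use".
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other failure (e.g. permissions) is not our case
        return True


if __name__ == "__main__":
    # 13133 is the health_check endpoint port shown in the log output.
    port = 13133
    state = "free" if can_bind(port) else "already in use"
    print(f"port {port} is {state}")
```

If the helper reports the port as already in use before the agent starts, something else on that host is listening on it, which would be consistent with a second collector instance as asked about above.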

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

atoulme commented 1 year ago

Thanks, @jvoravong. @aqilbeig, please let us know if you're still hitting this issue. Otherwise, I will close it soon as inactive.