open-telemetry / opentelemetry-operator

Kubernetes Operator for OpenTelemetry Collector
Apache License 2.0

When using multiple collector instances, it's impossible to collect self metrics from all collectors. #3099

Open lzpfmh opened 3 months ago

lzpfmh commented 3 months ago

Component(s)

target allocator

What happened?

Description

When multiple collectors are configured in a K8s cluster with the configuration below, metrics are in practice collected from only one collector node.

Steps to Reproduce

    config: |
      extensions:
        pprof:
      receivers:
        prometheus:
          config:
            scrape_configs:
              - job_name: 'nonk8s-otel-collector'
                scrape_interval: 10s
                static_configs:
                  - targets: ['0.0.0.0:8888']

The target 0.0.0.0:8888 is assigned to only one collector node, so the other collector nodes do not gather their own metrics.

Expected Result

To get metric data exposed by all collectors on port 8888.

Actual Result

Metrics from only one collector can be obtained.

Kubernetes Version

1.26

Operator version

v0.104.0

Collector version

v0.104.0

Environment information

Environment

OS: Linux master1 5.15.13-1.el7.elrepo.x86_64 #1 SMP Tue Jan 4 17:33:28 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

Log output

No response

Additional context

No response

jaronoff97 commented 3 months ago

There's no fix in the target allocator for this, unfortunately. Because that address is hardcoded rather than dynamic (i.e. set through a downward API reference), the TA can't assign it effectively. You could either switch to a downward API reference for the collector's own IP address, or use the observability features provided by the operator to scrape the metrics via a ServiceMonitor.
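
The two options above could look roughly like the sketch below. It is a sketch under assumptions, not a confirmed fix: the resource name `example`, the variable name `POD_IP`, the `statefulset` mode and the `debug` exporter are illustrative and do not come from the report. Option 1 injects the pod's own IP through the Kubernetes downward API and references it in the scrape target. Whether the target allocator then distributes that target to every collector is not confirmed in this thread. Option 2 sets the operator's `observability.metrics.enableMetrics` field, which assumes the Prometheus Operator CRDs (such as ServiceMonitor) are installed in the cluster.

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: example                # hypothetical name
spec:
  mode: statefulset            # assumption; not stated in the report
  # Option 1: expose the pod's own IP via the downward API.
  env:
    - name: POD_IP             # hypothetical variable name
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
  # Option 2: ask the operator to create monitors for the collector's
  # own metrics (requires the Prometheus Operator CRDs).
  observability:
    metrics:
      enableMetrics: true
  config: |
    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'nonk8s-otel-collector'
              scrape_interval: 10s
              static_configs:
                # Resolved by each collector from its own environment.
                - targets: ['${env:POD_IP}:8888']
    exporters:
      debug:
    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [debug]
```

The two options are independent. If the operator-managed monitors are used, the self-scrape job under `scrape_configs` can likely be dropped.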