open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Easy scaling when using non push based receivers #32869

Open gouthamve opened 2 months ago

gouthamve commented 2 months ago

Component(s)

No response

Is your feature request related to a problem? Please describe.

When using OTel Collector receivers that are not push-based, scaling out the Collectors becomes complicated.

For example, if you are using the mysqlreceiver and scale the Collector up to 2 replicas, both replicas scrape the same database and you end up collecting the same metrics twice.

To handle this, we need multiple Collector deployments: one with the non-push receivers and one with just the OTLP receiver. And when a single Collector cannot handle the load from those receivers, you then need to split them across multiple Collectors manually.

Describe the solution you'd like

A solution like the target allocator, which automatically spreads the receivers across the Collectors in a cluster and makes sure that only one instance of each receiver is running at any one moment.
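To make the request concrete, here is a minimal sketch of the allocation step, assuming each receiver instance has a stable ID and the allocator knows the current list of Collector replicas. The `Assign` function, the receiver IDs, and the replica names are all hypothetical and not part of the target allocator or the Collector. It uses rendezvous (highest-random-weight) hashing as one possible strategy, so every receiver gets exactly one owner and only the receivers of a removed replica move.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// score returns a deterministic weight for a (receiver, replica) pair.
func score(receiverID, replica string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(receiverID))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	h.Write([]byte(replica))
	return h.Sum64()
}

// Assign maps every receiver ID to exactly one replica using rendezvous
// hashing: the replica with the highest score for a receiver owns it.
// Hypothetical helper for illustration only.
func Assign(receiverIDs, replicas []string) map[string]string {
	owners := make(map[string]string, len(receiverIDs))
	if len(replicas) == 0 {
		return owners // nothing can be scheduled without replicas
	}
	for _, id := range receiverIDs {
		best := replicas[0]
		bestScore := score(id, best)
		for _, rep := range replicas[1:] {
			s := score(id, rep)
			// Break ties by name so the result does not depend on input order.
			if s > bestScore || (s == bestScore && rep < best) {
				best, bestScore = rep, s
			}
		}
		owners[id] = best
	}
	return owners
}

func main() {
	// Hypothetical receiver instances and Collector replicas.
	receiverIDs := []string{"mysql/orders", "mysql/users", "redis/cache", "postgresql/billing"}
	replicas := []string{"collector-0", "collector-1"}

	owners := Assign(receiverIDs, replicas)

	ids := make([]string, 0, len(owners))
	for id := range owners {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		fmt.Printf("%s -> %s\n", id, owners[id])
	}
}
```

A real allocator would also need to push the resulting assignment to the Collectors and react to replicas joining or leaving, which this sketch leaves out.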

Describe alternatives you've considered

Config management to scale things out. But this is not easy to build or maintain.

Additional context

No response

jaronoff97 commented 2 months ago

This is a great idea overall; I've had similar thoughts about the k8s cluster receiver. Through a few discussions with @swiatekm-sumo, we were thinking it would be best if the collector had generic support for a hash or shard key that the operator could automatically fill in. This would make it easier for receiver authors to take advantage of sharding when present.
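As a rough illustration of the shard-key idea, a receiver could be handed a shard index and shard count and then collect only from the targets it owns. The sketch below assumes a hypothetical `Shard` type whose fields the operator would fill in per replica; nothing like it exists in the collector today, and the target names are made up.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Shard is a hypothetical setting the operator could fill in for each replica.
type Shard struct {
	Index uint32 // this replica's position, 0 <= Index < Total
	Total uint32 // number of replicas sharing the work
}

// Owns reports whether this shard is responsible for the given key
// (for example an endpoint or a resource name).
func (s Shard) Owns(key string) bool {
	if s.Total <= 1 {
		return true // sharding disabled: a single replica owns everything
	}
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()%s.Total == s.Index
}

func main() {
	// Hypothetical endpoints a receiver would otherwise scrape on every replica.
	targets := []string{"db-0:3306", "db-1:3306", "db-2:3306", "db-3:3306"}
	shard := Shard{Index: 1, Total: 3}

	for _, t := range targets {
		if shard.Owns(t) {
			fmt.Println("collecting from", t)
		}
	}
}
```

Plain modulo sharding like this is simple for receiver authors to adopt, but changing the replica count reassigns most keys, so a production design might prefer a consistent-hashing scheme.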

We could also look into more generic target support in the target allocator, but I worry that not all receivers want to separate their concerns in that way. Prometheus' native discovery mechanism is one that's easy to act as a middleman for; however, most receivers do not have that same type of discovery and work off of API calls instead. For endpoint-based receivers, we could conceivably have the target allocator work for them by having the calls proxy through the TA. However, this may require the TA to import collector components, which would result in a bad dependency cycle. I'd love to hear other people's thoughts here.

jpkrohling commented 2 months ago

Related: https://github.com/open-telemetry/opentelemetry-collector/issues/8341

github-actions[bot] commented 1 week ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.