grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Support for OpAMP to manage fleet of Alloy agents #1236

Open X4mp opened 2 weeks ago

X4mp commented 2 weeks ago

Request

Hi, it would be a great addition to the Grafana ecosystem if a fleet of Alloy agents could be managed via the Open Agent Management Protocol (OpAMP). A central management interface for all agents and their respective configurations would reduce the operational cost of running dozens or hundreds of Alloy agents across a large infrastructure.

Use case

We want to manage hundreds of telemetry agents via a central agent management service. This OpAMP server should allow us to list all deployed agents and their active configuration, as well as the bandwidth of data ingested into the metrics TSDB. A reference for this would be Splunk's Deployment Server or ObservIQ Bindplane. The latter is a server implementation of OpAMP, but it only supports their own telemetry collector.

Without a central management service, we are bound to Ansible to roll out and manage our agents. This works, but it doesn't offer the same operational quality: we can't immediately see the status of all agents and troubleshoot problems.

tpaschalis commented 2 weeks ago

Hey there @X4mp 👋

This is on our radar. We're experimenting with adding remote configuration/fleet management capabilities to Alloy via the remotecfg block.

As you've probably seen, the scope of OpAMP is really vast, so to get us off the ground quickly we're using a custom open-source protocol available over at https://github.com/grafana/alloy-remote-config. It only contains a single GetConfig endpoint right now, but it will be a stepping stone toward properly evaluating OpAMP for production workloads.
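
For illustration only, a pull-based client built around a single GetConfig-style call could look roughly like the Python sketch below. It is not taken from the alloy-remote-config repository: `ConfigPoller`, `fetch`, `apply`, and `agent_id` are hypothetical names, the real protocol's request and response shapes are not modelled, and the change detection by content hash is an assumption about how such a client might avoid needless reloads.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConfigPoller:
    """Pull-based config client: one fetch call, applied only when content changes."""

    fetch: Callable[[str], str]    # hypothetical stand-in for a GetConfig call
    apply: Callable[[str], None]   # hypothetical hook that reloads the agent
    agent_id: str
    last_hash: Optional[str] = None

    def poll_once(self) -> bool:
        """Fetch this agent's config and apply it if it differs from the last one."""
        config = self.fetch(self.agent_id)
        digest = hashlib.sha256(config.encode("utf-8")).hexdigest()
        if digest == self.last_hash:
            return False  # unchanged, nothing to reload
        self.apply(config)
        self.last_hash = digest  # recorded only after apply() returned without error
        return True


if __name__ == "__main__":
    # In-memory stand-in for the server side; a real client would go over the network.
    server = {"agent-1": "config revision 1"}
    applied = []
    poller = ConfigPoller(fetch=server.__getitem__, apply=applied.append, agent_id="agent-1")
    poller.poll_once()   # first poll applies the config
    poller.poll_once()   # second poll sees the same hash and skips the reload
    print(applied)
```

A real agent would call `poll_once` on a timer and would need to handle fetch errors; both are left out here to keep the sketch short.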

X4mp commented 1 week ago

Hey @tpaschalis, that is incredible to hear. Thank you very much for being open and providing such a clear response. I have indeed seen the scope of OpAMP, and I fully support your decision to go the remote-config route. Even without full OpAMP support at the beginning, this will be a super important feature for us to remotely manage our Alloy agents!

Can you give us a rough estimate of when we can expect the first parts to be implemented and released with Alloy? Really looking forward to this!

Best regards, Roland