open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Supports External Control of Agent Behavior Dynamically #12251

Open pepeshore opened 1 month ago

pepeshore commented 1 month ago

Is your feature request related to a problem? Please describe.

When a sudden surge in business traffic pushes an application to its performance limits, we often need to scale back some functions of the attached agent to keep the business available. Currently, agent configuration changes require a restart to take effect, and some behaviors of the probes cannot be managed dynamically. This poses significant risks in real-world enterprise scenarios.

Describe the solution you'd like

We would like to control the following agent behaviors dynamically from an external source:

Describe alternatives you've considered

No response

Additional context

No response

jackshirazi commented 1 month ago

I've been looking at these. It will gradually become more dynamic but never fully. In terms of your top three there

pepeshore commented 1 month ago

I noticed that the community has been discussing OpAMP recently. Is it possible to use it to achieve this functionality?

pepeshore commented 1 month ago

https://docs.google.com/document/d/1WK9h4p55p8ZjPkxO75-ApI9m0wfea6ENZmMoFRvXSCw/edit

jackshirazi commented 1 month ago

The short answer is OpAMP doesn't help.

The long answer is that OpAMP and dynamic capability are not tied. You can have the above dynamic capability without OpAMP. Conversely, adding OpAMP does not require dynamic capabilities in the agent or SDK. Current thinking is that with OpAMP you have a few possibilities:

For now, configuration changes delivered over OpAMP will probably be handled in a proprietary way, because the OpAMP specification doesn't require any particular config structure or mechanism. It may evolve to recommend the use of declarative configuration, but that doesn't make it any more supportive of dynamic behaviour.
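
To illustrate the point that dynamic capability and the delivery channel are separate concerns, here is a minimal sketch in plain Java. It is not the agent's actual design: `ConfigSource`, `DynamicAgentSettings`, `InMemoryConfigSource`, and the `instrumentation.<name>.enabled` key format are all hypothetical names made up for this example. The holder is the "dynamic" part; the source could be backed by OpAMP, a file watcher, or anything else.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical: any channel that can deliver new settings (OpAMP client, file watcher, HTTP endpoint). */
interface ConfigSource {
    void subscribe(Consumer<Map<String, String>> onUpdate);
}

/** Hypothetical holder that is consulted on every call instead of being read once at startup. */
final class DynamicAgentSettings {
    private final AtomicReference<Map<String, String>> current;

    DynamicAgentSettings(Map<String, String> initial) {
        this.current = new AtomicReference<>(Map.copyOf(initial));
    }

    /** Replace the whole settings snapshot atomically whenever the source pushes an update. */
    void attach(ConfigSource source) {
        source.subscribe(update -> current.set(Map.copyOf(update)));
    }

    /** Illustrative key format; a missing key is treated as "enabled". */
    boolean isInstrumentationEnabled(String name) {
        String key = "instrumentation." + name + ".enabled";
        return Boolean.parseBoolean(current.get().getOrDefault(key, "true"));
    }
}

/** In-memory stand-in for a real transport, used here only to drive the example. */
final class InMemoryConfigSource implements ConfigSource {
    private final List<Consumer<Map<String, String>>> listeners = new CopyOnWriteArrayList<>();

    @Override
    public void subscribe(Consumer<Map<String, String>> onUpdate) {
        listeners.add(onUpdate);
    }

    /** Deliver a new settings snapshot to every subscriber. */
    void push(Map<String, String> settings) {
        listeners.forEach(listener -> listener.accept(settings));
    }
}
```

With this shape, calling `push(Map.of("instrumentation.jdbc.enabled", "false"))` on the source flips `isInstrumentationEnabled("jdbc")` without a restart, and nothing in `DynamicAgentSettings` knows which transport delivered the change. The sketch leaves out the hard part for a real agent: making already-applied instrumentation actually honour such a flag.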