Open SylvainJuge opened 1 year ago
Hi @SylvainJuge, thanks for starting the initiative to gather implementations for all cloud providers.
Once all the resource detectors are added to the contrib repository, what will be the recommended way for users of the Java agent to incorporate these detectors? Naively, I would expect the workflow to look like:
-Dotel.javaagent.extensions
This seems pretty inconvenient, especially because they require the use of advanced features of Maven/Gradle to automate.
I know that there have been earlier discussions about incorporating detectors for common cloud platforms into the default agent distribution. In at least one case, we decided to exclude the detector because it added startup latency. I was wondering if we could get the best of both worlds by
- Including the detector code in the agent
- Keeping it disabled by default
This would parallel the approach used in the Collector, where the contrib distribution includes many detectors, but a given detector is only invoked if it's explicitly enabled in the Collector configuration file.
Hi @punya, sorry for the late reply on this.
So far I haven't really thought about "making it convenient to use them", but that's a very good point here. I agree with you that shading or using the command line option is not really practical for most users and doing that for every agent distribution would be wasteful.
Having them included and disabled by default in the agent would definitely be a good option: users could then enable them through the otel.java.enabled.resource.providers config (doc), or by removing them from otel.java.disabled.resource.providers, depending on how we implement the "disabled by default" (see below).
In order to implement the "included but disabled by default", what we did on our side so far is the following:
This strategy is complex and can't be reused when using those resource providers directly as SDK extensions.
On the code side, I think that keeping it in the contrib repo rather than directly in the agent allows reusing them as SDK extensions without an agent, but in practice I really don't know how popular or relevant this option would be. Given that support for Java agents in native images like GraalVM is clearly not coming in the short term, that's still something to keep in mind.
So here I would be in favor of keeping the code in the contrib repo and adding them (but disabled) to the agent. However, what I am not 100% clear about is the best option to implement the "included but disabled by default" behavior:
The otel.java.{enabled,disabled}.resource.providers options are currently defined in the SDK autoconfiguration (code); both are empty by default.
I think that we might need an agent-only configuration option here to implement the opt-in behavior, for example otel.instrumentation.optional.resource.providers, as we can't alter the semantics of the existing SDK autoconfig options. The agent would contain a hard-coded list of the FQNs of the included optional providers; unless a provider's FQN is added to this option, the agent would add it to otel.java.disabled.resource.providers at startup.
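To make the idea concrete, here is a rough sketch of how the agent could compute the effective value of otel.java.disabled.resource.providers from an opt-in list. This is only an illustration under assumptions: the class name, method names and the com.example.* provider FQNs are hypothetical, and nothing here is existing agent code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Illustrative sketch of the agent-side opt-in logic; not actual agent code. */
final class OptionalResourceProviders {

  // Hypothetical hard-coded list of optional providers bundled in the agent.
  static final List<String> BUNDLED_OPTIONAL = List.of(
      "com.example.gcp.GcpResourceProvider",
      "com.example.azure.AzureResourceProvider");

  /** Splits a comma-separated property value, ignoring blanks and null. */
  static Set<String> parseList(String value) {
    if (value == null || value.isBlank()) {
      return new LinkedHashSet<>();
    }
    return Arrays.stream(value.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toCollection(LinkedHashSet::new));
  }

  /**
   * Returns the value the agent would pass on as
   * otel.java.disabled.resource.providers: whatever the user already disabled,
   * plus every bundled optional provider that was not explicitly opted in.
   */
  static String effectiveDisabled(String userDisabled, String optedIn) {
    Set<String> disabled = parseList(userDisabled);
    Set<String> enabled = parseList(optedIn);
    for (String fqn : BUNDLED_OPTIONAL) {
      if (!enabled.contains(fqn)) {
        disabled.add(fqn);
      }
    }
    return String.join(",", disabled);
  }
}
```

With no opt-in, both bundled providers end up in the disabled list; opting in the (hypothetical) GCP provider leaves only the Azure one disabled, and anything the user disabled explicitly is preserved.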
I was wondering if we could get the best of both worlds by
- Including the detector code in the agent
- Keeping it disabled by default
this makes sense to me
On the code side, I think that keeping it in the contrib repo and not directly into the agent allows to reuse them as SDK extensions without an agent, but in practice I really don't know how popular or how relevant this option would be.
even if we moved them to the instrumentation repo, we would still publish them as standalone artifacts, e.g. https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/resources/library
but I agree with keeping them here in the contrib repo where the cloud vendors can have ownership of them, and we can still pull them into the Java agent.
otel.instrumentation.optional.resource.providers
I think this is a good idea :smile:
- why use a FQN here instead of the short names as for other providers?
What I meant here is that we should use the same values as the ones we can use with the otel.java.{enabled,disabled}.resource.providers options, as the agent will probably copy/append/modify the provided values to those existing SDK options.
I wasn't aware of the "short names" that we can use with other providers, is there any documentation or list of them somewhere? Currently the SDK documentation only refers to FQN.
Sorry, confused that with exporter...
I was wondering if we could get the best of both worlds by
- Including the detector code in the agent
- Keeping it disabled by default
this makes sense to me
@trask what about otel.java.additional.resource.providers=<FQN1,FQN2> to enable resource providers, without affecting resource providers that are not mentioned in this list.
something similar to otel.instrumentation.<>.enabled=true? (could be done entirely in the agent, without impacting resource providers themselves)
something similar to otel.instrumentation.<>.enabled=true? (could be done entirely in the agent, without impacting resource providers themselves)
I was wondering if this makes sense given that resource providers can also be used without the otel java agent. Would a user using the resource providers as library instrumentation expect to have the notion of default enabled providers? I think the answer is no. Users have to manually add a dependency on the resource provider, and it makes sense to interpret this as wanting to enable that resource provider by default. In contrast, when the agent is installed, (most) users don't have a say on which resource providers are included, so it makes sense to have an additional configuration knob.
If something like otel.instrumentation.<>.enabled=true was introduced, we could:
otel.instrumentation.<>.enabled=true for all resource provider instrumentations and use it to customize the otel.java.disabled.resource.providers option.
otel.java.disabled.resource.providers
otel.java.disabled.resource.providers
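As a rough sketch of how per-provider flags could be turned into the otel.java.disabled.resource.providers value, assuming the agent keeps a table from a short name to the provider FQN. The names "gcp-resources" and "azure-resources", the com.example.* FQNs and the class itself are made up for illustration; only the property name patterns come from this discussion.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and FQNs below are hypothetical. */
final class ResourceProviderFlags {

  // Hypothetical table: short name used in the flag -> provider FQN.
  static final Map<String, String> BUNDLED = bundled();

  private static Map<String, String> bundled() {
    Map<String, String> m = new LinkedHashMap<>();
    m.put("gcp-resources", "com.example.gcp.GcpResourceProvider");
    m.put("azure-resources", "com.example.azure.AzureResourceProvider");
    return m;
  }

  /**
   * Builds the comma-separated list of providers to disable: every bundled
   * provider whose otel.instrumentation.<name>.enabled flag is not "true".
   * A missing flag is treated as false, i.e. disabled by default.
   */
  static String disabledProviders(Map<String, String> config) {
    List<String> disabled = new ArrayList<>();
    for (Map.Entry<String, String> e : BUNDLED.entrySet()) {
      String key = "otel.instrumentation." + e.getKey() + ".enabled";
      boolean enabled = Boolean.parseBoolean(config.getOrDefault(key, "false"));
      if (!enabled) {
        disabled.add(e.getValue());
      }
    }
    return String.join(",", disabled);
  }
}
```

A user-provided otel.java.disabled.resource.providers value would still need to be merged with this result; that step is left out here to keep the sketch short.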
Would a user using the resource providers as library instrumentation expect to have the notion of default enabled providers? I think the answer is no.
In the case of a spring boot starter it would also make sense - but I think it doesn't change the proposed solution.
If I understand the proposal correctly, it could be implemented with a new NamedResourceProvider in the SDK
NamedResourceProvider:
otel.instrumentation.gcp-resources.enabled=true
Suggestions
- otel.java.resource.provider.<>.enabled to align with the existing providers
- otel.resource.provider.<>.enabled if we use gcp instead of FQN
@trask @jack-berg I've created a PR that implements this proposal: https://github.com/open-telemetry/opentelemetry-java/pull/6250
@trask here's the ticket for the Azure resource provider: https://github.com/open-telemetry/opentelemetry-java-contrib/issues/1214
Most cloud providers expose a metadata endpoint from which resource information can be built; however, the Java contrib repo currently only has an implementation for AWS.
For example, in the js contrib repo, we can see there are other implementations in https://github.com/open-telemetry/opentelemetry-js-contrib/tree/main/detectors/node : alibaba, gcp and aws (I haven't looked at their respective implementations though).
The goal here is to add implementations for the most common cloud providers.
Initially the focus will be on the following cloud providers: AWS, GCP and Azure with the following task breakdown
Other cloud providers can of course be added later, but should be tracked independently.
Collector implementations that can be used for reference : https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor
for triage: This issue can be assigned to me.