Open brunobat opened 7 months ago
I'm not sure about the config collisions part: if we don't allow env var configs for particular configurations, it seems to me it will be difficult to find unique keys to identify all the property names in the PropertiesSupplier Map.
The programmatic interface in the Java SDK is the properties supplier defined in the AutoConfiguredOpenTelemetrySdkBuilder as Supplier<Map<String, String>> propertiesSupplier
Incorrect. The programmatic interface are the builders for the specific components: BatchSpanProcessor, OtlpGrpcSpanExporter, Resource, PeriodicMetricReader, etc. The autoconfigure module is an abstraction which interprets the environment variable configuration scheme and configures the SDK using the programmatic configuration interfaces of individual components.
The file configuration MUST provide a map of properties to this supplier and MUST NOT override the existing auto-configuration interfaces. Namely, if a configuration file is provided, it MUST NOT take precedence over the provided programmatic configurations, oTelConfigs, in this example:
You're using normative language here, but it's not taken from the specification. I assume you're expressing a strong opinion then?
If a source is not able to provide an unambiguous value for a particular configuration value, that configuration will be unavailable in that source. This behaviour must be documented and a default value must be provided. Example:
In this case it is not possible to use the env vars to configure 2 different exporters and they will end up with the same address. This must be documented. In the future, if required, support for this could be added by implementing new env. vars. It should be noted however that frameworks integrating OpenTelemetry could find a solution of their own for this problem.
I'm not sure what you're trying to convey here. The example you include is taken out of context. I wrote that example because Diego and I were talking about how some SDK implementations directly embed interpretation of environment variables into the components (in contrast to opentelemetry-java, which extracts that to the separate autoconfigure artifact). We were discussing that example, wondering how those other SDK implementations handle these types of environment variable conflicts today.
According to the OTel spec, a configuration must not be exclusive of a particular configuration source, namely the file configuration.
What spec language are you referencing here?
By principle, libraries are much better off not forcing a specific way of configuration on users, but let those decisions be driven by frameworks
Whether or not file configuration supports environment variable overrides (still not decided and being actively debated), there would be nothing forcing a user or a framework to use file configuration, or the environment variable scheme for that matter. A user or framework will always be able to simply not use the autoconfigure module and use their own configuration scheme with the low level programmatic APIs.
The programmatic interface in the Java SDK is the properties supplier defined in the AutoConfiguredOpenTelemetrySdkBuilder as Supplier<Map<String, String>> propertiesSupplier
Incorrect. The programmatic interface are the builders for the specific components: BatchSpanProcessor, OtlpGrpcSpanExporter, Resource, PeriodicMetricReader, etc. The autoconfigure module is an abstraction which interprets the environment variable configuration scheme and configures the SDK using the programmatic configuration interfaces of individual components.
We should clarify this, because that is not my understanding. The properties supplier is generic enough and does not depend on any "environment variable" in order to work: whether entries come from env vars or any other place, the map doesn't care. On a broader note, will this mean that, if I want to stay independent of additional configuration methods and control the configuration myself, I will need to rewrite the bootstrap of the OTel SDK in order to avoid the Autoconfiguration? Who should use the Autoconfiguration, then?
The file configuration MUST provide a map of properties to this supplier and MUST NOT override the existing auto-configuration interfaces. Namely, if a configuration file is provided, it MUST NOT take precedence over the provided programmatic configurations, oTelConfigs, in this example:
You're using normative language here, but it's not taken from the specification. I assume you're expressing a strong opinion then?
This part is a proposal. Open for debate.
According to the OTel spec, a configuration must not be exclusive of a particular configuration source, namely the file configuration.
What spec language are you referencing here?
Here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-configuration.md In here, the programmatic interface is the foundation of all other configuration methods or sources, and all configuration methods are on par in terms of priority. We can argue about whether priority should be introduced, but that's being discussed here: https://github.com/open-telemetry/opentelemetry-specification/issues/3752
We definitely need to clarify and define somewhere what is considered the programmatic interface in the case of Java.
By principle, libraries are much better off not forcing a specific way of configuration on users, but let those decisions be driven by frameworks
Whether or not file configuration supports environment variable overrides (still not decided and being actively debated), there would be nothing forcing a user or a framework to use file configuration, or the environment variable scheme for that matter. A user or framework will always be able to simply not use the autoconfigure module and use their own configuration scheme with the low level programmatic APIs.
True, if frameworks weren't already using the Autoconfiguration to create an SDK. This effectively means a breaking change for existing systems and a message not to use the Autoconfiguration.
Can you please clarify what problem this issue is trying to solve? I read the entire issue and couldn't figure out the pain points the issue is trying to solve.
From my understanding, the programmatic interface is the *Builder classes, allowing you to build OpenTelemetrySdk and everything it needs: MetricReader, View, Exporter, etc.
AutoConfiguredOpenTelemetrySdkBuilder allows you to build the SDK entirely from environment variables, system properties, and configuration files. It also allows you to modify any part configured from those sources before it's built.
The only part that needs to be added is the ability to merge the configuration you got from the environment variables and system properties with the configuration from the files (views and general config). Today, it's either from configuration files or from environment variables and system properties. That is what I would state as a proposal/problem to solve.
I read the SDK specification you linked, and it answers that precisely. I have to say the design @jack-berg made here is brilliant - elegant and clean. It's x10 better than other libraries I've used, being metric libraries (Dropwizard, Micrometer, Prometheus Java client) or even general-purpose libraries.
@asafm these are some of the points to clarify:
I don't question the merits of the tech solution proposed by @jack-berg. His work is an example to all of us. I think the file configuration "epic" should have been completed to a reasonable degree at the spec level and also discussed under a design issue to assess the impact on the existing user base. This issue tries to fuel this discussion.
CC @kittylyst
I think they were trying as much as possible not to be overly specific in the spec, as it can create havoc later. Okay, let's work through your answers to the questions to get to the bottom of the problems you see.
- Can you please clarify or give an example of what exactly are Framework integrators? Maybe a specific use case? I need to understand why you need a set of standard properties.
Quarkus, Wildfly, OpenLiberty, Payara, Spring, etc...
- Regarding the documentation you're asking for, I defer this to @jack-berg.
- I need a concrete example to understand your use case.
These are the Autoconfigure properties: https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/README.md#general-configuration
If I cannot use the Autoconfigure I'm free to define my own as I please, like:
Instead of otel.exporter.otlp.endpoint I could very well define telemetry.out.cannel1.endpoint.
This would totally break portability.
I'm totally in favor of portability, but I don't want to reimplement something that is already done in Autoconfigure.
- "The Autoconfiguration is being changed i" - changed by who? I don't understand.
Changed in the PR mentioned in the description. As you can see in the code there, if you have a file configuration, the current Autoconfiguration is ignored.
- No No. The whole idea is that the programmatic interface is the basic building block. It's typed and documented. It is straightforward to use in many bespoke cases. The auto-configuration is essentially a translation layer from the config to the method calls of the builders (the programmatic API). To me, it's very standard architecture.
I disagree because of the complexity of setting up an SDK. Have you tried to do it without the Autoconfigure for a real-world app (Multithreaded+REST+DB+health+Security+Resources+Dependency Injection+etc)? The OTel SDK is more complex than most HTTP servers, and all of them provide some higher-level abstraction... You don't need to build your own thread pool in order to use one. Sensible defaults for crosscutting functionalities should be in place. This includes standard configuration, which implies the use of the Autoconfigure.
- Can you please elaborate more on this? What exactly is the problem of merging? Maybe specify an example so I can understand.
You have the same property with different values in the file and on an env. var. Who wins?
The properties supplier is generic enough and does not depend on any "environment variable" in order to work: whether entries come from env vars or any other place, the map doesn't care.
No, it's not generic enough. There are many things that cannot be expressed using flat properties. For example, non-trivial processor configurations, views, and non-trivial sampler configurations. The customization options in autoconfiguration are largely a direct response to the inadequate expressive power of the flat scheme we have today.
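To make the expressiveness gap concrete, here is a minimal sketch using only the JDK. The `ExporterConfig` class and both helper methods are hypothetical stand-ins, not OpenTelemetry APIs; only the `otel.exporter.otlp.endpoint` key comes from the discussion. It assumes a flat scheme in which every OTLP exporter reads the same key, and contrasts that with a structured list where each exporter carries its own endpoint.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlatVersusStructuredSketch {

    /** Hypothetical structured entry: each exporter carries its own endpoint. */
    static final class ExporterConfig {
        final String type;
        final String endpoint;

        ExporterConfig(String type, String endpoint) {
            this.type = type;
            this.endpoint = endpoint;
        }
    }

    /** Flat scheme: every exporter reads the same single key, so all of them share one value. */
    static List<String> endpointsFromFlat(Map<String, String> properties, int exporterCount) {
        List<String> endpoints = new ArrayList<>();
        for (int i = 0; i < exporterCount; i++) {
            endpoints.add(properties.get("otel.exporter.otlp.endpoint"));
        }
        return endpoints;
    }

    /** Structured scheme: the endpoint is a field of each list entry, so values can differ. */
    static List<String> endpointsFromModel(List<ExporterConfig> exporters) {
        List<String> endpoints = new ArrayList<>();
        for (ExporterConfig exporter : exporters) {
            endpoints.add(exporter.endpoint);
        }
        return endpoints;
    }

    public static void main(String[] args) {
        Map<String, String> flat = new LinkedHashMap<>();
        flat.put("otel.exporter.otlp.endpoint", "http://collector-a:4317");
        // Two exporters, one key: both end up with the same address.
        System.out.println(endpointsFromFlat(flat, 2));

        List<ExporterConfig> model = List.of(
            new ExporterConfig("otlp", "http://collector-a:4317"),
            new ExporterConfig("otlp", "http://collector-b:4317"));
        // Two exporters, two entries: each keeps its own address.
        System.out.println(endpointsFromModel(model));
    }
}
```

The collector host names are made up for illustration. The same limitation applies to anything that is naturally a list or a tree, such as views or composed samplers.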
On a broader note, will this mean that, if I want to stay independent of additional configuration methods and control the configuration myself, I will need to rewrite the bootstrap of the OTel SDK in order to avoid the Autoconfiguration? Who should use the Autoconfiguration, then?
I think that's a decision frameworks will have to make:
This decision should be analogous to how frameworks handle log configuration files today:
Why shouldn't the story with opentelemetry be the same?
We definitely need to clarify and define somewhere what is considered the programmatic interface in the case of Java.
This is not ambiguous. The module description is:
Autoconfigure OpenTelemetry SDK from env vars, system properties, and SPI
The first line of the autoconfigure readme (present for more than 3 years) states:
This artifact implements environment-based autoconfiguration of the OpenTelemetry SDK. This can be an alternative to programmatic configuration using the normal SDK builders.
True, if frameworks weren't using already the Autoconfiguration to create an SDK. This effectively means a breaking change for existing systems and a message to not use the Autoconfiguration.
Incorrect. There is no behavior change when the input (i.e. environment variables, system properties, and SPIs) is the same. The conditions to trigger a change are explicitly setting otel.config.file and explicitly including the opentelemetry-sdk-extension-incubator dependency. To consider that breaking is to consider the addition of any new property breaking, which is not aligned with industry standards. Furthermore, customization options exist that allow frameworks to nullify any user attempt to set otel.config.file.
This genie can't be put back in the bottle. The OpenTelemetry community is committed to having a file format. The TC put a moratorium on enhancements to the env variable scheme, a dedicated working group was spun up, an OTEP was long debated and approved, a bunch of work has been debated and merged to the spec regarding file configuration, and there are prototypes in at least 3 languages. The wheels are in motion - the opentelemetry community will not be limited to a flat env var / system property style configuration scheme.
There are open questions about how file configuration interacts with the environment variable config scheme, and for Java specifically, it's appropriate to debate how file configuration should interact with autoconfigure tooling. But file configuration is happening, and over time it's likely to supplant the environment variable scheme as the dominant configuration mechanism.
With that aside, let's get to the crux of what I think we ought to care about, which is how exactly file configuration should interact with the autoconfigure mechanism.
I've uploaded two diagrams to help talk through things.
Here we see how autoconfigure works without file configuration. There's a merging of different sources which contribute to ConfigProperties. The ConfigProperties are read from in the process of configuring Resource, TracerProvider, MeterProvider, LoggerProvider, and Propagators. There are SPIs for providing custom plugin components (i.e. exporters, samplers, etc), which read from ConfigProperties. There are SPIs for customizing the result of any of these plugin components, and further for customizing TracerProvider, MeterProvider, and LoggerProvider.
Note the many hooks in place that are essentially escape hatches for the lack of expressiveness of the flat configuration scheme. We can't easily wrap one exporter in another, so we allow exporter customizers, we can't register views, so we expose a MeterProvider customizer, etc, etc.
Here's my current working model on how file configuration interacts with this. We still resolve ConfigProperties from multiple contributing sources, but we have a key fork point in the flow based on whether otel.config.file is defined. Note that customizers contributing to ConfigProperties can influence this fork. If otel.config.file is not set, continue as we do today. If otel.config.file is set, we go to a different path: We parse the contents of the config file to an in-memory configuration model. We have a phase where the configuration model is customized:
We interpret the resolved model, invoking any of the plugin extension SPIs to provide exporters, samplers, processors which are not built-in.
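The fork described above can be sketched with plain JDK types. This is only an illustration under assumptions: `ConfigPath`, `choosePath`, and `customize` are hypothetical names, the file path is made up, and the real autoconfigure module works with ConfigProperties rather than a raw map. Only the `otel.config.file` property name is taken from the discussion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ConfigForkSketch {

    /** Hypothetical labels for the two branches of the flow. */
    enum ConfigPath { ENVIRONMENT_BASED, FILE_BASED }

    /** Applies a properties customizer to a copy, leaving the resolved input untouched. */
    static Map<String, String> customize(
            Map<String, String> resolved, UnaryOperator<Map<String, String>> customizer) {
        return customizer.apply(new HashMap<>(resolved));
    }

    /** The fork: a non-blank otel.config.file selects the file-based path. */
    static ConfigPath choosePath(Map<String, String> properties) {
        String file = properties.get("otel.config.file");
        boolean fileConfigured = file != null && !file.trim().isEmpty();
        return fileConfigured ? ConfigPath.FILE_BASED : ConfigPath.ENVIRONMENT_BASED;
    }

    public static void main(String[] args) {
        Map<String, String> resolved = new HashMap<>();
        resolved.put("otel.config.file", "/etc/otel/sdk-config.yaml"); // hypothetical path
        System.out.println(choosePath(resolved)); // FILE_BASED

        // A framework-supplied customizer can remove the property and so steer the fork.
        Map<String, String> withoutFile = customize(resolved, props -> {
            props.remove("otel.config.file");
            return props;
        });
        System.out.println(choosePath(withoutFile)); // ENVIRONMENT_BASED
    }
}
```

The second half of `main` mirrors the point that customizers contributing to the properties can influence which branch is taken.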
This seems well balanced to me. Existing usages of autoconfigure would continue working as normal, but we have a toolkit to evolve to a more mature solution where we're able to encode complex configuration in a structured configuration model. We continue our practice of providing customization hooks.
Note that once we have a configuration model, we have a reliable and simpler alternative for frameworks to use their own configuration mechanism compared with environment configuration OR the programmatic interface. A framework could synthesize its own configuration sources into a configuration model, and be able to reliably configure an SDK from that model.
Thanks @jack-berg this makes much more sense now.
I think that's a decision frameworks will have to make
Are you aware of any java framework that starts the SDK without the Autoconfiguration? Is this really a practical choice or even something that we want to incentivise?
Why shouldn't the story with opentelemetry be the same?
The files can be used but also other methods. In Quarkus those files are not needed.
This is not ambiguous. The module description is:
Autoconfigure OpenTelemetry SDK from env vars, system properties, and SPI
But it doesn't mention file config, does it?... This was not under the radar when the Autoconfiguration was created, I acknowledge.
Let me suggest a design alternative to highlight my pain points. Ideally, from my point of view, there should be backward compatibility (which seems assured) but also a path forward for existing systems to use the new configuration model without massive rewrites. I imagine some of the current functionality and all future features will use only the new config model.
We could rearrange things in a way that allows existing use but also provides a path forward to existing Autoconfiguration users to use the new, richer model.
You can argue that current configurations will not map perfectly to the new model, but does this mapping have to be perfect? Maybe this transformation can happen painlessly on 90% plus of the cases, no?
I haven't been following all the details here, but one thing that is non-negotiable... everything must be configurable, by an end-user writing code using public APIs. We cannot have "hidden", non-public methods for configuring the SDK that are only accessible via file or env-var configuration.
Are you aware of any java framework that starts the SDK without the Autoconfiguration?
Spring, and specifically the opentelemetry spring boot starter.
I don't have a list of all the frameworks that integrate with OpenTelemetry and how they do configuration.
The files can be used but also other methods.
The same is true here.
But it doesn't mention file config, does it?... This was not under the radar when the Autoconfiguration was created, I acknowledge.
It doesn't need to.
All of file configuration can be implemented as an AutoConfigurationCustomizerProvider. Nothing needs to be built into the autoconfigure module for file configuration to completely override the autoconfiguration output. I just need to implement a customizer with a high order number, and replace the SdkTracerProviderBuilder, SdkMeterProviderBuilder, SdkLoggerProvider builder.
I imagine some of the current functionality and all future features will use only the new config model. We could rearrange things in a way that allows existing use but also provides a path forward to existing Autoconfiguration users to use the new, richer model.
Can you elaborate? I can't tell what you mean by this.
Also see this document explaining some of the original thinking behind SDK configuration design: https://github.com/open-telemetry/opentelemetry-java/blob/main/docs/sdk-configuration.md#goals-and-non-goals
Note that all the emphasis is on the builders as the configuration mechanism.
Note that listed as a "non-goal" is:
Make sure everything is auto-configurable. This is out of the scope of the SDK, and instead is left to auto-configuration layers, which are also described below but not as part of the core SDK. The SDK provides an autoconfiguration extension as an option which is not internal to the main SDK components.
I imagine some of the current functionality and all future features will use only the new config model. We could rearrange things in a way that allows existing use but also provides a path forward to existing Autoconfiguration users to use the new, richer model.
Can you elaborate? I can't tell what you mean by this.
If the rich file model allows configurations that are not possible with env vars, would this mean some of the configurations could only be performed using the file?
If the rich file model allows configurations that are not possible with env vars, would this mean some of the configurations could only be performed using the file?
This is a loaded question.
Whether or not all components of a configuration file are representable with env vars is currently being debated. As discussed in this comment, if we merge the existing environment variable scheme, it's unlikely that everything in file configuration will be representable. But if we ignore the existing environment variable scheme and invent a new one with names that are derived from the configuration model, then it's likely that everything in file configuration will be representable with environment variables. If this is important to you, then I suggest you go advocate for that.
But I want to emphasize that the file configuration mechanism comes paired with a configuration model, which is produced as a result of parsing a configuration file, but which can also be programmatically constructed or edited.
But I want to emphasize that the file configuration mechanism comes paired with a configuration model, which is produced as a result of parsing a configuration file, but which can also be programmatically constructed or edited.
This sounds good to me in principle, but the current discussion seems to be file centric and requiring a file to work. At least in the Autoconfigure it currently requires a file to work.
Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?
Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?
Is this what you are looking for, or something slightly different?
https://github.com/open-telemetry/opentelemetry-java/issues/6170#issuecomment-1908873798
- We provide a new SPI callback hook allowing arbitrary customization of the configuration model
Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?
Is this what you are looking for, or something slightly different?
- We provide a new SPI callback hook allowing arbitrary customization of the configuration model
I need to see the details, but that could work.
This sounds good to me in principle, but the current discussion seems to be file centric and requiring a file to work. At least in the Autoconfigure it currently requires a file to work. Could we get a method where we pass not a file but a configuration object to create the SDK in the same fashion the file does?
See the file configuration spec requirement for a configuration model, and for a method called create which accepts a configuration model and returns configured SDK components.
The Java embodiment of the model is a (currently generated) class called OpenTelemetryConfiguration, which this create method accepts.
The idea behind an SPI would be something like:
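public interface ConfigurationModelCustomizerProvider {
  // Receives the parsed model and returns the (possibly modified) model used to configure the SDK.
  OpenTelemetryConfiguration customize(OpenTelemetryConfiguration model);
}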
Giving implementations full ability to customize the configuration model before it is used to configure the SDK.
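As a rough illustration of how such a hook might be applied, the sketch below runs a list of customizers over a model in ascending order. Everything in it is an assumption made for the example: `ConfigModel` is a tiny stand-in for the generated OpenTelemetryConfiguration class, and `ModelCustomizer`, its `order()` method, `withOrder`, and `applyAll` are hypothetical names, not part of any released API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ModelCustomizerSketch {

    /** Hypothetical stand-in for the generated OpenTelemetryConfiguration class. */
    static final class ConfigModel {
        final Map<String, String> resourceAttributes = new LinkedHashMap<>();
    }

    /** Hypothetical hook shaped like the interface proposed above, plus an ordering hint. */
    interface ModelCustomizer {
        ConfigModel customize(ConfigModel model);

        default int order() {
            return 0;
        }
    }

    /** Convenience factory so a lambda can be paired with an explicit order. */
    static ModelCustomizer withOrder(int order, UnaryOperator<ConfigModel> fn) {
        return new ModelCustomizer() {
            @Override
            public ConfigModel customize(ConfigModel model) {
                return fn.apply(model);
            }

            @Override
            public int order() {
                return order;
            }
        };
    }

    /** Applies customizers in ascending order; later ones see earlier results. */
    static ConfigModel applyAll(ConfigModel parsed, List<ModelCustomizer> customizers) {
        List<ModelCustomizer> sorted = new ArrayList<>(customizers);
        sorted.sort(Comparator.comparingInt(ModelCustomizer::order));
        ConfigModel current = parsed;
        for (ModelCustomizer customizer : sorted) {
            current = customizer.customize(current);
        }
        return current;
    }

    public static void main(String[] args) {
        ConfigModel parsed = new ConfigModel();
        parsed.resourceAttributes.put("service.name", "from-file");

        ModelCustomizer frameworkOverride = withOrder(10, model -> {
            model.resourceAttributes.put("service.name", "from-framework");
            return model;
        });
        ModelCustomizer addEnvironment = withOrder(0, model -> {
            model.resourceAttributes.putIfAbsent("deployment.environment", "staging");
            return model;
        });

        ConfigModel result = applyAll(parsed, List.of(frameworkOverride, addEnvironment));
        System.out.println(result.resourceAttributes);
    }
}
```

A framework could use a hook of this shape either to adjust a model parsed from a file or to build one entirely from its own configuration sources.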
This issue is intended to be an umbrella to steer the implementation of the File Configuration under the Java SDK and related projects.
The file configuration of the SDK has been added to the project in this PR: https://github.com/open-telemetry/opentelemetry-java/pull/5831 This work must follow the guidelines established under the OpenTelemetry specification for the configuration of the SDK defined here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-configuration.md On the spec, it's mentioned that:
The programmatic interface
The programmatic interface in the Java SDK is the properties supplier defined in the AutoConfiguredOpenTelemetrySdkBuilder as Supplier<Map<String, String>> propertiesSupplier
The file configuration MUST provide a map of properties to this supplier and MUST NOT override the existing auto-configuration interfaces. Namely, if a configuration file is provided, it MUST NOT take precedence over the provided programmatic configurations, oTelConfigs, in this example:
Including existing signal builders, providers and customisers.
Config sources
Multiple configuration sources are defined in the spec without a sense of priorities, however, in Java it's common practice to have a hierarchy of configuration sources:
We can see that it's common practice for configurations to be sourced in many different ways and usually the same property can be set in many different sources.
Major Java frameworks and cloud-based systems in Java assume precedence of env vars and system properties over other configuration methods. This is a common, accepted and even expected practice.
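A minimal sketch of such a hierarchy, using only the JDK, could look like the following. The ordering shown (file, then env vars, then system properties, with later sources winning) is an assumption chosen for illustration; the spec does not define a priority between sources. The `merge` and `normalizeEnvName` helpers are hypothetical, and the endpoint values are made up.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ConfigPrecedenceSketch {

    /**
     * Merges sources listed from lowest to highest priority.
     * A later source overwrites an earlier one for the same key.
     */
    static Map<String, String> merge(List<Map<String, String>> lowestToHighest) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> source : lowestToHighest) {
            merged.putAll(source);
        }
        return merged;
    }

    /** Maps OTEL_EXPORTER_OTLP_ENDPOINT style names to otel.exporter.otlp.endpoint. */
    static String normalizeEnvName(String envName) {
        return envName.toLowerCase(Locale.ROOT).replace('_', '.');
    }

    public static void main(String[] args) {
        Map<String, String> fileValues = Map.of(
            "otel.exporter.otlp.endpoint", "http://from-file:4317",
            "otel.service.name", "checkout");
        Map<String, String> envVars = Map.of(
            normalizeEnvName("OTEL_EXPORTER_OTLP_ENDPOINT"), "http://from-env:4317");
        Map<String, String> systemProperties = Map.of();

        // Assumed order for this sketch: file < env vars < system properties.
        Map<String, String> effective =
            merge(List.of(fileValues, envVars, systemProperties));

        // The env var overrides the file value; the file still supplies the service name.
        System.out.println(effective.get("otel.exporter.otlp.endpoint"));
        System.out.println(effective.get("otel.service.name"));
    }
}
```

With this kind of layering, a value set in a file acts as a default that a deployment can override without editing the file.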
Configuration collisions and unavailability.
If a source is not able to provide an unambiguous value for a particular configuration value, that configuration will be unavailable in that source. This behaviour must be documented and a default value must be provided. Example:
In this case it is not possible to use the env vars to configure 2 different exporters and they will end up with the same address. This must be documented. In the future, if required, support for this could be added by implementing new env. vars. It should be noted however that frameworks integrating OpenTelemetry could find a solution of their own for this problem.
Configuration source independence
According to the OTel spec, a configuration must not be exclusive of a particular configuration source, namely the file configuration.
Broad support
By principle, libraries are much better off not forcing a specific way of configuration on users, but let those decisions be driven by frameworks - this makes library usage a breeze in the frameworks that integrate with said libraries, but also allows power users to provide arbitrary configuration options if desired. Providing a file configuration alternative to "Java main()" standalone applications or the Java agent shouldn't interfere with other types of systems.
Maybe some aspects of the file configuration should be part of the Java agent and not the SDK itself.
There are related discussions on this PR: https://github.com/open-telemetry/opentelemetry-java/pull/5912 And these issues: