Closed chancez closed 2 years ago
I was running into the problem as well having the intention to use my custom otel-collector which forced me to re-define all the env variables. I created a PR to fix this, let me know if something else is needed to get the PR megred: https://github.com/open-telemetry/opentelemetry-helm-charts/pull/465
While implementation we realized in that thread https://github.com/open-telemetry/opentelemetry-helm-charts/pull/465#discussion_r1003388710 that the order of the map keys are not preserved in Helm. With that the resulting order of the env-var list can be deterministic (sorted alphanumerical) but different to the provided order. In consequence you have problems with env variables having dependencies to other variables like the OTEL_RESOURCE_ATTRIBUTES in the defaultEnv list.
There seem to be no clean solution and we either need to find a hacky solution by treating dependent envs special or we accept that variables cannot be overridden.
In order to find a decision, I like to provide my motivation of having the envs overwritable. In our project we want to provide the collector infrastructure as part of the cluster setup. Now I want to leverage the demo app for tutorials and demos by having some few instructions in place to deploy the app using the provided collector and backend. For that instructions I want to provide a pre-configured values.yaml to disable the collector and the jaeger backend in the chart and configure the components to use the provided collector. The result looks like that:
The values.yaml should be as minimal as possible as I need to maintain it, so it should contain only mandatory customizations. The problem now is that I need to maintain ca 200 lines of yaml which are mainly copy/pasted and I need to verify them on every update of the chart vrsion.
Since most of the duplication is coming from components maybe we can get away with only component env var supporting map instead of list? It will still have the problem of not allowing env vars to depend on each other, but at least we don't have any use cases for that right now like default env var does.
/cc @dmitryax @Allex1
That solution should work out but introduces an inconsistency in how to define the two different layers of env variables. I'm fine with that as I also don't see a better solution. Please let me know when/if this can be seen as agreed so that I can start adjusting my PR.
@dmitryax @Allex1 curious to hear your thoughts.
@TylerHelmuth sorry for the delay . TBH even though it seems cumbersome(and I get it's not ideal to maintain a 200 line yaml values file), I actually prefer providing the env vars as an array instead of a dict as that is the way they end up on the pod themselves and the way Kubernetes docs refer to env vars in general. To me it seems that the complexity comes from the way the multiple components of the stack have been written in the first place. Maybe we could rewrite some of the services to use cli arguments instead of env vars, or provide each with a configMap where we can set these values.
@Allex1 @TylerHelmuth My understanding is that it is recommended to configure the otel-sdk from outside via env variables to have a consistent approach across the used languages, see https://opentelemetry.io/docs/reference/specification/sdk-environment-variables. So switching to args sounds not like a recommended approach and with that is bad for a demo app. It is more a flaw of the used package manager which does not support a proper way of overriding arrays.
Unfortunately I don't see any use case where you want to replace all env variables of a component, you always want to change a particular settings or add new ones. And that will not work with Helm.
However, maybe we should go back to the initial discussion of https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423 and think about introducing a dedicated setting just for the collector. So having a template definition which knows the default value for the collector domain and allows to override it via a dedicated setting, but do it only for the basic domain of the collector, not protocol specific. That template is then used in the envs to build the collector URL.
Something like:
observability:
otelcol:
enabled: false
baseDomain: "" # allow to override the definition of newTemplate
components:
adService:
env:
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: 'http://{{ include "otel-demo.newTemplate" . }}:4317'
The downside is, that it will not configure the actual baseDomain/service for the collector if the collector is actually enabled. The service of the collector is using the .fullname template and with that cannot really be influenced from the main chart via a dedicated setting. So probably it will be better to not have it as part of the otelcol section but on root level as already suggested in the linked PR.
I need to review each component, but could OTEL_EXPORTER_OTLP_ENDPOINT be made a default env var? That way to override it you'd only need to modify the default env var list instead of each component's.
That will improve the situation for sure. Only the redis component is not using the env variable, so applying it by default to all will not harm. However, there are some few component (I think 2 at the moment) which additionally specify the trace endpoint (OTEL_EXPORTER_OTLP_TRACES_ENDPOINT). I can imagine that there will be more added in future for example for metrics to showcase how to configure them individual and using the different protocols. They should use by default all the same base domain and again you would need to start to override the envs per component.
but could OTEL_EXPORTER_OTLP_ENDPOINT be made a default env var
Not entirely, not every component uses the same protocol (gRPC vs HTTP), and the ports for these vary, so the same variable needs potentially different values. My PR mentions this.
Another possible option is to keep the list but allow overriding items with some artificial keys, for example
components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT
(or env_OTEL_EXPORTER_OTLP_ENDPOINT
) to override OTEL_EXPORTER_OTLP_ENDPOINT
in adService. We can introduce some helm helpers for that. We should also fail if user provides an override for not existing env var.
@dmitryax I'm not quite following. Are you saying we allow setting a values like --set components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT=some-value
and then the helm chart knows to apply that to the adservice's env list?
Correct, but you still need value
or valueFrom
. e.g.:
components:
adService:
env/OTEL_EXPORTER_OTLP_ENDPOINT:
value: 'http://{{ include "otel-demo.name" . }}-otelcol:4317'
env/OTEL_K8S_POD_NAME:
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
We should also fail if user provides an override for not existing env var.
Actually instead of failing we can append new env var to the list
If that idea seems too hacky, another option is to have a map with keys that are used only for preserving the order and not being used in the templating at all. This one is explicit and should be easy to use:
default:
env:
1_OTEL_SERVICE_NAME:
name: OTEL_SERVICE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: "metadata.labels['app.kubernetes.io/component']"
2_OTEL_K8S_NAMESPACE:
name: OTEL_K8S_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
3_OTEL_K8S_NODE_NAME:
name: OTEL_K8S_NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
4_OTEL_K8S_POD_NAME:
name: OTEL_K8S_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
5_OTEL_RESOURCE_ATTRIBUTES:
name: OTEL_RESOURCE_ATTRIBUTES
value: service.name=$(OTEL_SERVICE_NAME),k8s.namespace.name=$(OTEL_K8S_NAMESPACE),k8s.node.name=$(OTEL_K8S_NODE_NAME),k8s.pod.name=$(OTEL_K8S_POD_NAME)
@dmitryax if we allow something like .Values.components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT
then I would opt instead for the solution @chancez proposes in https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423. Both require an extra field specifically for setting the collector endpoint.
I think the numeric ordering of default env var keys would also work (although its hacky as well).
The .Values.components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT
approach does not introduce any extra fields in the default values.yaml
, it'll stay exactly the same as it is right now. We will just support overriding the env
items with these artificial fields if users supply them in their custom values.yamls.
If we are confident that OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_TRACES_ENDPOINT are the only env vars that user would need to override, then I'm good with applying https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423
Also instead of those compound env/...
keys we can have envOverride
group which is empty by default. Custom user config would be something like this if they want to change OTEL_EXPORTER_OTLP_ENDPOINT
and OTEL_K8S_POD_NAME
:
components:
adService:
envOverride:
OTEL_EXPORTER_OTLP_ENDPOINT:
value: 'http://{{ include "otel-demo.name" . }}-otelcol:4317'
OTEL_K8S_POD_NAME:
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
@dmitryax Ok I see what you mean about supporting overriding values. If we do a generic solution (which I like bc with the way the demo changes I doubt the collector would be the only value to change in the future), how are you thinking we adjust the service's env
list? A helper that loops through the list and replaces matching name
fields?
A helper that loops through the list and replaces matching name fields?
Yes, and appends an env var in case of mismatch.
We can also support null
for removals
@chancez @a-thaler is this idea something either of you have time to work on?
Sounds like a good idea and I like to work on that this week. To be aligned here, I imagine such final values.yaml.
default:
# list of environment variables applied to all components
env:
- name: OTEL_SERVICE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: "metadata.labels['app.kubernetes.io/component']"
- ...
# list of environment variables which get applied to all components additional to the list provided with the "env" section.
# Use this section to override or add environment variables selectively
envOverrides: []
# - name: OTEL_SERVICE_NAME
# value: "myValue"
components:
adService:
# list of environment variables applied to the adservice component, additional to the default.envs
env:
- name: OTEL_EXPORTER_OTLP_ENDPOINT:
value: 'http://{{ include "otel-demo.name" . }}-otelcol:4317'
# list of environment variables which get applied to the adservice component additional to the list provided with the "env" section.
# Use this section to override or add environment variables selectively
envOverride: []
# - name: OTEL_EXPORTER_OTLP_ENDPOINT:
# value: 'http://myCustomDomain:4317'
I can try to support overriding a default.env in a components envOverride as well, as outlined in the example of @dmitryax, so overridding the OTEL_SERVICE_NAME in the component, let's see
@TylerHelmuth I don't have time atm, but someone else can pick it up
@a-thaler sounds good. Thanks! I was thinking of envOverrides
to be a map, but list probably is a better option, just in case user wants additional env vars to preserve their order
Also to add: I think the approach suggested by @a-thaler would for for me and, using a list for the override makes sense to me.
Created PR https://github.com/open-telemetry/opentelemetry-helm-charts/pull/484 which implements the discussed proposal. As outlined in the PR description I skipped support for overriding default envs in the component specific envOverride section as I found no clean solution. Please have a look and also check if you have a better idea on how to implement the actual merge logic while preserving the order.
The issue is resolved in #484
Currently it's challenging add "extra" env-vars to individual components because it overwrites their own env-vars, meaning you have to copy the existing env-var lists and put those in your own values.yaml. This is problematic because many of these env-vars are necessary for the component to even run, such as it's listen port, and it also means users doing this won't get new env-vars as the chart is updated upstream.
It was suggested in https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423#discussion_r995157372 to that instead of defining env-vars in a list, we could use a map which would allow better merging behavior, instead of overwriting all env-vars, we can merge env-vars on the env-var name, making it possible to add new values or overwrite existing values per component without having to worry about copying the defaults.
This would then make it easy to override values like OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_TRACES_ENDPOINT so that users can provide their own collector endpoint.