open-telemetry / opentelemetry-helm-charts

OpenTelemetry Helm Charts
https://opentelemetry.io
Apache License 2.0
397 stars 486 forks source link

[DEMO] Support updating environment variables instead of overwriting env-vars #438

Closed chancez closed 2 years ago

chancez commented 2 years ago

Currently it's challenging add "extra" env-vars to individual components because it overwrites their own env-vars, meaning you have to copy the existing env-var lists and put those in your own values.yaml. This is problematic because many of these env-vars are necessary for the component to even run, such as it's listen port, and it also means users doing this won't get new env-vars as the chart is updated upstream.

It was suggested in https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423#discussion_r995157372 to that instead of defining env-vars in a list, we could use a map which would allow better merging behavior, instead of overwriting all env-vars, we can merge env-vars on the env-var name, making it possible to add new values or overwrite existing values per component without having to worry about copying the defaults.

This would then make it easy to override values like OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_TRACES_ENDPOINT so that users can provide their own collector endpoint.

a-thaler commented 2 years ago

I was running into the problem as well having the intention to use my custom otel-collector which forced me to re-define all the env variables. I created a PR to fix this, let me know if something else is needed to get the PR megred: https://github.com/open-telemetry/opentelemetry-helm-charts/pull/465

a-thaler commented 2 years ago

While implementation we realized in that thread https://github.com/open-telemetry/opentelemetry-helm-charts/pull/465#discussion_r1003388710 that the order of the map keys are not preserved in Helm. With that the resulting order of the env-var list can be deterministic (sorted alphanumerical) but different to the provided order. In consequence you have problems with env variables having dependencies to other variables like the OTEL_RESOURCE_ATTRIBUTES in the defaultEnv list.

There seem to be no clean solution and we either need to find a hacky solution by treating dependent envs special or we accept that variables cannot be overridden.

In order to find a decision, I like to provide my motivation of having the envs overwritable. In our project we want to provide the collector infrastructure as part of the cluster setup. Now I want to leverage the demo app for tutorials and demos by having some few instructions in place to deploy the app using the provided collector and backend. For that instructions I want to provide a pre-configured values.yaml to disable the collector and the jaeger backend in the chart and configure the components to use the provided collector. The result looks like that:

Expand me observability: otelcol: enabled: false jaeger: enabled: false components: redis: enabled: true adService: enabled: true env: - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: AD_SERVICE_PORT value: "8080" cartService: enabled: true env: - name: ASPNETCORE_URLS value: http://*:8080 - name: REDIS_ADDR value: '{{ include "otel-demo.name" . }}-redis:6379' - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: CART_SERVICE_PORT value: "8080" checkoutService: enabled: true env: - name: CART_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-cartservice:8080' - name: CURRENCY_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-currencyservice:8080' - name: PAYMENT_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-paymentservice:8080' - name: PRODUCT_CATALOG_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-productcatalogservice:8080' - name: SHIPPING_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-shippingservice:8080' - name: EMAIL_SERVICE_ADDR value: 'http://{{ include "otel-demo.name" . }}-emailservice:8080' - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: CHECKOUT_SERVICE_PORT value: "8080" currencyService: enabled: true env: - name: PORT value: "8080" - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: CURRENCY_SERVICE_PORT value: "8080" emailService: enabled: true env: - name: APP_ENV value: production - name: PORT value: "8080" - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT value: 'http://my-collector.my-system:4318/v1/traces' - name: EMAIL_SERVICE_PORT value: "8080" featureflagService: enabled: true env: - name: FEATURE_FLAG_GRPC_SERVICE_PORT value: "50053" - name: FEATURE_FLAG_SERVICE_PORT value: "8081" - name: OTEL_EXPORTER_OTLP_TRACES_PROTOCOL value: grpc - name: DATABASE_URL value: 'ecto://ffs:ffs@{{ include "otel-demo.name" . }}-ffspostgres:5432/ffs' - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' ffsPostgres: enabled: true env: - name: POSTGRES_DB value: ffs - name: POSTGRES_PASSWORD value: ffs - name: POSTGRES_USER value: ffs - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' frontend: enabled: true env: - name: FRONTEND_ADDR value: :8080 - name: AD_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-adservice:8080' - name: CART_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-cartservice:8080' - name: CHECKOUT_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-checkoutservice:8080' - name: CURRENCY_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-currencyservice:8080' - name: PRODUCT_CATALOG_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-productcatalogservice:8080' - name: RECOMMENDATION_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-recommendationservice:8080' - name: SHIPPING_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-shippingservice:8080' - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: FRONTEND_PORT value: "8080" loadgenerator: enabled: true env: - name: FRONTEND_ADDR value: '{{ include "otel-demo.name" . }}-frontend:8080' - name: LOCUST_WEB_PORT value: "8089" - name: LOCUST_USERS value: "10" - name: LOCUST_HOST value: "http://$(FRONTEND_ADDR)" - name: LOCUST_HEADLESS value: "false" - name: LOCUST_AUTOSTART value: "true" - name: PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION value: python - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: LOADGENERATOR_PORT value: "8089" paymentService: enabled: true env: - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: PAYMENT_SERVICE_PORT value: "8080" productCatalogService: enabled: true env: - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: PRODUCT_CATALOG_SERVICE_PORT value: "8080" - name: FEATURE_FLAG_GRPC_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-featureflagservice:50053' recommendationService: enabled: true env: - name: OTEL_PYTHON_LOG_CORRELATION value: "true" - name: PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION value: python - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: FEATURE_FLAG_GRPC_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-featureflagservice:50053' - name: RECOMMENDATION_SERVICE_PORT value: "8080" - name: PRODUCT_CATALOG_SERVICE_ADDR value: '{{ include "otel-demo.name" . }}-productcatalogservice:8080' shippingService: enabled: true env: - name: PORT value: "8080" - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'http://my-collector.my-system:4317' - name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT value: 'http://my-collector.my-system:4317' - name: SHIPPING_SERVICE_PORT value: "8080" - name: QUOTE_SERVICE_ADDR value: 'http://{{ include "otel-demo.name" . }}-quoteservice:8080' quoteService: enabled: true env: - name: OTEL_TRACES_SAMPLER value: "parentbased_always_on" - name: OTEL_TRACES_EXPORTER value: "otlp" - name: OTEL_EXPORTER_OTLP_TRACES_PROTOCOL value: "grpc" - name: OTEL_PHP_TRACES_PROCESSOR value: "simple" - name: OTEL_EXPORTER_OTLP_ENDPOINT value: 'my-collector.my-system:4317' - name: QUOTE_SERVICE_PORT value: "8080"

The values.yaml should be as minimal as possible as I need to maintain it, so it should contain only mandatory customizations. The problem now is that I need to maintain ca 200 lines of yaml which are mainly copy/pasted and I need to verify them on every update of the chart vrsion.

TylerHelmuth commented 2 years ago

Since most of the duplication is coming from components maybe we can get away with only component env var supporting map instead of list? It will still have the problem of not allowing env vars to depend on each other, but at least we don't have any use cases for that right now like default env var does.

TylerHelmuth commented 2 years ago

/cc @dmitryax @Allex1

a-thaler commented 2 years ago

That solution should work out but introduces an inconsistency in how to define the two different layers of env variables. I'm fine with that as I also don't see a better solution. Please let me know when/if this can be seen as agreed so that I can start adjusting my PR.

TylerHelmuth commented 2 years ago

@dmitryax @Allex1 curious to hear your thoughts.

Allex1 commented 2 years ago

@TylerHelmuth sorry for the delay . TBH even though it seems cumbersome(and I get it's not ideal to maintain a 200 line yaml values file), I actually prefer providing the env vars as an array instead of a dict as that is the way they end up on the pod themselves and the way Kubernetes docs refer to env vars in general. To me it seems that the complexity comes from the way the multiple components of the stack have been written in the first place. Maybe we could rewrite some of the services to use cli arguments instead of env vars, or provide each with a configMap where we can set these values.

a-thaler commented 2 years ago

@Allex1 @TylerHelmuth My understanding is that it is recommended to configure the otel-sdk from outside via env variables to have a consistent approach across the used languages, see https://opentelemetry.io/docs/reference/specification/sdk-environment-variables. So switching to args sounds not like a recommended approach and with that is bad for a demo app. It is more a flaw of the used package manager which does not support a proper way of overriding arrays.

Unfortunately I don't see any use case where you want to replace all env variables of a component, you always want to change a particular settings or add new ones. And that will not work with Helm.

However, maybe we should go back to the initial discussion of https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423 and think about introducing a dedicated setting just for the collector. So having a template definition which knows the default value for the collector domain and allows to override it via a dedicated setting, but do it only for the basic domain of the collector, not protocol specific. That template is then used in the envs to build the collector URL.

Something like:

observability:
  otelcol:
    enabled: false
    baseDomain: "" # allow to override the definition of newTemplate

components:
  adService:
    env:
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: 'http://{{ include "otel-demo.newTemplate" . }}:4317'

The downside is, that it will not configure the actual baseDomain/service for the collector if the collector is actually enabled. The service of the collector is using the .fullname template and with that cannot really be influenced from the main chart via a dedicated setting. So probably it will be better to not have it as part of the otelcol section but on root level as already suggested in the linked PR.

TylerHelmuth commented 2 years ago

I need to review each component, but could OTEL_EXPORTER_OTLP_ENDPOINT be made a default env var? That way to override it you'd only need to modify the default env var list instead of each component's.

a-thaler commented 2 years ago

That will improve the situation for sure. Only the redis component is not using the env variable, so applying it by default to all will not harm. However, there are some few component (I think 2 at the moment) which additionally specify the trace endpoint (OTEL_EXPORTER_OTLP_TRACES_ENDPOINT). I can imagine that there will be more added in future for example for metrics to showcase how to configure them individual and using the different protocols. They should use by default all the same base domain and again you would need to start to override the envs per component.

chancez commented 2 years ago

but could OTEL_EXPORTER_OTLP_ENDPOINT be made a default env var

Not entirely, not every component uses the same protocol (gRPC vs HTTP), and the ports for these vary, so the same variable needs potentially different values. My PR mentions this.

dmitryax commented 2 years ago

Another possible option is to keep the list but allow overriding items with some artificial keys, for example components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT (or env_OTEL_EXPORTER_OTLP_ENDPOINT) to override OTEL_EXPORTER_OTLP_ENDPOINT in adService. We can introduce some helm helpers for that. We should also fail if user provides an override for not existing env var.

TylerHelmuth commented 2 years ago

@dmitryax I'm not quite following. Are you saying we allow setting a values like --set components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT=some-value and then the helm chart knows to apply that to the adservice's env list?

dmitryax commented 2 years ago

Correct, but you still need value or valueFrom. e.g.:

components:
  adService:
    env/OTEL_EXPORTER_OTLP_ENDPOINT:
      value: 'http://{{ include "otel-demo.name" . }}-otelcol:4317'
    env/OTEL_K8S_POD_NAME:
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
dmitryax commented 2 years ago

We should also fail if user provides an override for not existing env var.

Actually instead of failing we can append new env var to the list

dmitryax commented 2 years ago

If that idea seems too hacky, another option is to have a map with keys that are used only for preserving the order and not being used in the templating at all. This one is explicit and should be easy to use:

default:
  env:
    1_OTEL_SERVICE_NAME:
      name: OTEL_SERVICE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: "metadata.labels['app.kubernetes.io/component']"
    2_OTEL_K8S_NAMESPACE:
      name: OTEL_K8S_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    3_OTEL_K8S_NODE_NAME:
      name: OTEL_K8S_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    4_OTEL_K8S_POD_NAME:
      name: OTEL_K8S_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    5_OTEL_RESOURCE_ATTRIBUTES:
      name: OTEL_RESOURCE_ATTRIBUTES
      value: service.name=$(OTEL_SERVICE_NAME),k8s.namespace.name=$(OTEL_K8S_NAMESPACE),k8s.node.name=$(OTEL_K8S_NODE_NAME),k8s.pod.name=$(OTEL_K8S_POD_NAME)
TylerHelmuth commented 2 years ago

@dmitryax if we allow something like .Values.components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT then I would opt instead for the solution @chancez proposes in https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423. Both require an extra field specifically for setting the collector endpoint.

I think the numeric ordering of default env var keys would also work (although its hacky as well).

dmitryax commented 2 years ago

The .Values.components.adService.env/OTEL_EXPORTER_OTLP_ENDPOINT approach does not introduce any extra fields in the default values.yaml, it'll stay exactly the same as it is right now. We will just support overriding the env items with these artificial fields if users supply them in their custom values.yamls.

dmitryax commented 2 years ago

If we are confident that OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_TRACES_ENDPOINT are the only env vars that user would need to override, then I'm good with applying https://github.com/open-telemetry/opentelemetry-helm-charts/pull/423

dmitryax commented 2 years ago

Also instead of those compound env/... keys we can have envOverride group which is empty by default. Custom user config would be something like this if they want to change OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_K8S_POD_NAME:

components:
  adService:
    envOverride:
      OTEL_EXPORTER_OTLP_ENDPOINT:
        value: 'http://{{ include "otel-demo.name" . }}-otelcol:4317'
      OTEL_K8S_POD_NAME:
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.name
TylerHelmuth commented 2 years ago

@dmitryax Ok I see what you mean about supporting overriding values. If we do a generic solution (which I like bc with the way the demo changes I doubt the collector would be the only value to change in the future), how are you thinking we adjust the service's env list? A helper that loops through the list and replaces matching name fields?

dmitryax commented 2 years ago

A helper that loops through the list and replaces matching name fields?

Yes, and appends an env var in case of mismatch.

We can also support null for removals

TylerHelmuth commented 2 years ago

@chancez @a-thaler is this idea something either of you have time to work on?

a-thaler commented 2 years ago

Sounds like a good idea and I like to work on that this week. To be aligned here, I imagine such final values.yaml.

default:
  # list of environment variables applied to all components
  env:
    - name: OTEL_SERVICE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: "metadata.labels['app.kubernetes.io/component']"
   - ...
   # list of environment variables which get applied to all components additional to the list provided with the "env" section.
   # Use this section to override or add environment variables selectively 
   envOverrides: []
   # - name: OTEL_SERVICE_NAME
   #   value: "myValue"

components:
  adService:
    # list of environment variables applied to the adservice component, additional to the default.envs
    env:
       - name: OTEL_EXPORTER_OTLP_ENDPOINT:
          value: 'http://{{ include "otel-demo.name" . }}-otelcol:4317'

    # list of environment variables which get applied to the adservice component additional to the list provided with the "env" section.
    # Use this section to override or add environment variables selectively 
    envOverride: []
    # - name: OTEL_EXPORTER_OTLP_ENDPOINT:
    #    value: 'http://myCustomDomain:4317'

I can try to support overriding a default.env in a components envOverride as well, as outlined in the example of @dmitryax, so overridding the OTEL_SERVICE_NAME in the component, let's see

chancez commented 2 years ago

@TylerHelmuth I don't have time atm, but someone else can pick it up

dmitryax commented 2 years ago

@a-thaler sounds good. Thanks! I was thinking of envOverrides to be a map, but list probably is a better option, just in case user wants additional env vars to preserve their order

chancez commented 2 years ago

Also to add: I think the approach suggested by @a-thaler would for for me and, using a list for the override makes sense to me.

a-thaler commented 2 years ago

Created PR https://github.com/open-telemetry/opentelemetry-helm-charts/pull/484 which implements the discussed proposal. As outlined in the PR description I skipped support for overriding default envs in the component specific envOverride section as I found no clean solution. Please have a look and also check if you have a better idea on how to implement the actual merge logic while preserving the order.

dmitryax commented 2 years ago

The issue is resolved in #484