open-telemetry / opentelemetry-demo

This repository contains the OpenTelemetry Astronomy Shop, a microservice-based distributed system intended to illustrate the implementation of OpenTelemetry in a near real-world environment.
https://opentelemetry.io/docs/demo/
Apache License 2.0

Toggle Feature Flags via Helm #1304

Closed mreider closed 7 months ago

mreider commented 9 months ago

Feature Request

Problem

OpenTelemetry is often used for performance benchmarking (comparing the performance of two deployments). Unfortunately the OpenTelemetry Demo cannot demonstrate this without adding the feature flag service to the list of components that can be modified via Chart parameters.

https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-demo#component-parameters

Details

Teaching someone about contemporary, cloud-native performance benchmarking begins with a discussion of GitOps and CI/CD. Whether this is ArgoCD, Flux, Jenkins, GitHub Actions, etc., does not matter. What matters is that one deployment behaves one way, and another deployment behaves another. These concepts are complicated enough. A learner should not be expected to jump into the feature flag UI to trigger a failure scenario and imagine this represents a new deployment. The deployment itself should be observable.

The first step is exposing feature flags in the Helm chart. Nothing more sophisticated can happen before that single change. Beyond performance benchmarking, this change allows us to consider a handful of other scenarios like canary deployments, rolling updates, A/B testing, dark launches, traffic shadowing, auto-scaling, etc.

Potential Solution

Pretty simple really. The Feature Flag service would be available as a component in the Helm chart, and we can toggle features on / off easily. Of course, new feature flags (ones a user added via the UI that are not in Git) could also be toggled, i.e. the names of the feature flags are not hardcoded. It would also be nice to have some error handling for feature flags that don't exist, but this is just a nice-to-have.

Alternatives I've considered

I looked at the source code for a bit, thinking I might submit a pull request, but Elixir is not in my comfort zone. I also thought about writing a script that execs a Postgres update in the feature flag container, but this is really hacky.

austinlparker commented 9 months ago

Since the feature flag service is available through an API, could you have the flags be added/modified by calling those routes through curl? That was the general idea we had when building it initially.

mreider commented 9 months ago

Absolutely, what are the API endpoints? I didn't see them documented, unless I'm missing something.

puckpuck commented 9 months ago

We defined the API spec for this. It's in the demo.proto, but I'm confident the FeatureFlag service does not implement the Update/Delete API calls. Only GetFlag is implemented in the service, and all updates/deletes are done via Elixir Phoenix for the UI and Ecto to do the DB stuff.

I'm also wondering if doing this via an API call is the right approach because you need to make the API call only after the FeatureFlag service starts successfully, which can get tricky in a K8s world. I do like the idea of externalizing this out of the FeatureFlag service itself. Right now, all OOTB flags are hardcoded. Having this done in a config file loaded on startup would be much nicer. This config file can then be altered pre-deployment through the Helm chart or a file on the local disk for docker.
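As a rough illustration of the "config file loaded on startup" idea, here is a minimal Python sketch. It is not how the FeatureFlag service works (that service is written in Elixir); the JSON layout, the `FeatureFlag` fields, and the flag name `exampleFailure` are all assumptions made up for this example.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FeatureFlag:
    name: str
    description: str
    enabled: bool


def parse_flags(text: str) -> dict[str, FeatureFlag]:
    """Parse a JSON object shaped like {"flagName": {"description": "...", "enabled": true}}.

    The file layout is hypothetical; it only illustrates moving flag
    definitions out of code and into an editable file.
    """
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("flag file must contain a JSON object at the top level")

    flags: dict[str, FeatureFlag] = {}
    for name, spec in raw.items():
        # Fail at startup on malformed entries rather than on first lookup.
        if not isinstance(spec, dict) or not isinstance(spec.get("enabled"), bool):
            raise ValueError(f"flag {name!r}: 'enabled' must be true or false")
        flags[name] = FeatureFlag(name, str(spec.get("description", "")), spec["enabled"])
    return flags


def load_flags(path: str) -> dict[str, FeatureFlag]:
    """Read the flag file once, at service startup."""
    return parse_flags(Path(path).read_text(encoding="utf-8"))
```

Because flag names come from the file rather than from code, a file rendered by a Helm template or edited on local disk could add or toggle flags without a rebuild.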

mreider commented 9 months ago

Yes, I saw in the source code that you can only get the flags, not update them. If you can add this to the API that's likely easier than adding configs to the helm chart, but yes, ultimately, the helm configs are the better solution.

austinlparker commented 9 months ago

What about using OpenFeature? (https://openfeature.dev/docs/reference/intro)

It has an operator, so we could use k8s resources as feature flags... would just need to make the FF service a provider, and switch out the feature flag library w/OpenFeature

julianocosta89 commented 9 months ago

+1

luizgribeiro commented 9 months ago

2 cents on this...

OpenFeature would be the backend-agnostic feature flag SDK (pretty similar to the OTel idea) used in each service that needs feature flag resolution. It would still be necessary to have a feature flag provider/service running to attach it to, but it would definitely do the job. It would also provide additional info about feature flag evaluation, context-conditional flag evaluation, and the possibility of configuring default flag values in case the service is not working.

OpenFeature also has the flagd project, which is a provider that can supply flag values. This one has the operator and can define flag values at the deployment level.
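To make the SDK/provider split and the default-value fallback concrete, here is a small Python sketch. It deliberately does not use the real OpenFeature SDK: `FlagProvider`, `InMemoryProvider`, `UnavailableProvider`, `FlagClient`, and the flag name `exampleFailure` are hypothetical names invented for this illustration.

```python
from typing import Protocol


class FlagProvider(Protocol):
    """Anything that can resolve a boolean flag (a flag service, a file, ...)."""

    def resolve_boolean(self, flag_key: str) -> bool: ...


class InMemoryProvider:
    """Stand-in for a real backend; raises KeyError for unknown flags."""

    def __init__(self, flags: dict[str, bool]) -> None:
        self._flags = flags

    def resolve_boolean(self, flag_key: str) -> bool:
        return self._flags[flag_key]


class UnavailableProvider:
    """Simulates a flag backend that cannot be reached."""

    def resolve_boolean(self, flag_key: str) -> bool:
        raise ConnectionError("flag backend unreachable")


class FlagClient:
    """Backend-agnostic client: services call this, never the provider directly."""

    def __init__(self, provider: FlagProvider) -> None:
        self._provider = provider

    def get_boolean_value(self, flag_key: str, default: bool) -> bool:
        # Any provider failure (unknown flag, outage) falls back to the
        # caller-supplied default so the service keeps working.
        try:
            return self._provider.resolve_boolean(flag_key)
        except Exception:
            return default
```

Swapping `InMemoryProvider` for another provider changes where values come from without touching the calling services, which is the property described above.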

puckpuck commented 9 months ago

Though we should move to an OpenFeature framework for this, that lift is much larger than the valid ask by this issue. Let's take up that discussion separately to go over all the ramifications and various changes that need to be done to move to OpenFeature.

#1319 resolves this issue, which is to allow the user to toggle feature flags via Helm. I moved Feature Flag initialization to an SQL script loaded by the Postgres database instead of in code using Ecto migrations within the service itself. In the process, I also simplified much of the Feature Flag data structure, removing the sequence ID and timestamp fields. This made the SQL script to create the feature flags much simpler and friendlier to edit.

The SQL scripts are mounted to the container at runtime to be user-modified before starting the demo.
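For anyone who would rather generate such a script than edit it by hand, here is a minimal Python sketch that renders a seed script from a dictionary of flags. The table name `featureflags`, its three columns, and the flag name `exampleFailure` are assumptions for illustration only; check the actual script in the repository for the real schema.

```python
def sql_literal(value: str) -> str:
    """Quote a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_seed_script(flags: dict[str, tuple[str, bool]]) -> str:
    """Render CREATE TABLE plus one INSERT per flag.

    `flags` maps a flag name to (description, enabled). The schema below
    is hypothetical and only mirrors the simplified structure described
    above: no sequence ID and no timestamp fields.
    """
    lines = [
        "CREATE TABLE featureflags (",
        "    name TEXT PRIMARY KEY,",
        "    description TEXT NOT NULL,",
        "    enabled BOOLEAN NOT NULL",
        ");",
    ]
    # Sort by name so the generated file is stable and diffs cleanly in Git.
    for name, (description, enabled) in sorted(flags.items()):
        state = "TRUE" if enabled else "FALSE"
        lines.append(
            "INSERT INTO featureflags (name, description, enabled) VALUES "
            f"({sql_literal(name)}, {sql_literal(description)}, {state});"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_seed_script({"exampleFailure": ("Fail one request path", True)}))
```

Toggling a flag then amounts to flipping one boolean in the input and regenerating the file before the database container starts.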

I still need to create a Helm chart PR to enable this so we can close this issue.

mreider commented 7 months ago

Thanks @puckpuck for adding the SQL script(s). Curious if you plan on making a Helm PR and if you'd like my help in adding this to the docs. 🙏

puckpuck commented 7 months ago

I have a draft PR already for Helm to do this. It does require a 1.8.x release to be built and published, which I suspect we are ready to do now.

austinlparker commented 7 months ago

We'll be able to support this via the OpenFeature operator in 1.9+, see #1388

puckpuck commented 7 months ago

When the Helm chart PR for Demo 1.8.0 is merged, this will be supported there as well.

puckpuck commented 7 months ago

We have a Helm chart release that addresses this.