keptn / lifecycle-toolkit

Toolkit for cloud-native application lifecycle management
https://keptn.sh
Apache License 2.0

Research possible integration with Flagger #3050

Open odubajDT opened 5 months ago

odubajDT commented 5 months ago

Goal

Research and propose a strategy for integrating Keptn with Flagger. Look at the possibility of using Keptn metrics for release analysis, and at using Keptn observability to monitor progressive delivery.

Questions for the research

If possible, let's create a small PoC as part of this research (in the points where it makes sense).

bacherfl commented 3 months ago

https://github.com/keptn/lifecycle-toolkit/pull/3371 contains a simple PoC

The PoC shows an integration of Keptn Metrics into a Flagger Canary. In this example, we use the Prometheus endpoint provided by Keptn (i.e. the metrics-operator), which serves the values of all KeptnMetrics.

This way, we are able to use a Flagger MetricTemplate of type prometheus, which retrieves the value from a Prometheus instance that has access to the KeptnMetrics.

The example is based on the Istio Canary Deployments tutorial provided in the Flagger docs.

The difference from the tutorial is that instead of using the request-duration metric provided by Istio via Prometheus, we refer to a KeptnMetric called response-time. The Flagger metrics provider in this case is still prometheus.
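To make the mechanics concrete, here is a minimal Python sketch of what a prometheus-type metric check boils down to: run an instant query against the Prometheus HTTP API, take the first sample, and compare it to a threshold. This is not Flagger's code. The Prometheus address, the series name `response_time`, and the 0.5 threshold are assumptions for illustration; the name under which the metrics-operator exposes a KeptnMetric may differ, so check its metrics endpoint before relying on it.

```python
import json
import urllib.parse
import urllib.request


def query_prometheus(address: str, query: str) -> dict:
    """Run an instant query via the Prometheus HTTP API (/api/v1/query)."""
    url = f"{address}/api/v1/query?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def first_sample(prom_response: dict) -> float:
    """Extract the first sample value from an instant-query response."""
    if prom_response.get("status") != "success":
        raise ValueError(f"query failed: {prom_response}")
    result = prom_response["data"]["result"]
    if not result:
        raise ValueError("no samples returned")
    # Each vector element carries value = [timestamp, "<value as string>"].
    return float(result[0]["value"][1])


def within_range(value: float, minimum=None, maximum=None) -> bool:
    """Return True if value respects the optional lower and upper bounds."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


# Canned response in the Prometheus instant-query format, so the sketch runs
# without a cluster. The series name "response_time" is an assumption.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "response_time"}, "value": [1700000000, "0.21"]}
        ],
    },
}

value = first_sample(sample_response)
canary_ok = within_range(value, maximum=0.5)  # hypothetical threshold
```

Against a real setup, `query_prometheus("http://prometheus.istio-system:9090", "response_time")` would replace the canned response; both the address and the query are placeholders.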

An interesting idea would be to contribute to Flagger by adding a keptn metrics provider to their provider implementations. This would also open up the possibility of using Keptn Analyses in Flagger, which might be a valuable addition that benefits both projects.
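As a rough illustration of what such a provider would do, the Python sketch below resolves a query directly to the value stored on a KeptnMetric resource, skipping Prometheus entirely. Everything here is hypothetical: the class name, the `<namespace>/<name>` query format, and the injected `fetch` callable are invented for the sketch, and it assumes the KeptnMetric exposes its latest value as a string under `status.value`. An actual contribution would be written in Go against Flagger's own provider interface.

```python
from typing import Callable


class KeptnMetricProvider:
    """Hypothetical provider that resolves a query to a KeptnMetric value."""

    def __init__(self, fetch: Callable[[str, str], dict]):
        # fetch(namespace, name) returns the KeptnMetric object as a dict,
        # e.g. read from the Kubernetes API. It is injected so the sketch
        # can run without a cluster.
        self._fetch = fetch

    def run_query(self, query: str) -> float:
        """Resolve '<namespace>/<name>' to the metric's current value."""
        namespace, _, name = query.partition("/")
        if not namespace or not name:
            raise ValueError("query must look like '<namespace>/<name>'")
        metric = self._fetch(namespace, name)
        # Assumption: the latest value is stored as a string in status.value.
        value = metric.get("status", {}).get("value")
        if value in (None, ""):
            raise ValueError(f"KeptnMetric {query} has no value yet")
        return float(value)


def fake_fetch(namespace: str, name: str) -> dict:
    """Stand-in for a Kubernetes API lookup, returning a canned resource."""
    return {
        "kind": "KeptnMetric",
        "metadata": {"namespace": namespace, "name": name},
        "status": {"value": "0.21"},
    }


provider = KeptnMetricProvider(fake_fetch)
response_time = provider.run_query("podinfo/response-time")
```

Reading the value from the resource itself would remove the need for a Prometheus instance that scrapes the metrics-operator, which is the main appeal of a dedicated provider over the approach used in the PoC.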

In terms of observability, we do get the OpenTelemetry traces generated by Keptn out of the box if the relevant annotations are present in the deployment managed by Flagger.

Adding pre-/post-deployment tasks using Keptn is also possible, but Flagger provides a similar concept via Webhooks. These are naturally more tailored to Flagger, as they also allow intermediate checks after the pods for the canary deployment have been started, e.g. to decide whether more traffic should be sent to the canary. Keptn does not provide this: we operate on pre-/post-deployment of the deployment, but are not aware of Flagger's canary increments.

mowies commented 3 months ago

Contributing a keptn provider implementation to Flagger makes a lot of sense, great research!