fluxcd / flagger

Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)
https://docs.flagger.app
Apache License 2.0

Cilium integration #1427

Open mrmartan opened 1 year ago

mrmartan commented 1 year ago

Add Cilium to the list of supported network integrations (service meshes and ingresses). https://cilium.io/

Cilium is an eBPF-powered networking solution for Kubernetes. It can replace the CNI plugin, kube-proxy, and the ingress controller.

aryan9600 commented 1 year ago

It looks like Cilium supports Gateway API: https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/gateway-api/ If their implementation is fully conformant with the Gateway API spec, it should work fine with Flagger.
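
For illustration, a minimal sketch of a Gateway served by Cilium that Flagger-managed routes could attach to. The `cilium` GatewayClass name is the one Cilium registers when its Gateway API support is enabled; the Gateway name `cilium-gateway` and the `default` namespace are hypothetical placeholders, and the `v1` API version assumes a cluster with the Gateway API v1 CRDs installed.

```yaml
# Sketch only: names and namespace are placeholders, not taken from this thread.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cilium-gateway
  namespace: default
spec:
  gatewayClassName: cilium   # GatewayClass provided by Cilium
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All          # allow HTTPRoutes from other namespaces to attach
```

Flagger creates its HTTPRoute in the namespace of the Canary, so the listener has to accept routes from that namespace.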

cotocisternas commented 1 year ago

If it helps, here are my MetricTemplates for the Cilium Gateway API:

templates

---
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: error-rate
  namespace: flagger-system
spec:
  provider:
    type: prometheus
    address: http://prometheus:9090
  # Percentage of requests answered with a 5xx status (0-100).
  query: |
    100 - sum(
      rate(
        hubble_http_requests_total{
          destination_namespace=~"{{ namespace }}",
          destination_workload=~"{{ target }}",
          status!~"5.*"
        }[{{ interval }}]
      )
    )
    /
    sum(
      rate(
        hubble_http_requests_total{
          destination_namespace=~"{{ namespace }}",
          destination_workload=~"{{ target }}",
        }[{{ interval }}]
      )
    )
    * 100
---
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: latency
  namespace: flagger-system
spec:
  provider:
    type: prometheus
    address: http://prometheus:9090
  # 99th-percentile request duration, in seconds.
  query: |
    histogram_quantile(0.99,
      sum(
        rate(
          hubble_http_request_duration_seconds_bucket{
            destination_namespace=~"{{ namespace }}",
            destination_workload=~"{{ target }}",
          }[{{ interval }}]
        )
      ) by (le)
    )
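
Both queries rely on Hubble exporting HTTP metrics with the `destination_namespace` and `destination_workload` labels. The thread does not show how Hubble was configured, so the following Helm values for the Cilium chart are an assumption based on Cilium's documented `httpV2` metric and its `labelsContext` option, not the commenter's actual setup.

```yaml
# Assumed Cilium Helm values; adjust to the installed chart version.
hubble:
  enabled: true
  metrics:
    enabled:
      # httpV2 emits hubble_http_requests_total and
      # hubble_http_request_duration_seconds; labelsContext adds the
      # destination_* labels that the MetricTemplates filter on.
      - "httpV2:labelsContext=source_namespace,source_workload,destination_namespace,destination_workload,traffic_direction"
```

Hubble only reports HTTP metrics for traffic that passes through Cilium's L7 proxy, so requests that bypass it will not appear in these series.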

canary metrics

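    metrics:
      # Fail the check when more than 1% of requests return 5xx.
      - name: error-rate
        templateRef:
          name: error-rate
          namespace: flagger-system
        thresholdRange:
          max: 1
        interval: 1m
      # Fail the check when P99 latency exceeds 0.5 seconds.
      - name: latency
        templateRef:
          name: latency
          namespace: flagger-system
        thresholdRange:
          max: 0.5
        interval: 30s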
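
To show where the `metrics` fragment sits, here is a hedged sketch of a complete Canary that uses Flagger's Gateway API provider. The workload `podinfo`, the `test` namespace, the host name, the port, and the Gateway reference `cilium-gateway` are hypothetical; the provider string depends on the Flagger release (`gatewayapi:v1` is assumed here, older releases use `gatewayapi:v1beta1`).

```yaml
# Sketch only: all names below are placeholders.
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  provider: gatewayapi:v1        # assumption: depends on the Flagger version
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  service:
    port: 9898
    hosts:
      - podinfo.example.com
    gatewayRefs:                 # Gateway the generated HTTPRoute attaches to
      - name: cilium-gateway
        namespace: default
  analysis:
    interval: 1m                 # how often the checks run
    threshold: 5                 # failed checks before rollback
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: error-rate
        templateRef:
          name: error-rate
          namespace: flagger-system
        thresholdRange:
          max: 1
        interval: 1m
      - name: latency
        templateRef:
          name: latency
          namespace: flagger-system
        thresholdRange:
          max: 0.5
        interval: 30s
```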

project-administrator commented 2 months ago

This should be included in the official documentation. I mean, I could not have set this up myself without these sample MetricTemplates.