fluxcd / flagger

Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)
https://docs.flagger.app
Apache License 2.0
4.9k stars, 732 forks

Canary Release with Session Affinity doesn't work #1339

Closed AnatoliiNefedov closed 1 year ago

AnatoliiNefedov commented 1 year ago

With session affinity enabled, the client is not pinned to the canary release.

In the Canary resource, under .spec.analysis, we set:

sessionAffinity:
  cookieName: test-flagger-cookie

During deployment, we observe that the cookie is scoped to the path rather than to the domain. A client that receives this cookie is not pinned to either version of the service running during the deployment.

We also noticed that Max-Age=-1 is not set on the cookie after the canary deployment completes.

We expected that the client would be tied to one of the versions of the service using the cookie during the canary deployment, as it is written in the documentation:

This means once a user is exposed to the new version of our application (based on the traffic weights), they're always routed to that version, i.e. they're never routed back to the old version of our application.

Unfortunately, when the page is refreshed, the client sees both the old and the new versions of the service.
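
To make the expected behavior concrete, here is a rough Python model of cookie-based session affinity as the documentation describes it. This is only a sketch, not Flagger's or Istio's actual implementation: the function name `route`, the cookie value `abc123`, and the backend labels `primary` and `canary` are all hypothetical. The cookie name and max age are taken from the Canary spec in this issue.

```python
import random
from http.cookies import SimpleCookie


def route(cookie_header, canary_weight, cookie_name, cookie_value,
          max_age=720, rng=random):
    """Pick a backend for one request.

    Returns (backend, set_cookie), where set_cookie is the Set-Cookie
    header value to send back, or None if no cookie needs to be set.
    Simplified model only; all names here are illustrative.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(cookie_name)

    # A client that already carries the affinity cookie stays on the canary,
    # no matter which other cookies are present in the same header.
    if morsel is not None and morsel.value == cookie_value:
        return "canary", None

    # Otherwise the request is split by weight (0-100); a client that lands
    # on the canary receives the cookie and is pinned from then on.
    if rng.random() * 100 < canary_weight:
        return "canary", f"{cookie_name}={cookie_value}; Max-Age={max_age}"
    return "primary", None


# A client holding the cookie alongside an unrelated one is still pinned,
# even when the canary weight is 0.
backend, set_cookie = route(
    "other=1; test-flagger-cookie=abc123", 0, "test-flagger-cookie", "abc123"
)
```

In this model a refresh can never move a pinned client back to the old version, which is the behavior we expected and did not observe.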

aryan9600 commented 1 year ago

Could you please post your entire canary definition?

AnatoliiNefedov commented 1 year ago

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-istio
  namespace: stages
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-backend
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: test-backend
  service:
    # service port number
    port: 80
    targetPort: nginx
    portDiscovery: true
    match:
      - uri:
          prefix: /
    # Istio gateways (optional)
    gateways:
    - gateway/gateway
    # Istio virtual service host names (optional)
    hosts:
    - '*.test-istio.istio.example.com'
    - .test-istio.istio.example.com
    - test-backend.stages.svc.cluster.local
    # Istio traffic policy (optional)
    trafficPolicy:
      tls:
        # use ISTIO_MUTUAL when mTLS is enabled
        mode: ISTIO_MUTUAL
    # Istio retry policy (optional)
    retries:
      attempts: 3
      perTryTimeout: 100s
      retryOn: "gateway-error,connect-failure,refused-stream"
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 3
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 10
    sessionAffinity:
      cookieName: test-flagger-cookie
      maxAge: 720
    metrics:
    - name: istio_requests_total
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: istio_request_duration_seconds_bucket
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s

aryan9600 commented 1 year ago

Hey, thank you for opening this issue. After some testing, it seems this issue occurs when multiple cookies are stored. I have worked on a fix and confirmed via manual testing that it works. The image is ghcr.io/fluxcd/flagger:rc-1f3d1fd6. Could you please deploy this image and confirm whether it works for you as well? Thanks.

AnatoliiNefedov commented 1 year ago

Thanks! It works now!

And after a successful deployment, the cookies are cleared.