fluxcd / flagger

Progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments)
https://docs.flagger.app
Apache License 2.0

How do I set up session affinity? #1198

Closed Apollorion closed 1 year ago

Apollorion commented 2 years ago

Reading here, it makes me think that I can set up Flagger so that a user can initially be routed to either the primary or the canary. Once they are on one or the other, I could set a cookie that forces all subsequent requests to go to the same target.

I've found that's not the case. When using Flagger for A/B testing with a cookie, a frontend user whose request carries the cookie goes through the weight-based routing and can be sent to either the primary or the canary. A user without the cookie is only ever routed to the primary.

Is it possible to get what I'm looking for with Flagger? I essentially want to ensure that if the first request makes it to the primary, then all subsequent requests go to the primary. The same goes for the canary: if the first request makes it to the canary (based on weights), then I want the rest of the requests to go to the canary. I do have a system in place now that sets a cookie after the first request telling me which version I want for subsequent requests, but I can't seem to get Flagger to look at that cookie and route to where I want it to go.

kingdonb commented 2 years ago

I think the functionality you're describing for A/B testing is basically what's done with the frontend canary on the gitops-istio tutorial.

The example shows selecting based on browser type and a cookie variable ("insider") but it should be clear how to change this to use either a cookie variable, or some other aspect like browser type:

https://github.com/stefanprodan/gitops-istio/blob/adecf6a4cfbae30f300be1ada76b68dace459970/apps/frontend/canary.yaml#L37-L45

It's not clear to me how someone gets this 'insider' cookie. If the canary is the "B" test and everyone who does not have the cookie goes to the "A" primary, then you need some way of randomly setting the cookie on a fraction of the people who have never visited the primary or canary before. That seems to be a small thing missing from the frontend example here with Istio in order to make it work exactly how you wanted. (If there's an easy way to do that, I'm not sure how.)

aryan9600 commented 2 years ago

@Apollorion Ideally, for the traffic rule that matches against the header, the weight is 0 for the primary and 100 for the canary. Could you share the definition of the Istio VirtualService, or whichever CRD defines the traffic routing rules for your provider?
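
As a rough sketch of what that means for the Istio provider, the header-matched rule in the generated VirtualService would be expected to carry weights like the following while an A/B analysis is running. The host names and cookie value are taken from the configuration posted later in this thread. The fragment is illustrative, not output captured from Flagger.

```yaml
# Fragment of spec.http in a VirtualService (illustrative, not captured output).
http:
- match:
  - headers:
      cookie:
        exact: x-app-version=86975ff
  route:
  - destination:
      host: session-affinity-primary
    weight: 0      # matched requests skip the primary
  - destination:
      host: session-affinity-canary
    weight: 100    # matched requests all go to the canary
- route:           # fallback rule: everything else stays on the primary
  - destination:
      host: session-affinity-primary
    weight: 100
```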

Apollorion commented 2 years ago

@aryan9600

Here is the Canary:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: session-affinity
  namespace: test
spec:
  analysis:
    interval: 1m
    match:
    - headers:
        cookie:
          exact: x-app-version=86975ff
    maxWeight: 50
    metrics:
    - interval: 1m
      name: 5xx percentage
      templateRef:
        name: flagger-metric-templates-5xx
        namespace: istio-system
      thresholdRange:
        max: 1
    stepWeight: 10
    threshold: 5
    webhooks:
    - metadata:
        cmd: hey -z 1m -q 10 -c 2 http://session-affinity-canary.test.svc.cluster.local:80
        type: cmd
      name: Simple Loadtest
      timeout: 5s
      type: rollout
      url: http://flagger-loadtester.istio-system.svc.cluster.local/
  service:
    gateways:
    - istio-gateways/test-gateway
    hosts:
    - session-affinity.test.apollorion.com
    port: 80
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: gateway-error,connect-failure,refused-stream
    targetPort: http
    trafficPolicy:
      tls:
        mode: DISABLE
  skipAnalysis: false
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: session-affinity

Here is the Virtual Service that gets generated by the canary (with no analysis):

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: session-affinity
  namespace: test
spec:
  gateways:
  - istio-gateways/test-gateway
  hosts:
  - session-affinity.test.apollorion.com
  http:
  - match:
    - headers:
        cookie:
          exact: x-app-version=86975ff
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: gateway-error,connect-failure,refused-stream
    route:
    - destination:
        host: session-affinity-primary
      weight: 100
    - destination:
        host: session-affinity-canary
      weight: 0
  - retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: gateway-error,connect-failure,refused-stream
    route:
    - destination:
        host: session-affinity-primary
      weight: 100

I do see now that I'm still defining stepWeight: 10, which is probably why the canary/primary didn't flip to 100% like you mentioned before. But even then, this still doesn't give me what I'm looking for.

@kingdonb is describing more closely what I'm looking to do. Maybe this is a feature request.

If Flagger could route randomly to either the canary or the primary based on weights, and also target the canary with one cookie and the primary with another, then I could do what I'm attempting to do.

I basically have my application always responding with the set-cookie: x-version: abcdefg header (the version depending on which version of the application the request hits). So if the initial request was routed through the weight-based system, the first response would set the cookie, and subsequent requests would always go to that same version because the cookie would be set from that moment forward.

Sorry if this is overly verbose, I just want to make sure I'm explaining clearly.

[Image: Screen Shot 2022-05-13 at 9 32 27 AM]

aryan9600 commented 2 years ago

I think there might be a misconception about how A/B tests actually work in Flagger. Flagger doesn't do any weighted routing when you want to do an A/B test. All requests are routed based only on the cookie/header. Normally, when the primary and the canary deployments are the same, the cookie/header is pointless, because all requests will hit the primary deployment regardless. During an analysis run, all requests that match the headers/cookies specified in the Canary spec will hit the canary deployment, and all other requests will go to the primary deployment.

So if, during a Canary analysis, Don's request doesn't have a cookie set, it'll hit the primary, and your application should respond with set-cookie: x-version: primary-version. The next request by Don will then carry the cookie x-version: primary-version. Assuming you defined the Canary to match cookies exactly against x-version: canary-version, the cookie in this request won't match, so Don will be routed to the primary deployment.

In order for a request to hit the canary deployment, you'd need to modify your application to either

Flagger currently cannot route requests based on both headers/cookies and weights.
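
A minimal sketch of a header-only A/B analysis, assuming the `iterations` field from Flagger's A/B testing documentation is used in place of `stepWeight` and `maxWeight`. The interval, threshold, and iteration count are placeholder values. The regex is an assumption, not something from this thread: it is meant to tolerate other cookies in the same `Cookie` header, which an `exact` match on the whole header value would not.

```yaml
# Fragment of a Canary spec (sketch; values are placeholders).
analysis:
  interval: 1m
  threshold: 5
  iterations: 10   # fixed number of checks instead of stepWeight/maxWeight
  match:
  - headers:
      cookie:
        # Single quotes keep the backslash literal in YAML.
        regex: '^(.*;\s*)?x-app-version=86975ff(;.*)?$'
```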

Apollorion commented 2 years ago

Ah, I understand how A/B is supposed to work now. I think this might actually be a feature request, then.

I would like to not have to randomly set the set-cookie: x-version: canary-version header in my application via the primary, because that means the primary version of my application would need to be aware of the new canary version and I'd have to somehow smartly control weights myself.

The feature request would be: I would like to just have my application set the cookie to the version of the application that the initial request reached. If there is no cookie, or a cookie that doesn't match, the request would route through Flagger's weight-based system until it actually found a matching cookie. Once it found the match, it would always route to that version of the application. Basically what I diagrammed above.
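
For illustration only, the requested behavior could be written by hand as Istio VirtualService rules along these lines. Per this thread, this is not something Flagger generates. The primary version value `abcdefg` and the 90/10 split are placeholders, and Flagger would likely overwrite manual changes to a VirtualService it manages.

```yaml
# Hypothetical http rules: sticky cookies first, weighted split as the fallback.
http:
- match:                     # cookie names the canary version
  - headers:
      cookie:
        exact: x-app-version=86975ff
  route:
  - destination:
      host: session-affinity-canary
    weight: 100
- match:                     # cookie names the primary version (placeholder value)
  - headers:
      cookie:
        exact: x-app-version=abcdefg
  route:
  - destination:
      host: session-affinity-primary
    weight: 100
- route:                     # no cookie, or an unknown one: split by weight
  - destination:
      host: session-affinity-primary
    weight: 90
  - destination:
      host: session-affinity-canary
    weight: 10
```

Istio evaluates the rules in order, so a request with a recognized cookie never reaches the weighted fallback.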