Closed jgwest closed 2 months ago
Hi @jgwest, I was not able to reproduce the issue. The canary weight in the Route was overwritten to zero once the Rollout was created. I followed the above example on OCP 4.16 with GitOps Operator v1.13.1. Looking at the logs, I could confirm that the Plugin API was called for the first time with the zero desired weight.
Logs from the plugin process:
2024-08-20T09:59:48.118Z [DEBUG] plugin.openshift: time=2024-08-20T09:59:48.118Z level=INFO msg="successfully updated route" plugin=trafficrouter vendor=openshift name=rollouts-demo-route weight=0
Logs from the rollouts controller:
time="2024-08-20T09:59:48Z" level=info msg="Previous weights: nil" namespace=argo-rollouts rollout=rollouts-demo
time="2024-08-20T09:59:48Z" level=info msg="New weights: &TrafficWeights{Canary:WeightDestination{Weight:0,ServiceName:argo-rollouts-canary-service,PodTemplateHash:687d76d795,},Stable:WeightDestination{Weight:100,ServiceName:argo-rollouts-stable-service,PodTemplateHash:,},Additional:[]WeightDestination{},Verified:nil,}" namespace=argo-rollouts rollout=rollouts-demo
time="2024-08-20T09:59:48Z" level=info msg="Traffic weight updated to 0" event_reason=TrafficWeightUpdated namespace=argo-rollouts rollout=rollouts-demo
time="2024-08-20T09:59:48Z" level=info msg="Event(v1.ObjectReference{Kind:\"Rollout\", Namespace:\"argo-rollouts\", Name:\"rollouts-demo\", UID:\"4beb21be-b62e-4e84-89e3-3c0dfd018b9d\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"104462\", FieldPath:\"\"}): type: 'Normal' reason: 'TrafficWeightUpdated' Traffic weight updated to 0"
The above logs were captured right after creating the Rollout resource and we can observe that the weight was updated to 0. Here's the code from the rollouts controller that calls the SetWeight() API.
I also tried the K8s Gateway plugin with Argo Rollouts and observed a similar behavior. I couldn't install the Gateway on OCP. But I tried the quick start guide on a kind cluster. I followed these steps:
go run ./cmd/rollouts-controller/main.go
backendRefs:
- name: argo-rollouts-stable-service
kind: Service
port: 80
weight: 50
- name: argo-rollouts-canary-service
kind: Service
port: 80
weight: 50
Included a simple e2e test to verify this behaviour: https://github.com/argoproj-labs/rollouts-plugin-trafficrouter-openshift/pull/13
Thanks @chetan-rns for the detailed investigation, for the walkthrough of the behaviour with Gateway API, and for the E2E test which will verify we don't regress this behaviour!
When a new Route is created with non-zero canary, and then a new Rollout is created targeting that Route, I would expect that the Route would set the canary weight to 0 on the Route (under
.spec.alternateBackends
).Instead, it appears that when the Rollout is initially created, it does not touch the Route's canary weight: after the Rollout is created, the Route still has non-zero canary weight.
These instructions are roughly based on this.
Steps to reproduce:
1) Login in to an OpenShift cluster, that has the latest Argo Rollouts operator running on that cluster in cluster-scoped mode.
2) Create the following resources:
NOTE: In the Route above, notice that the weight is 50/50 between canary/stable.
After you have created all of the above resources, I would expect that when the Rollout is reconciled by Argo Rollouts, Rollouts would call our plugin, and our plugin would set the Route canary weight to 0.
Instead, if you look at the
Route
, you will see it still has a canary weight of50
underalternateBackends
.But this is an assumption on my part that needs to be verified.
Issue criteria: