argoproj-labs / rollouts-plugin-trafficrouter-gatewayapi

The Argo Rollouts plugin implementing the Kubernetes Gateway API specification for using different traffic providers in progressive delivery scenarios
https://rollouts-plugin-trafficrouter-gatewayapi.readthedocs.io/en/latest/
Apache License 2.0
99 stars 19 forks

Inconsistent HTTPRoute Reference in docs/features/advanced-deployments.md #88

Open zackijack opened 1 month ago

zackijack commented 1 month ago

Checklist:

Describe the bug

In the Advanced Deployment methods doc, three HTTPRoute resources are introduced at the beginning (canary-route, always-old-version, always-new-version). These routes are crucial for implementing the different deployment scenarios. However, the Rollout example later in the document uses a different HTTPRoute (argo-rollouts-http-route). This inconsistency makes it unclear how to implement the example correctly.

When I tried using the three HTTPRoute resources introduced at the beginning, I got the following error: failed to set weight via plugin: backendRef was not found in httpRoute. I suspect this happens because the always-old-version and always-new-version routes each have only one service.
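For illustration, here is a minimal sketch of what a weighted HTTPRoute with two backends might look like. It assumes that the plugin looks for both the stable and the canary Service among the route's backendRefs when it sets weights, which is what the error message suggests but which is not confirmed here. The gateway name, the stable Service name, and the ports are hypothetical; the canary Service name and the namespace are taken from the logs further down. The apiVersion may need adjusting to match the installed Gateway API CRDs.

```yaml
# Sketch only: a route that lists BOTH Services so that weights can be shifted
# between them. Names marked "hypothetical" do not come from the docs or the issue.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-route
  namespace: trial-error
spec:
  parentRefs:
    - name: example-gateway                  # hypothetical gateway name
  rules:
    - backendRefs:
        - name: argo-rollouts-stable-service # hypothetical stable Service name
          port: 80                           # hypothetical port
        - name: argo-rollouts-canary-service # canary Service name seen in the logs
          port: 80                           # hypothetical port
```

Under the same assumption, a route that lists only one of the two Services (such as always-old-version or always-new-version) would presumably be applied as a static route and left out of the Rollout's plugin configuration.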

To Reproduce

  1. Navigate to the Advanced Deployment methods document.
  2. Observe the three HTTPRoute resources introduced initially.
  3. Scroll down to the Rollout example and notice the use of a different HTTPRoute.
  4. Attempt to implement the example using the three initially mentioned HTTPRoute resources.
  5. Encounter the error: failed to set weight via plugin: backendRef was not found in httpRoute.

Expected behavior

The Rollout example should either:

Screenshots

Screenshot 2024-10-15 at 15 06 08

Version

Argo Rollouts Controller: v1.7.2
Argo Rollouts Gateway API Plugin: v0.4.0

Logs

# Paste the logs from the rollout controller

# Logs for the entire controller:
$ kubectl logs -n argo-rollouts deployment/argo-rollouts

Found 2 pods, using pod/argo-rollouts-7fdbf86478-fcvc4
time="2024-10-15T07:10:15Z" level=info msg="Argo Rollouts starting" version=v1.7.2+59e5bd3
time="2024-10-15T07:10:16Z" level=info msg="Creating event broadcaster"
time="2024-10-15T07:10:16Z" level=info msg="Setting up event handlers"
time="2024-10-15T07:10:16Z" level=info msg="Setting up experiments event handlers"
time="2024-10-15T07:10:16Z" level=info msg="Setting up analysis event handlers"
time="2024-10-15T07:10:16Z" level=info msg="Downloading plugin argoproj-labs/gatewayAPI from: https://github.com/argoproj-labs/rollouts-plugin-trafficrouter-gatewayapi/releases/download/v0.4.0/gatewayapi-plugin-linux-amd64"
time="2024-10-15T07:10:18Z" level=info msg="Download complete, it took 1.657087844s"
time="2024-10-15T07:10:18Z" level=info msg="Starting Healthz Server at 0.0.0.0:8080"
time="2024-10-15T07:10:18Z" level=info msg="Leaderelection get id argo-rollouts-7fdbf86478-fcvc4_31dc050b-fc19-493c-94fb-87181224f445"
time="2024-10-15T07:10:18Z" level=info msg="attempting to acquire leader lease argo-rollouts/argo-rollouts-controller-lock..."
time="2024-10-15T07:10:18Z" level=info msg="Starting Metric Server at 0.0.0.0:8090"
time="2024-10-15T07:10:18Z" level=info msg="New leader elected: argo-rollouts-577b4779d8-b8brc_b000c302-b6cd-42f6-8b71-787e401ca481"
time="2024-10-15T07:10:51Z" level=info msg="successfully acquired lease argo-rollouts/argo-rollouts-controller-lock"
time="2024-10-15T07:10:51Z" level=info msg="New leader elected: argo-rollouts-7fdbf86478-fcvc4_31dc050b-fc19-493c-94fb-87181224f445"
time="2024-10-15T07:10:51Z" level=info msg="I am the new leader: argo-rollouts-7fdbf86478-fcvc4_31dc050b-fc19-493c-94fb-87181224f445"
time="2024-10-15T07:10:51Z" level=info msg="Starting Controllers"
time="2024-10-15T07:10:51Z" level=info msg="Rollout resource added to informer: trial-error/rollouts-demo" event_reason=RolloutAddedToInformer namespace=trial-error rollout=rollouts-demo
time="2024-10-15T07:10:51Z" level=info msg="invalidated cache for resource in namespace: argo-rollouts with the name: argo-rollouts-notification-configmap"
time="2024-10-15T07:10:51Z" level=info msg="Event(v1.ObjectReference{Kind:\"Rollout\", Namespace:\"trial-error\", Name:\"rollouts-demo\", UID:\"71664748-8ad9-41c6-b4a5-127e969c3b4c\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"2352781705\", FieldPath:\"\"}): type: 'Normal' reason: 'RolloutAddedToInformer' Rollout resource added to informer: trial-error/rollouts-demo"
time="2024-10-15T07:10:51Z" level=info msg="Waiting for controller's informer caches to sync"
time="2024-10-15T07:10:52Z" level=info msg="Enqueueing parent of trial-error/rollouts-demo-84bf8597bc: Rollout trial-error/rollouts-demo"
time="2024-10-15T07:10:52Z" level=info msg="Started controller"
time="2024-10-15T07:10:52Z" level=warning msg="Controller is running."
time="2024-10-15T07:10:52Z" level=info msg="Start processing" resource=trial-error/rollouts-demo
time="2024-10-15T07:10:52Z" level=info msg="Starting Rollout workers"
time="2024-10-15T07:10:52Z" level=info msg="Started rollout workers"
time="2024-10-15T07:10:52Z" level=info msg="Processing completed" resource=trial-error/rollouts-demo
time="2024-10-15T07:10:52Z" level=info msg="Starting analysis workers"
time="2024-10-15T07:10:52Z" level=info msg="Started 30 analysis workers"
time="2024-10-15T07:10:52Z" level=info msg="Starting Service workers"
time="2024-10-15T07:10:52Z" level=info msg="Started Service workers"
time="2024-10-15T07:10:52Z" level=info msg="Starting Ingress workers"
time="2024-10-15T07:10:52Z" level=info msg="Started Ingress workers"
time="2024-10-15T07:10:52Z" level=info msg="Starting Experiment workers"
time="2024-10-15T07:10:52Z" level=info msg="Started Experiment workers"
time="2024-10-15T07:10:52Z" level=info msg="Istio detected"
time="2024-10-15T07:10:52Z" level=info msg="Starting istio workers"
time="2024-10-15T07:10:52Z" level=info msg="Istio workers (10) started"
time="2024-10-15T07:10:52Z" level=info msg="Started syncing rollout" generation=1 namespace=trial-error resourceVersion=2352781705 rollout=rollouts-demo
time="2024-10-15T07:10:52Z" level=info msg="delaying service switch from  to 84bf8597bc: ReplicaSet not fully available" namespace=trial-error rollout=rollouts-demo service=argo-rollouts-canary-service
2024-10-15T07:10:52.705Z [DEBUG] plugin: starting plugin: path=/home/argo-rollouts/plugin-bin/argoproj-labs/gatewayAPI args=[/home/argo-rollouts/plugin-bin/argoproj-labs/gatewayAPI]
2024-10-15T07:10:52.706Z [DEBUG] plugin: plugin started: path=/home/argo-rollouts/plugin-bin/argoproj-labs/gatewayAPI pid=13
2024-10-15T07:10:52.706Z [DEBUG] plugin: waiting for RPC address: plugin=/home/argo-rollouts/plugin-bin/argoproj-labs/gatewayAPI
2024-10-15T07:10:52.734Z [DEBUG] plugin: using plugin: version=1
2024-10-15T07:10:52.734Z [DEBUG] plugin.gatewayAPI: plugin address: address=/tmp/plugin221527761 network=unix timestamp=2024-10-15T07:10:52.734Z

# Logs for a specific rollout:
$ kubectl logs -n argo-rollouts deployment/argo-rollouts | grep "rollouts-demo"

time="2024-10-15T07:13:39Z" level=info msg="Started syncing rollout" generation=1 namespace=trial-error resourceVersion=2352781705 rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=info msg="delaying service switch from  to 84bf8597bc: ReplicaSet not fully available" namespace=trial-error rollout=rollouts-demo service=argo-rollouts-canary-service
time="2024-10-15T07:13:39Z" level=info msg="Found 1 TrafficRouting Reconcilers" namespace=trial-error rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=info msg="Reconciling TrafficRouting with type 'GatewayAPI'" namespace=trial-error rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=warning msg="failed to set weight via plugin: httproutes.gateway.networking.k8s.io \"canary-route\" not found" event_reason=TrafficRoutingError namespace=trial-error rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=error msg="roCtx.reconcile err failed to set weight via plugin: httproutes.gateway.networking.k8s.io \"canary-route\" not found" generation=1 namespace=trial-error resourceVersion=2352781705 rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=info msg="Reconciliation completed" generation=1 namespace=trial-error resourceVersion=2352781705 rollout=rollouts-demo time_ms=8.045907999999999
time="2024-10-15T07:13:39Z" level=error msg="rollout syncHandler error: failed to set weight via plugin: httproutes.gateway.networking.k8s.io \"canary-route\" not found" namespace=trial-error rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=info msg="rollout syncHandler queue retries: 30 : key \"trial-error/rollouts-demo\"" namespace=trial-error rollout=rollouts-demo
time="2024-10-15T07:13:39Z" level=info msg="Event(v1.ObjectReference{Kind:\"Rollout\", Namespace:\"trial-error\", Name:\"rollouts-demo\", UID:\"71664748-8ad9-41c6-b4a5-127e969c3b4c\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"2352781705\", FieldPath:\"\"}): type: 'Warning' reason: 'TrafficRoutingError' failed to set weight via plugin: httproutes.gateway.networking.k8s.io \"canary-route\" not found"
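The "canary-route" not found messages above suggest that the route named in the Rollout has to exist in the namespace where the plugin looks for it. A hedged sketch of the relevant Rollout stanza follows. The httpRoute and namespace keys reflect the plugin's configuration format as I understand it; the stable Service name is hypothetical, and the rest of the spec (replicas, selector, template, steps) is omitted, so this fragment is not applicable as-is.

```yaml
# Fragment only: the traffic-routing part of a Rollout that hands a single
# weighted HTTPRoute to the Gateway API plugin.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
  namespace: trial-error
spec:
  strategy:
    canary:
      canaryService: argo-rollouts-canary-service   # name seen in the logs
      stableService: argo-rollouts-stable-service   # hypothetical name
      trafficRouting:
        plugins:
          argoproj-labs/gatewayAPI:
            httpRoute: canary-route   # must match an existing HTTPRoute
            namespace: trial-error    # namespace where that HTTPRoute lives
```

If an HTTPRoute with that name is missing from the namespace, the controller would be expected to keep retrying with the same TrafficRoutingError shown in the logs.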

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

kostis-codefresh commented 1 month ago

Hello

The documentation page shows TWO standalone examples of "advanced" deployments.

One example is "Pinning clients to a specific version". The other is "Making applications 'canary-aware'". You are expected to use one or the other, not both at the same time.

I will update the documentation page soon to make this clearer.

In the meantime, here is a full example for the first scenario: https://github.com/kostis-codefresh/rollouts-header-routing-example/tree/main/static-routing

And a full example for the second scenario: https://github.com/kostis-codefresh/argo-rollouts-stateful-example/tree/main/manifests/stateful-rollout

Let me know if that helps.

zackijack commented 1 month ago

Thank you, @kostis-codefresh!

I appreciate the clarification and the additional examples. This helps a lot in understanding the correct implementation.

Sorry for my misunderstanding.

Thanks again for your support!