argoproj / argo-rollouts

Progressive Delivery for Kubernetes
https://argo-rollouts.readthedocs.io/
Apache License 2.0

Rollout confused after deploying during a canary phase #1372

Open DanTulovsky opened 3 years ago

DanTulovsky commented 3 years ago

Summary

I had a slow rollout progressing through about 8 steps, with a 30-minute wait between each. Before it finished, I pushed another change. This change created a new Revision, but then the rollout got stuck: no pods were upgraded to the new revision, and the previous revision and the stable version stopped progressing.

I unstuck it by running promote via the CLI tool. I ran it once, and the UI showed that it started the step over again. I then had to promote it through the entire workflow. No changes were being made to the newest revision. Once I went through the entire Rollout like this, the new revision started to upgrade.

Diagnostics

1.0.2

Logs for when in broken state:

ral metadata nil to Pod public-collector-saas-76f546dbbb-695fw" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:49Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:26:49Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-6rdk8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:49Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:26:49Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7867g" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:51Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:26:51Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7c5hq" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:52Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:26:52Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7k2kn" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:52Z" level=error msg="roCtx.reconcile err Operation cannot be fulfilled on pods \"public-collector-saas-76f546dbbb-7k82b\": the object has been modified; please apply your changes to the latest version and try again" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas
time="2021-07-26T18:26:52Z" level=info msg="Reconciliation completed" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas time_ms=12004.444169999999
time="2021-07-26T18:26:52Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on pods \"public-collector-saas-76f546dbbb-7k82b\": the object has been modified; please apply your changes to the latest version and try again" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:55Z" level=info msg="rollout syncHandler queue retries: 132 : key \"public/public-collector-saas\"" namespace=public rollout=public-collector-saas
E0726 18:26:55.602009       1 controller.go:174] Operation cannot be fulfilled on pods "public-collector-saas-76f546dbbb-7k82b": the object has been modified; please apply your changes to the latest version and try again
time="2021-07-26T18:26:55Z" level=info msg="Started syncing rollout" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas
time="2021-07-26T18:26:58Z" level=info msg="Started syncing Analysis at (2021-07-26 18:26:58.186391976 +0000 UTC m=+1195555.478448807)" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:26:58Z" level=info msg="running overdue measurement" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:26:58Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-2k5tz" namespace=public rollout=public-collector-saas
time="2021-07-26T18:26:59Z" level=info msg="taking 1 measurements" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:00Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-45dfw" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:00Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4llfx" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:00Z" level=info msg="measurement completed Successful" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:00Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4lzbh" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:00Z" level=info msg="enqueueing analysis after 14.702645447s" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:00Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4qkjp" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:00Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:00Z" level=info msg="Patch status successfully" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:00Z" level=info msg="Reconciliation completed" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public time_ms=2723.6030530000003
time="2021-07-26T18:27:00Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4r95w" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:01Z" level=info msg="Started syncing Analysis at (2021-07-26 18:27:01.003872416 +0000 UTC m=+1195558.295929302)" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:01Z" level=info msg="taking 0 measurements" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:01Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4s5sg" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:02Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:02Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4zjxj" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:02Z" level=info msg="enqueueing analysis after 12.696098652s" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:02Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-567gm" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:02Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:03Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5f8nc" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:02Z" level=info msg="No status changes. Skipping patch" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:03Z" level=info msg="Reconciliation completed" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public time_ms=2206.617236
time="2021-07-26T18:27:03Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5hrxp" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:03Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5rdl8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:03Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5skn8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:03Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5tgss" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:03Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5zvc4" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:04Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-664w6" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:04Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-695fw" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:05Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-6rdk8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:05Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7867g" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:05Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7c5hq" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:05Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7k2kn" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:05Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7k82b" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7sbbf" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-89gsj" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-9ffk2" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-9kzmn" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-9mxqh" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-9ppqk" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=error msg="roCtx.reconcile err Operation cannot be fulfilled on pods \"public-collector-saas-76f546dbbb-bfb64\": the object has been modified; please apply your changes to the latest version and try again" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas
time="2021-07-26T18:27:06Z" level=info msg="Reconciliation completed" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas time_ms=11008.376663
time="2021-07-26T18:27:06Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on pods \"public-collector-saas-76f546dbbb-bfb64\": the object has been modified; please apply your changes to the latest version and try again" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:08Z" level=info msg="rollout syncHandler queue retries: 136 : key \"public/public-collector-saas\"" namespace=public rollout=public-collector-saas
E0726 18:27:08.130956       1 controller.go:174] Operation cannot be fulfilled on pods "public-collector-saas-76f546dbbb-bfb64": the object has been modified; please apply your changes to the latest version and try again
time="2021-07-26T18:27:08Z" level=info msg="Started syncing rollout" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas
time="2021-07-26T18:27:10Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:10Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:11Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-2k5tz" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:11Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-45dfw" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:11Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4llfx" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:11Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4lzbh" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:12Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4qkjp" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:12Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4r95w" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:12Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4s5sg" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:12Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4zjxj" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:12Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-567gm" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:13Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5f8nc" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:14Z" level=error msg="roCtx.reconcile err Operation cannot be fulfilled on pods \"public-collector-saas-76f546dbbb-5hrxp\": the object has been modified; please apply your changes to the latest version and try again" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas
time="2021-07-26T18:27:14Z" level=info msg="Reconciliation completed" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas time_ms=6174.103142
time="2021-07-26T18:27:14Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on pods \"public-collector-saas-76f546dbbb-5hrxp\": the object has been modified; please apply your changes to the latest version and try again" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:14Z" level=info msg="rollout syncHandler queue retries: 139 : key \"public/public-collector-saas\"" namespace=public rollout=public-collector-saas
E0726 18:27:14.504965       1 controller.go:174] Operation cannot be fulfilled on pods "public-collector-saas-76f546dbbb-5hrxp": the object has been modified; please apply your changes to the latest version and try again
time="2021-07-26T18:27:14Z" level=info msg="Started syncing rollout" generation=1283 namespace=public resourceVersion=3221670935 rollout=public-collector-saas
time="2021-07-26T18:27:15Z" level=info msg="Started syncing Analysis at (2021-07-26 18:27:15.405401452 +0000 UTC m=+1195572.697458337)" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:15Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:15Z" level=info msg="running overdue measurement" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:15Z" level=info msg="taking 1 measurements" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:16Z" level=info msg="measurement completed Successful" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:17Z" level=info msg="enqueueing analysis after 14.599142183s" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:17Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:17Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:17Z" level=info msg="Patch status successfully" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:18Z" level=info msg="Reconciliation completed" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public time_ms=2996.9229090000003
time="2021-07-26T18:27:18Z" level=info msg="Started syncing Analysis at (2021-07-26 18:27:18.508698992 +0000 UTC m=+1195575.800755935)" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:18Z" level=info msg="taking 0 measurements" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:18Z" level=info msg="enqueueing analysis after 12.195577065s" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:18Z" level=info msg="No status changes. Skipping patch" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:18Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-2k5tz" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:18Z" level=info msg="Reconciliation completed" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public time_ms=414.40803
time="2021-07-26T18:27:19Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-45dfw" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:20Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4llfx" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:21Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4lzbh" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:24Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:21Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4qkjp" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:31Z" level=info msg="Started syncing Analysis at (2021-07-26 18:27:31.107390411 +0000 UTC m=+1195588.399447258)" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:31Z" level=info msg="running overdue measurement" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:28Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:31Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4r95w" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:31Z" level=info msg="taking 1 measurements" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:34Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4s5sg" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:34Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:35Z" level=info msg="measurement completed Successful" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:36Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-4zjxj" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:38Z" level=info msg="enqueueing analysis after 12.032543563s" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:39Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-567gm" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:42Z" level=info msg="Patch status successfully" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:44Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5f8nc" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:46Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:46Z" level=info msg="Reconciliation completed" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public time_ms=14893.440416000001
time="2021-07-26T18:27:47Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5hrxp" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:48Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:49Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5rdl8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:48Z" level=info msg="Started syncing Analysis at (2021-07-26 18:27:48.504854004 +0000 UTC m=+1195605.796910876)" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:51Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:53Z" level=info msg="running overdue measurement" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:51Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5skn8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:55Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:57Z" level=info msg="taking 1 measurements" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:27:57Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5tgss" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:58Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-5zvc4" namespace=public rollout=public-collector-saas
time="2021-07-26T18:27:58Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:58Z" level=info msg="Enqueueing parent of public/public-collector-saas-76f546dbbb: Rollout public/public-collector-saas"
time="2021-07-26T18:27:58Z" level=info msg="measurement completed Successful" analysisrun=public-collector-saas-76f546dbbb-6 metric=collector-span-forward-error-rate-3nZq4Tg3 namespace=public
time="2021-07-26T18:27:58Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-664w6" namespace=public rollout=public-collector-saas
time="2021-07-26T18:28:00Z" level=info msg="enqueueing analysis after 13.22868903s" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:28:01Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-695fw" namespace=public rollout=public-collector-saas
time="2021-07-26T18:28:01Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-6rdk8" namespace=public rollout=public-collector-saas
time="2021-07-26T18:28:01Z" level=info msg="Patch status successfully" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public
time="2021-07-26T18:28:01Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7867g" namespace=public rollout=public-collector-saas
time="2021-07-26T18:28:01Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7c5hq" namespace=public rollout=public-collector-saas
time="2021-07-26T18:28:02Z" level=info msg="synced ephemeral metadata nil to Pod public-collector-saas-76f546dbbb-7k2kn" namespace=public rollout=public-collector-saas
time="2021-07-26T18:28:01Z" level=info msg="Reconciliation completed" analysisrun=public-collector-saas-76f546dbbb-6 namespace=public time_ms=13199.910519
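
A log like the one above is easier to read once the conflict errors are tallied. The following is a minimal sketch, assuming the controller log keeps the `key="value"` text layout shown here; `parse_line` and `summarize` are hypothetical helper names, not part of Argo Rollouts. It counts only the `rollout syncHandler error` lines, since each conflict also appears in a `roCtx.reconcile err` line and a `controller.go` line.

```python
import re
from collections import Counter

# key="quoted value" (with \" escapes allowed inside) or key=bare-token
PAIR_RE = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')
# Resource kind and object name from a conflict message, e.g. pods \"name\"
CONFLICT_RE = re.compile(r'Operation cannot be fulfilled on (\S+) \\"([^"\\]+)\\"')
RETRY_RE = re.compile(r'queue retries: (\d+)')


def parse_line(line):
    """Return the key/value fields of one log line as a dict."""
    fields = {}
    for key, quoted, bare in PAIR_RE.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


def summarize(lines):
    """Count conflict errors per object name and track the highest retry count seen."""
    conflicts = Counter()
    max_retries = 0
    for line in lines:
        fields = parse_line(line)
        msg = fields.get("msg", "")
        # Count one line per conflict to avoid double counting.
        if fields.get("level") == "error" and msg.startswith("rollout syncHandler error"):
            match = CONFLICT_RE.search(msg)
            if match:
                conflicts[match.group(2)] += 1
        match = RETRY_RE.search(msg)
        if match:
            max_retries = max(max_retries, int(match.group(1)))
    return conflicts, max_retries
```

Calling `summarize(open("controller.log"))` on a saved copy of the log (the file name is an assumption) returns a `Counter` of conflicting object names plus the largest `queue retries` value, which shows whether the same objects keep conflicting and how long the rollout has been retrying.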

DanTulovsky commented 3 years ago

Additionally, the 'Steps' currently all show as complete. So I have the new Revision 7, the previous upgrade at Revision 6 and the previous stable at Revision 5.

What seems to be happening now is that all pods are being moved from Revision 6 to Revision 7 very quickly, without pausing. Revision 5 pods are not being touched yet.

Strategy:        Canary
  Step:          14/14
  SetWeight:     100
  ActualWeight:  100

jessesuen commented 3 years ago

Did you happen to go back to the previously deployed version, or are all these pod specs different from each other?

DanTulovsky commented 3 years ago

No... it was going forward..

alexmt commented 3 years ago

I can reliably reproduce a similar issue with a ReplicaSet using the Rollout + AnalysisTemplate below. I get the following error:

ERRO[2021-08-25T12:46:30-07:00] roCtx.reconcile err Operation cannot be fulfilled on replicasets.apps "rollout-analysis-step-6d9cf96448": the object has been modified; please apply your changes to the latest version and try again  generation=2 namespace=default resourceVersion=497932 rollout=rollout-analysis-step

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate-qal-usw2
spec:
  args:
  - name: analysis-interval
    value: "300s"
  - name: analysis-runs
    value: "36"
  - name: failure-limit
    value: "4"
  - name: inconclusive-limit
    value: "4"
  - name: secret-url
    valueFrom:
      secretKeyRef:
        name: example-secret
        key: secretUrl
  metrics:
    - name: webmetric
      successCondition: result == 'It worked!'
      count: "{{args.analysis-runs}}"
      interval: "{{args.analysis-interval}}"
      failureLimit: "{{args.failure-limit}}"
      provider:
        web:
          # placeholders are resolved when an AnalysisRun is created
          url: "{{args.secret-url}}"
          jsonPath: "{$.message}"
---
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-analysis-step
spec:
  replicas: 4
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollout-analysis-step
  template:
    metadata:
      labels:
        app: rollout-analysis-step
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
  strategy:
    canary:
      steps:
      - setWeight: 25
      # An AnalysisTemplate is referenced at the second step, which starts an AnalysisRun after
      # the setWeight step. The rollout will not progress to the following step until the
      # AnalysisRun is complete. A failure/error of the analysis will cause the rollout's update to
      # abort, and set the canary weight to zero.
      - analysis:
          templates:
          - templateName: success-rate-qal-usw2

Issue is reproducible locally with k3s and rollout controller version: https://github.com/argoproj/argo-rollouts/commit/a601a0c18e657ae81a0269b229767dcbe9bbf56a
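
For readers unfamiliar with the "the object has been modified" error quoted in this thread: it is what the Kubernetes API server returns when an update carries a stale `resourceVersion` (optimistic concurrency). The sketch below is a generic, in-memory illustration of that mechanism and of the usual re-read-and-retry response. It is not the Argo Rollouts controller's code, and `Store`, `ConflictError`, and `update_with_retry` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when an update carries a stale resourceVersion."""


class Store:
    """Toy stand-in for the API server: one object guarded by a version counter."""

    def __init__(self):
        self.obj = {"resourceVersion": 1, "labels": {}}

    def get(self):
        # Hand out a copy, as a client would receive when reading the object.
        return {"resourceVersion": self.obj["resourceVersion"],
                "labels": dict(self.obj["labels"])}

    def update(self, new):
        # Reject the write if someone else updated the object since `new` was read.
        if new["resourceVersion"] != self.obj["resourceVersion"]:
            raise ConflictError("the object has been modified")
        self.obj = {"resourceVersion": new["resourceVersion"] + 1,
                    "labels": dict(new["labels"])}


def update_with_retry(store, mutate, attempts=5):
    """Re-read the object and re-apply `mutate` until the update lands.

    Returns the number of attempts used.
    """
    for attempt in range(1, attempts + 1):
        obj = store.get()
        mutate(obj)
        try:
            store.update(obj)
            return attempt
        except ConflictError:
            continue
    raise ConflictError(f"still conflicting after {attempts} attempts")
```

A retry only succeeds if it starts from a fresh read. A writer that keeps submitting a copy read before someone else's update will conflict on every attempt.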

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity.

jayadeep-saaslabs commented 7 months ago

I faced the same issue in our deployments. The Argo Rollouts status is "more replicas need to be updated", and the controller logs show this error: "Skip scale down of older RS 'service-123asd': still referenced".

ju187 commented 7 months ago

Hit this error too. The rollout is stuck at Progressing, waiting for more replicas.