Closed dpiraud closed 2 years ago
I am unable to reproduce the scenario. From the logs, I don't see the scale-down delay being set on the old ReplicaSet. Is the service switch happening correctly, and is traffic going to the new stack?
Can you also describe the old ReplicaSet?
The following lines appear repeatedly and look suspicious:
time="2021-11-02T13:52:30Z" level=info msg="Reconciling stable ReplicaSet 'blue-green-demo-api-6765df48f4'" namespace=staging-exploitation rollout=blue-green-demo-api
time="2021-11-02T13:52:30Z" level=info msg="Reconciling 1 old ReplicaSets (total pods: 5)" namespace=staging-exploitation rollout=blue-green-demo-api
If the rollout still exists, could you paste the output of kubectl get rollout <rollout name> -o yaml and kubectl get replicaset <old replicaset> -o yaml? I would like to see whether the ReplicaSet has any abnormal status. Thanks.
I am unable to reproduce the scenario. From the logs, I don't see the scale-down delay being set on the old ReplicaSet. Is the service switch happening correctly, and is traffic going to the new stack?
yes, traffic to new version is OK
I had to replay it this morning. Here are the describe outputs without the "private" parts; the logs look similar.
kubectl argo rollouts get rollout blue-green-demo-api -w
Name:            blue-green-demo-api
Namespace:       staging-exploitation
Status:          ✔ Healthy
Strategy:        BlueGreen
Images:          xxx/canary-demo-api:blue
                 xxx/canary-demo-api:green (stable, active)
Replicas:
  Desired:       5
  Current:       10
  Updated:       5
  Ready:         5
  Available:     5

NAME                                             KIND        STATUS     AGE  INFO
⟳ blue-green-demo-api                            Rollout     ✔ Healthy  26m
├──# revision:2
│  └──⧉ blue-green-demo-api-6765df48f4           ReplicaSet  ✔ Healthy  21m  stable,active
│     ├──□ blue-green-demo-api-6765df48f4-hp9cf  Pod         ✔ Running  21m  ready:2/2
│     ├──□ blue-green-demo-api-6765df48f4-2qxsc  Pod         ✔ Running  21m  ready:2/2
│     ├──□ blue-green-demo-api-6765df48f4-67bqb  Pod         ✔ Running  21m  ready:2/2
│     ├──□ blue-green-demo-api-6765df48f4-5xj5j  Pod         ✔ Running  21m  ready:2/2
│     └──□ blue-green-demo-api-6765df48f4-dnrb6  Pod         ✔ Running  21m  ready:2/2
└──# revision:1
   └──⧉ blue-green-demo-api-6695fbf66b           ReplicaSet  ✔ Healthy  26m
      ├──□ blue-green-demo-api-6695fbf66b-tjrsm  Pod         ✔ Running  26m  ready:2/2
      ├──□ blue-green-demo-api-6695fbf66b-wlrkq  Pod         ✔ Running  26m  ready:2/2
      ├──□ blue-green-demo-api-6695fbf66b-7j9pn  Pod         ✔ Running  26m  ready:2/2
      ├──□ blue-green-demo-api-6695fbf66b-tfwdq  Pod         ✔ Running  26m  ready:2/2
      └──□ blue-green-demo-api-6695fbf66b-9jsbk  Pod         ✔ Running  26m  ready:2/2
kubectl get rollout blue-green-demo-api -o yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: ...
    rollout.argoproj.io/revision: "2"
  creationTimestamp: "2021-11-03T08:25:31Z"
  generation: 4
  name: blue-green-demo-api
  namespace: staging-exploitation
  resourceVersion: "493003247"
  uid: 208364ff-0004-49fe-b8b6-13934713fbd5
spec:
  replicas: 5
  selector:
    matchLabels:
      app: blue-green-demo-api
  strategy:
    blueGreen:
      activeService: blue-green-demo-api
      autoPromotionEnabled: true
      scaleDownDelaySeconds: 10
  template:
    ...
status:
  HPAReplicas: 5
  availableReplicas: 5
  blueGreen:
    activeSelector: 6765df48f4
  canary: {}
  conditions:
  - lastTransitionTime: "2021-11-03T08:26:17Z"
    lastUpdateTime: "2021-11-03T08:26:17Z"
    message: Rollout has minimum availability
    reason: AvailableReason
    status: "True"
    type: Available
  - lastTransitionTime: "2021-11-03T08:31:22Z"
    lastUpdateTime: "2021-11-03T08:31:22Z"
    message: RolloutCompleted
    reason: RolloutCompleted
    status: "True"
    type: Completed
  - lastTransitionTime: "2021-11-03T08:25:31Z"
    lastUpdateTime: "2021-11-03T08:31:22Z"
    message: ReplicaSet "blue-green-demo-api-6765df48f4" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  currentPodHash: 6765df48f4
  observedGeneration: "4"
  phase: Healthy
  readyReplicas: 5
  replicas: 10
  selector: app=blue-green-demo-api,rollouts-pod-template-hash=6765df48f4
  stableRS: 6765df48f4
  updatedReplicas: 5
kubectl get replicaset blue-green-demo-api-6695fbf66b -o yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  annotations:
    rollout.argoproj.io/desired-replicas: "5"
    rollout.argoproj.io/revision: "1"
  creationTimestamp: "2021-11-03T08:25:31Z"
  generation: 2
  labels:
    rollouts-pod-template-hash: 6695fbf66b
  name: blue-green-demo-api-6695fbf66b
  namespace: staging-exploitation
  ownerReferences:
  - apiVersion: argoproj.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Rollout
    name: blue-green-demo-api
    uid: 208364ff-0004-49fe-b8b6-13934713fbd5
  resourceVersion: "492999801"
  uid: 7483d383-fa3e-491b-beb7-072f74908468
spec:
  replicas: 5
  selector:
    matchLabels:
      app: blue-green-demo-api
      rollouts-pod-template-hash: 6695fbf66b
  template:
    ...
status:
  availableReplicas: 5
  fullyLabeledReplicas: 5
  observedGeneration: 2
  readyReplicas: 5
  replicas: 5
kubectl get replicaset blue-green-demo-api-6765df48f4 -o yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  annotations:
    rollout.argoproj.io/desired-replicas: "5"
    rollout.argoproj.io/revision: "2"
  creationTimestamp: "2021-11-03T08:30:40Z"
  generation: 2
  labels:
    rollouts-pod-template-hash: 6765df48f4
  name: blue-green-demo-api-6765df48f4
  namespace: staging-exploitation
  ownerReferences:
  - apiVersion: argoproj.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Rollout
    name: blue-green-demo-api
    uid: 208364ff-0004-49fe-b8b6-13934713fbd5
  resourceVersion: "493003244"
  uid: aa8645b7-c48e-498b-9cb0-8fa20cb0a661
spec:
  replicas: 5
  selector:
    matchLabels:
      app: blue-green-demo-api
      rollouts-pod-template-hash: 6765df48f4
  template:
    ...
status:
  availableReplicas: 5
  fullyLabeledReplicas: 5
  observedGeneration: 2
  readyReplicas: 5
  replicas: 5
EDIT: I just added the log to this comment. Looking closely at it, I found lines like
...
time="2021-11-03T08:12:59Z" level=error msg="rollout syncHandler error: Operation cannot be fulfilled on replicasets.apps \"blue-green-demo-api-6695fbf66b\": the object has been modified; please apply your changes to the latest version and try again" namespace=staging-exploitation rollout=blue-green-demo-api
...
E1103 08:12:59.297637 1 controller.go:174] Operation cannot be fulfilled on replicasets.apps "blue-green-demo-api-6695fbf66b": the object has been modified; please apply your changes to the latest version and try again
...
Could it be the problem?
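For context on the message quoted above: "the object has been modified; please apply your changes to the latest version and try again" is what the Kubernetes API server returns when an update carries a stale resourceVersion (optimistic concurrency). It is usually transient, because the writer can re-read the object and retry. The sketch below simulates that behaviour in plain Python. FakeStore, Conflict and update_with_retry are hypothetical names invented for this illustration; they are not part of Argo Rollouts or of any Kubernetes client.

```python
class Conflict(Exception):
    """Raised when an update carries a stale resourceVersion."""


class FakeStore:
    """Hypothetical in-memory stand-in for the API server, holding one flat object."""

    def __init__(self, obj):
        self.obj = dict(obj)

    def get(self):
        # Return a copy so callers cannot mutate the stored object directly.
        return dict(self.obj)

    def update(self, new_obj):
        # Reject writes that were based on an older version of the object.
        if new_obj["resourceVersion"] != self.obj["resourceVersion"]:
            raise Conflict("the object has been modified")
        stored = dict(new_obj)
        stored["resourceVersion"] += 1
        self.obj = stored
        return dict(stored)


def update_with_retry(store, mutate, attempts=5):
    """Re-read the latest object and re-apply the change after each conflict."""
    for _ in range(attempts):
        latest = store.get()
        mutate(latest)
        try:
            return store.update(latest)
        except Conflict:
            continue
    raise Conflict("gave up after %d attempts" % attempts)


store = FakeStore({"resourceVersion": 1, "replicas": 5})
stale = store.get()          # a reader takes a snapshot at version 1
store.update(store.get())    # another writer bumps the version to 2

try:
    store.update({**stale, "replicas": 0})   # write based on the stale snapshot
except Conflict as err:
    print("conflict:", err)

# Retrying against the latest version succeeds.
print(update_with_retry(store, lambda o: o.update(replicas=0)))
```

A conflict of this kind that disappears on the next reconcile is normally harmless; one that repeats on every attempt points at something else that keeps modifying the object.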
@dpiraud, those logs definitely provide a clue about this issue: they indicate that the controller can't add the scale-down annotation to the old RS due to some interruption. Since this problem seems reproducible in your setup, I am wondering if you could try a simple b/g rollout like the one at https://argoproj.github.io/argo-rollouts/features/bluegreen/.
BTW, I really like your traffic monitoring dashboard (https://user-images.githubusercontent.com/19174502/140032466-3e80b331-569b-44b5-aeee-93b524f7c372.png). Could you let me know how you generate the data? Thanks.
hi @huikang
I can't run this example "out of the box" on our platform, but I think my demo is very close to it. I'll dig into this log with our k8s provider and keep you informed if I find the problem.
Thanks for the kind words about the dashboard. It's simple: my API logs each call with the "color" of the version. The logs go into Elasticsearch with Grafana on top of it. I use "vegeta" to generate traffic.
Hi all
After digging with our k8s provider, we found that the Argo Rollouts user was not allowed to patch/delete ReplicaSets.
The error above (Operation cannot be fulfilled...) wasn't part of our problem, as it remains after fixing the authorizations.
Would it be a good idea to catch and log this error when authorizations are missing?
Not a bug, so I am closing this issue.
BR
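To illustrate the root cause described above, here is a minimal sketch of how a set of RBAC rules can be checked for missing verbs. It is a simplification: it only matches apiGroups, resources and verbs (with the "*" wildcard) and ignores resourceNames, subresources and aggregated roles. The rule set is a hypothetical example resembling the misconfiguration in this thread, not the actual role from the cluster, and rule_allows and missing_verbs are names made up for this example.

```python
def rule_allows(rule, verb, api_group, resource):
    """Return True if one RBAC-style rule grants verb on api_group/resource."""

    def matches(values, wanted):
        return "*" in values or wanted in values

    return (
        matches(rule["apiGroups"], api_group)
        and matches(rule["resources"], resource)
        and matches(rule["verbs"], verb)
    )


def missing_verbs(rules, needed_verbs, api_group, resource):
    """List the verbs in needed_verbs that no rule grants."""
    return [
        verb
        for verb in needed_verbs
        if not any(rule_allows(rule, verb, api_group, resource) for rule in rules)
    ]


# Hypothetical rules: ReplicaSets can be read and updated, but not patched or deleted.
rules = [
    {
        "apiGroups": ["apps"],
        "resources": ["replicasets"],
        "verbs": ["get", "list", "watch", "create", "update"],
    }
]

# The two verbs this thread found to be missing.
print(missing_verbs(rules, ["patch", "delete"], "apps", "replicasets"))
# prints ['patch', 'delete']
```

On a real cluster, kubectl auth can-i answers the same question authoritatively for a given user or service account.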
Summary
What happened/what you expected to happen?
With a blue/green demo app, the old RS is not scaled down after the rollout succeeds.
Diagnostics
Version:
1.1.0
Configuration
Rollout view
Logs