openkruise / kruise

Automated management of large-scale applications on Kubernetes (incubating project under CNCF)
https://openkruise.io

[BUG] When 'ordinals' are set, the 'partition' behaves differently during updates and deletions #1749

Open karlhjm opened 4 days ago

karlhjm commented 4 days ago

What happened:

After setting ordinals, pods are mapped to logical indexes inconsistently when updating, scaling, or deleting pods with a partition. E.g. with rollingUpdate and ordinals=2, after changing partition from 5 to 3, pod-3 still uses the old template, but pod-3 gets the new template when it is deleted and recreated.

What you expected to happen: In the above example, p3 should keep the old template when it is recreated. With ordinals=2, the logical index of p3 is 3-2=1, which is smaller than partition=3.

How to reproduce it (as minimally and precisely as possible):

1. Set ordinals=2, replicas=5, partition=7. Old pods: [p2, p3, p4, p5, p6].
2. Update the sts template and change partition=5. Nothing happens. Old pods: [p2, p3, p4, p5, p6].
3. Update partition=3. The sts updates automatically. Old pods: [p2, p3, p4], new pods: [p5, p6].
4. Delete p3. It is recreated with the new template. Old pods: [p2, p4], new pods: [p3, p5, p6].
5. Delete p4. It is recreated with the new template. Old pods: [p2], new pods: [p3, p4, p5, p6].
6. Delete p2. It is recreated with the old template. Old pods: [p2], new pods: [p3, p4, p5, p6].
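
The steps above are consistent with two different comparisons being applied to the same partition value. The following Go sketch is only an illustration of that hypothesis, not the controller's actual code: the function names `updatedByLogicalIndex` and `updatedByRawOrdinal` are hypothetical, and the rules themselves are inferred from the observed behavior in this report.

```go
package main

import "fmt"

// updatedByLogicalIndex is a hypothetical rule: a pod gets the new template
// when its logical index (ordinal minus the ordinals start) is >= partition.
// This matches what the rolling update did in steps 2 and 3.
func updatedByLogicalIndex(ordinal, start, partition int) bool {
	return ordinal-start >= partition
}

// updatedByRawOrdinal is a hypothetical rule: a pod gets the new template
// when its raw ordinal is >= partition, ignoring the ordinals start.
// This matches what recreation after deletion did in steps 4 to 6.
func updatedByRawOrdinal(ordinal, partition int) bool {
	return ordinal >= partition
}

func main() {
	// Values taken from the reproduction steps: ordinals=2, replicas=5, partition=3.
	const start, replicas, partition = 2, 5, 3
	for ordinal := start; ordinal < start+replicas; ordinal++ {
		fmt.Printf("p%d newTemplate(logical)=%v newTemplate(raw)=%v\n",
			ordinal,
			updatedByLogicalIndex(ordinal, start, partition),
			updatedByRawOrdinal(ordinal, partition))
	}
}
```

Under these assumptions the two rules agree for p2, p5 and p6 and disagree only for p3 and p4, which are exactly the pods that changed template after being deleted.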

Anything else we need to know?: Partitioned rolling updates in k8s https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#partitions

Environment: ubuntu 16.04

ABNER-1 commented 4 days ago

Thank you @karlhjm. I will test this case later.

ABNER-1 commented 14 hours ago

Thank you @karlhjm for your report. I have reproduced this case and found that it is caused by:

https://github.com/openkruise/kruise/blob/7dcdf8d95191fd441a9af61c01945d9d7b6ae3bc/pkg/controller/statefulset/stateful_set_utils.go#L510-L517