Closed clivez closed 4 years ago
I'm not sure I understand the scenario, TBH. The node restarts, the Pods go to Terminating; so far so good.
Cleaner cleans the old entries when the node comes back, which is expected from my perspective. A Pod is a Pod is a Pod: StatefulSet never guaranteed static address allocations. What do you mean when you say "Pods ss-cinfo-0 and ss-cinfo-4 are deleted forcibly"?
This covers the broken-hardware scenario: when the hardware needs to be replaced it may take a very long time, and during this period the user may need to forcibly delete the 'Terminating' pods to make sure new pods of the StatefulSet are created and the service comes back. We never expected static address allocation here. The problem is: 5 pods are running, but only 4 DanmEps and their related IPs remain in the allocation, because 1 was mistakenly released by the Cleaner.
No service's high availability should depend on one instance, especially in telco. We definitely need to educate our users not to mess with the API in these scenarios. There is no README yet, but Cleaner is designed to work in conjunction with the normal K8s Pod termination life-cycle, so it will eventually reconcile state even without manual interaction, even in an outage scenario.
That being said, yeah, it is possible something is up with the Pod UIDs when it comes to a StatefulSet. I will look into it to understand the scenario better!
What I don't understand is how this call: https://github.com/nokia/danm-utils/blob/master/pkg/cleaner/cleaner.go#L99 could result in an error if the Pod did exist.
Maybe it is a race condition between the new Pod starting up and the already triggered cleanup procedure progressing to the deleteInterface phase. What is interesting is that you had two Pods evacuated from the same node, but only one had this issue. What happens if you repeat this test 5-10 times? Is the issue persistent, or intermittent?
After re-inspecting I'm getting more and more sure it is a race condition. Based on the logs, I think periodic cleaning started at 8:54:30, while possibly very close to that, at 8:54:29, the new Pod was being instantiated. During the ongoing CNI_ADD the DanmEp might have already been created, but the CNI_ADD operation was not yet finished. So when Cleaner activated, it saw two DanmEps belonging to Pod0 (one old and one new) and one belonging to Pod4 (the old one). It cleared all three (we see three entries in the Cleaner log). Two cleanings were justified (old Pod0 and old Pod4); one was not (new Pod0).
I think the following upgrades may be needed:
I have a StatefulSet with replicas: 5.
Then the node 'clive-alp-worker-03' is shut down.
Pods ss-cinfo-0 and ss-cinfo-4 are deleted forcibly; after the pods are recreated, the DanmEp for the new ss-cinfo-0 is deleted by mistake and its IP is also released.
Log from the cleaner pod:
And the new ss-cinfo-0 is created at 2020-02-14T08:54:29Z.