Closed r7vme closed 6 years ago
cc: @calvix @corest
Maybe one of you has seen something similar.
Here are the conditions from the "affected" pod:
- lastProbeTime: null
  lastTransitionTime: 2018-02-13T20:09:56Z
  status: "True"
  type: Initialized
- lastProbeTime: null
  lastTransitionTime: 2018-02-13T20:11:28Z
  status: "True"
  type: Ready
- lastProbeTime: null
  lastTransitionTime: 2018-02-13T20:09:56Z
  status: "True"
  type: PodScheduled
It became ready roughly 1m30s after being scheduled (20:09:56 to 20:11:28).
Okay, looks like we need to replace this check https://github.com/giantswarm/kvm-operator-node-controller/blob/master/provider/instances.go#L71 with one that checks whether the pod status is "running".
good catch
I think I know the reason.
The pod is probably marked ready only after it passes the health check from kubelet.
There are a few things that can delay this:
Sometimes the bridge interface is borked and doesn't have an IP, so the kvm pod waits for an IP on the interface. (This will reconcile in time, as soon as the flannel pod is killed because of the flannel healthcheck.)
This delay can mean the pod becomes ready only after quite some time.
In this case, VM will not be able to reach master VM, right?
No, this all gets resolved; it will reconcile in time. The only issue is that it will delay the readiness of the pod.
Ack. Looks like this is the reason. Anyway, checking pod status (running or not) instead of readiness is the proper behavior.
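To illustrate the difference between the two checks, here is a minimal sketch in Go. The PodCondition and PodStatus types below are simplified local stand-ins for the Kubernetes pod status fields, and isPodReady and isPodRunning are hypothetical names; none of this is the actual code in instances.go.

```go
package main

import "fmt"

// Minimal local stand-ins for the Kubernetes pod status fields discussed
// above. They are hypothetical simplifications, not the k8s.io/api types.
type PodCondition struct {
	Type   string // e.g. "Initialized", "Ready", "PodScheduled"
	Status string // "True", "False" or "Unknown"
}

type PodStatus struct {
	Phase      string // e.g. "Pending", "Running"
	Conditions []PodCondition
}

// isPodReady reports whether the Ready condition is "True".
// A missing Ready condition is treated as not ready.
func isPodReady(s PodStatus) bool {
	for _, c := range s.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// isPodRunning reports whether the pod phase is "Running".
func isPodRunning(s PodStatus) bool {
	return s.Phase == "Running"
}

func main() {
	// A pod that has started but has not yet passed its readiness probe.
	starting := PodStatus{
		Phase: "Running",
		Conditions: []PodCondition{
			{Type: "Initialized", Status: "True"},
			{Type: "Ready", Status: "False"},
			{Type: "PodScheduled", Status: "True"},
		},
	}
	fmt.Println(isPodRunning(starting), isPodReady(starting)) // prints: true false
}
```

With a readiness-based check, the pod above would count as missing during the delay; with a phase-based check it would not.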
If it is an important bug, shouldn't it have a #postmortem tag and be in the product board inbox?
never mind me, linked somewhere else.
While rolling out nodes in an on-prem cluster, I saw a few times that fresh worker nodes got deleted almost immediately after they appeared in the guest cluster.
But all pods were in the "Running" state. We need to look precisely at the difference between "Running" and "Ready".
Here is the output of the pod conditions: