giantswarm / kvm-operator-node-controller

Manages Kubernetes nodes based on kvm-operator (host cluster) state.
Apache License 2.0

some new worker pods detected as not ready and deleted #12

Closed r7vme closed 6 years ago

r7vme commented 6 years ago

While rolling out nodes in an on-prem cluster, I saw a few times that fresh worker nodes got deleted almost immediately after they appeared in the guest cluster.

{"caller":"github.com/giantswarm/kvm-operator-node-controller/provider/instances.go:60","info":"checking pod","namespace":"fm8y4","pod":"worker-$
02o6-3262462077-dkwrx","time":"18-02-13 20:02:58.257"}                                                                                           
{"caller":"github.com/giantswarm/kvm-operator-node-controller/provider/instances.go:73","info":"pod not ready","namespace":"fm8y4","pod":"worker-
j02o6-3262462077-dkwrx","time":"18-02-13 20:02:58.260"}
{"caller":"kvm-operator-node-controller/controller.go:257","info":"deleting node since it is no longer present in cloud provider","node":"worker-
j02o6-3262462077-dkwrx","time":"18-02-13 20:02:58.260"}
{"caller":"kvm-operator-node-controller/controller.go:276","info":"node state","node":"worker-j02o6-3262462077-dkwrx","state":"False","time":"18-
02-13 20:02:58.261"}

But all of the pods were in the "Running" state. We need to look precisely at what the difference is between "Running" and "Ready".

Here is the output from the pod's conditions:

  - lastProbeTime: null
    lastTransitionTime: 2018-02-13T20:09:05Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2018-02-13T20:09:19Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2018-02-13T20:09:05Z
    status: "True"
    type: PodScheduled
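
For reference, a minimal sketch (assumed helper name, not the controller's actual code) of how the "Ready" signal shown above is read through the k8s.io/api/core/v1 types; note that it is a condition in pod.Status.Conditions and is separate from pod.Status.Phase, which flips to "Running" as soon as a container starts:

    package example

    import (
    	corev1 "k8s.io/api/core/v1"
    )

    // podIsReady mirrors the "Ready" entry in the conditions above: it is "True"
    // only after the readiness probe has succeeded, which can lag well behind
    // pod.Status.Phase turning Running.
    func podIsReady(pod *corev1.Pod) bool {
    	for _, cond := range pod.Status.Conditions {
    		if cond.Type == corev1.PodReady {
    			return cond.Status == corev1.ConditionTrue
    		}
    	}
    	return false
    }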
r7vme commented 6 years ago

cc: @calvix @corest

Maybe one of you has seen something similar.

r7vme commented 6 years ago

Here are the conditions from the "affected" pod:

  - lastProbeTime: null
    lastTransitionTime: 2018-02-13T20:09:56Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2018-02-13T20:11:28Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2018-02-13T20:09:56Z
    status: "True"
    type: PodScheduled

It became ready after 1m30s.

r7vme commented 6 years ago

Okay, looks like we need to replace this check https://github.com/giantswarm/kvm-operator-node-controller/blob/master/provider/instances.go#L71 with a check that the pod status is "Running".
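
Roughly, the idea would be to key the existence check off the pod phase instead of the readiness condition; a sketch under that assumption (hypothetical helper, not the actual instances.go code):

    package provider

    import (
    	corev1 "k8s.io/api/core/v1"
    )

    // instanceExists sketches the proposed check: only report the backing pod
    // as gone (so the node controller deletes the node) when its phase is not
    // Running, instead of whenever the PodReady condition is still false.
    func instanceExists(pod *corev1.Pod) bool {
    	return pod.Status.Phase == corev1.PodRunning
    }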

calvix commented 6 years ago

good catch

calvix commented 6 years ago

I think I know the reason.

The pod probably only becomes ready after it passes the health check from the kubelet.

There are a few things that can delay this:

  1. https://github.com/giantswarm/kvm-operator/blob/master/service/kvmconfig/v4/key/key.go#L27 The initial delay is set to 60s since we account for boot time (see the probe sketch below).
  2. Sometimes the bridge interface is borked and doesn't have an IP, so the kvm pod waits for an IP on the interface. (This will reconcile in time, as soon as the flannel pod is killed by the flannel health check.)

    This delay can mean the pod only becomes Ready after quite some time.
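
For context on point 1, the kind of readiness probe meant there looks roughly like this; the path, port, and other values are illustrative assumptions, not copied from key.go:

    package example

    import (
    	corev1 "k8s.io/api/core/v1"
    	"k8s.io/apimachinery/pkg/util/intstr"
    )

    // exampleReadinessProbe shows a probe with a 60s initial delay, the kind of
    // setting referenced in kvmconfig/v4/key/key.go. Path and port are
    // placeholders. (In the API version contemporary with this issue the
    // embedded field is called Handler; newer releases renamed it ProbeHandler.)
    func exampleReadinessProbe() *corev1.Probe {
    	return &corev1.Probe{
    		Handler: corev1.Handler{
    			HTTPGet: &corev1.HTTPGetAction{
    				Path: "/healthz",
    				Port: intstr.FromInt(8080),
    			},
    		},
    		// The kubelet waits this long before the first probe, so even a
    		// healthy pod cannot become Ready earlier than 60s after start.
    		InitialDelaySeconds: 60,
    	}
    }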

r7vme commented 6 years ago

Sometimes the bridge interface is borked and doesn't have an IP, so the kvm pod waits for an IP on the interface. (This will reconcile in time, as soon as the flannel pod is killed by the flannel health check.)

In this case, the VM will not be able to reach the master VM, right?

calvix commented 6 years ago

In this case, the VM will not be able to reach the master VM, right?

No, this all gets resolved; it will reconcile in time. The only issue is that it delays the readiness of the pod.

r7vme commented 6 years ago

Ack. Looks like this is the reason. Either way, checking the pod status (running or not) instead of readiness is the proper behavior.

othylmann commented 6 years ago

If it is an important bug, shouldn't it have a #postmortem tag and be in the product board inbox?

othylmann commented 6 years ago

Never mind me, it's linked somewhere else.