The current implementation of the KubernetesMachine._wait_machines_startup method continuously loops on watch events from list_namespaced_pod. In specific cases, such as critical Pod errors (like CNI errors), no further events are generated.
Consequently, the for loop blocks waiting for an event that never arrives, and the program hangs indefinitely.
To fix this, the loop needs a timeout that breaks it when no event arrives within a set interval. This change uses threading.Timer to start a 3-minute timer that is reset each time a new event is received. If no event arrives within those 3 minutes, the timer callback fires, reports an error, and terminates the program.
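For reviewers, here is a minimal sketch of the resettable-timer pattern described above. It is not the code in this PR: `EventWatchdog`, `watch_until_ready`, `is_ready`, and `on_timeout` are hypothetical names chosen for illustration, and `events` stands in for the watch stream consumed by `_wait_machines_startup`.

```python
import threading
from typing import Any, Callable, Iterable, Optional


class EventWatchdog:
    """Calls `on_timeout` if `reset()` is not called again within `timeout` seconds.

    Hypothetical helper for illustration; not part of the code base.
    """

    def __init__(self, timeout: float, on_timeout: Callable[[], None]) -> None:
        self._timeout = timeout
        self._on_timeout = on_timeout
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def reset(self) -> None:
        # A threading.Timer cannot be restarted, so cancel the old one
        # and start a fresh one on every reset.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._timeout, self._on_timeout)
            self._timer.daemon = True  # do not keep the process alive on exit
            self._timer.start()

    def cancel(self) -> None:
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def watch_until_ready(
    events: Iterable[Any],
    is_ready: Callable[[Any], bool],
    watchdog: EventWatchdog,
) -> bool:
    """Consume `events` until `is_ready(event)` is true, resetting the watchdog each time."""
    watchdog.reset()  # arm before the first event, in case none ever arrives
    try:
        for event in events:
            watchdog.reset()  # an event arrived, so restart the countdown
            if is_ready(event):
                return True
        return False
    finally:
        watchdog.cancel()  # never leave a timer pending after the loop ends
```

With `timeout=180.0` this gives the 3-minute window described above. Note that the callback runs in the timer's own thread, so raising an exception or calling `sys.exit()` inside it does not stop the main thread that is blocked in the loop. A callback that must end the program needs something process-wide, such as `os._exit()`.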