Kubernetes startup watch may never terminate if there is a Pod error

The current implementation of the KubernetesMachine._wait_machines_startup method loops over the watch events produced for list_namespaced_pod. In some cases, such as critical Pod errors (for example, CNI errors), no further events are generated.

Consequently, the for loop blocks while waiting for an event that never arrives, and the program hangs indefinitely.

To resolve this issue, the loop needs a mechanism that breaks it after a defined threshold. Our approach uses threading.Timer to set a 3-minute timer that is reset each time a new event is received. If no event arrives within the 3-minute interval, the timer's callback fires, signals an error, and terminates the program.
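The timer-reset idea can be sketched as follows. This is a minimal illustration, not the actual Kathara code: `WatchdogTimer`, `consume_events`, and the `on_timeout` callback are hypothetical names, and the event source is a plain iterable standing in for the Kubernetes watch stream. Since a `threading.Timer` cannot be rewound once started, "resetting" here means cancelling the pending timer and starting a new one.

```python
import threading


class WatchdogTimer:
    """Hypothetical helper: calls `on_timeout` if `reset()` is not
    invoked again within `timeout` seconds."""

    def __init__(self, timeout, on_timeout):
        self._timeout = timeout
        self._on_timeout = on_timeout
        self._lock = threading.Lock()
        self._timer = None

    def reset(self):
        # threading.Timer cannot be rewound, so cancel the pending
        # timer (if any) and start a fresh one.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._timeout, self._on_timeout)
            self._timer.daemon = True  # do not keep the process alive
            self._timer.start()

    def cancel(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def consume_events(events, timeout, on_timeout):
    """Iterate over `events`, re-arming the watchdog after each one.

    `events` stands in for the watch stream; in the scenario described
    above, `timeout` would be 180 seconds.
    """
    watchdog = WatchdogTimer(timeout, on_timeout)
    watchdog.reset()  # arm before waiting for the first event
    processed = []
    try:
        for event in events:
            watchdog.reset()  # an event arrived: restart the countdown
            processed.append(event)
    finally:
        watchdog.cancel()  # startup finished or failed: disarm
    return processed
```

Note that the callback runs in the timer's own thread, so it cannot raise an exception directly inside the blocked for loop. It has to act from outside the loop, for example by recording the error and terminating the program as proposed above.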

KatharaFramework / Kathara, issue #258