Closed: dgkanatsios closed this issue 2 years ago
Unfortunately this doesn't seem to be the case. Trying manual evictions gives the following:
12s Normal GameServerProcessExited gameserver/gameserverbuild-sample-netcore-42kwd GameServer process exited with code 0 and reason Completed v1.PodStatus{Phase:"Succeeded", Conditions:[]v1.PodCondition{v1.PodCondition{Type:"Initialized", Status:"True", LastProbeTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), LastTransitionTime:time.Date(2022, time.September, 25, 3, 34, 3, 0, time.Local), Reason:"PodCompleted", Message:""}, v1.PodCondition{Type:"Ready", Status:"False", LastProbeTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), LastTransitionTime:time.Date(2022, time.September, 25, 3, 36, 11, 0, time.Local), Reason:"PodCompleted", Message:""}, v1.PodCondition{Type:"ContainersReady", Status:"False", LastProbeTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), LastTransitionTime:time.Date(2022, time.September, 25, 3, 36, 11, 0, time.Local), Reason:"PodCompleted", Message:""}, v1.PodCondition{Type:"PodScheduled", Status:"True", LastProbeTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), LastTransitionTime:time.Date(2022, time.September, 25, 3, 34, 2, 0, time.Local), Reason:"", Message:""}}, Message:"", Reason:"", NominatedNodeName:"", HostIP:"172.18.0.2", PodIP:"10.244.3.33", PodIPs:[]v1.PodIP{v1.PodIP{IP:"10.244.3.33"}}, StartTime:time.Date(2022, time.September, 25, 3, 34, 2, 0, time.Local), InitContainerStatuses:[]v1.ContainerStatus{v1.ContainerStatus{Name:"initcontainer", State:v1.ContainerState{Waiting:(*v1.ContainerStateWaiting)(nil), Running:(*v1.ContainerStateRunning)(nil), Terminated:(*v1.ContainerStateTerminated)(0xc00031a850)}, LastTerminationState:v1.ContainerState{Waiting:(*v1.ContainerStateWaiting)(nil), Running:(*v1.ContainerStateRunning)(nil), Terminated:(*v1.ContainerStateTerminated)(nil)}, Ready:true, RestartCount:0, Image:"docker.io/library/thundernetes-initcontainer:36d21df", ImageID:"sha256:d681c5a6342f01a1686b83c10fb28e7da73c297ae874eb557f06e15cf4e3e955", 
ContainerID:"containerd://a6732cef26cf19d36ce5ee0ac4053e0506dc07ca4fb20f48bea29fd3ff7759d8", Started:(*bool)(nil)}}, ContainerStatuses:[]v1.ContainerStatus{v1.ContainerStatus{Name:"thundernetes-sample-netcore", State:v1.ContainerState{Waiting:(*v1.ContainerStateWaiting)(nil), Running:(*v1.ContainerStateRunning)(nil), Terminated:(*v1.ContainerStateTerminated)(0xc00031a8c0)}, LastTerminationState:v1.ContainerState{Waiting:(*v1.ContainerStateWaiting)(nil), Running:(*v1.ContainerStateRunning)(nil), Terminated:(*v1.ContainerStateTerminated)(nil)}, Ready:false, RestartCount:0, Image:"ghcr.io/playfab/thundernetes-netcore:0.5.0", ImageID:"ghcr.io/playfab/thundernetes-netcore@sha256:a65af58caec93940e263cf85e669a4925e110bc2b3d1e5565ec2ab13643e3fe4", ContainerID:"containerd://8a30b30a42fda5a7ae4859d1b25c9f9a97c3607249359b6e8837179e11ba4f46", Started:(*bool)(0xc000ce3445)}}, QOSClass:"Burstable", EphemeralContainerStatuses:[]v1.ContainerStatus(nil)}
"Evicted" is listed in the kubectl get events output, but I'm not sure how to detect it through the API. At the same time, the controller detects that the Pod is not ready and deletes it, so the impact is only on the reporting side. Closing until we find a better way.
Currently, the GameServer controller does not properly track when a Pod has been evicted (e.g. due to resource pressure on the Node). When the Pod is evicted, the Pod status phase is Failed and the status reason is Evicted. We should log it appropriately and emit a metric.
Relevant code: https://github.com/PlayFab/thundernetes/blob/544eca21b35a387692fd50ad45ae52e226dfa96c/pkg/operator/controllers/gameserver_controller.go#L178
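A minimal sketch of the check described above, in case it helps the discussion. To stay self-contained it uses simplified stand-in types instead of importing k8s.io/api/core/v1; the names PodStatus, isEvicted, describeExit and evictionCounter are hypothetical and are not taken from the thundernetes code base. It assumes the eviction shows up as Phase "Failed" with Reason "Evicted" on the Pod status, so it would not match the manual eviction in the event dump above, where the Phase was "Succeeded".

```go
package main

import "fmt"

// PodPhase and PodStatus are simplified stand-ins (assumption) for the
// corresponding corev1 types, reduced to the fields this check needs.
type PodPhase string

const (
	PodFailed    PodPhase = "Failed"
	PodSucceeded PodPhase = "Succeeded"
)

type PodStatus struct {
	Phase   PodPhase
	Reason  string
	Message string
}

// reasonEvicted is the status reason assumed to be set on an evicted Pod.
const reasonEvicted = "Evicted"

// isEvicted reports whether the Pod status looks like an eviction:
// phase Failed together with reason Evicted.
func isEvicted(s PodStatus) bool {
	return s.Phase == PodFailed && s.Reason == reasonEvicted
}

// evictionCounter is a hypothetical placeholder for a real metric.
type evictionCounter struct{ total int }

// describeExit returns a log line for a terminated GameServer Pod and
// increments the counter when the Pod was evicted.
func describeExit(name string, s PodStatus, c *evictionCounter) string {
	if isEvicted(s) {
		c.total++
		return fmt.Sprintf("GameServer %s was evicted: %s", name, s.Message)
	}
	return fmt.Sprintf("GameServer %s exited with phase %s", name, s.Phase)
}

func main() {
	c := &evictionCounter{}
	evicted := PodStatus{Phase: PodFailed, Reason: reasonEvicted, Message: "node was low on memory"}
	completed := PodStatus{Phase: PodSucceeded}

	fmt.Println(describeExit("gs-a", evicted, c))
	fmt.Println(describeExit("gs-b", completed, c))
	fmt.Println("evictions:", c.total)
}
```

In the controller, the same comparison would presumably run against the real Pod status before the existing GameServerProcessExited event is emitted, with the counter replaced by whatever metrics library the operator already uses.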