Closed: katsew closed this issue 4 months ago.
GameServer is expected to create a new Pod if a Pod fails with reason OutOfpods.
Sorry, but maybe I'm missing something - but how is Agones supposed to create a new Pod if there isn't room in the cluster?
Or do you mean that Agones doesn't recover when there should be room?
I'm also concerned that if you delete all the pods in the kube-system namespace, you are also breaking Kubernetes.
GameServer is expected to create a new Pod if a Pod fails with reason OutOfpods.
Sorry, but maybe I'm missing something - but how is Agones supposed to create a new Pod if there isn't room in the cluster?
Or do you mean that Agones doesn't recover when there should be room?
I mean that Agones doesn't recover when there should be room.
I'm also concerned that if you delete all the pods in the kube-system namespace, you are also breaking Kubernetes.
Usually, if a Node has exceeded its Pod capacity and there are no other Nodes the Pod can be scheduled on, the Pod's status is stuck at Pending. Since failure with OutOfpods is rare, it was likely necessary to put the cluster into a broken state in order to reproduce it. As far as I could tell, simply deleting the GameServer did not reproduce it.
I have updated the reproduction procedure to be more accurate. By forcibly deleting pods, it can be reproduced in one attempt.
I'm curious - what does kubectl delete pod --force --grace-period=0 --all -n kube-system force to happen?
Is that a required step to replicate the issue?
Also, can you share a kubectl describe of the Pod that has failed please as well?
It sounds like we should move the GameServer to Unhealthy if the backing pod moves to an OutOfpods state, but I'm just trying to nail down exactly what is happening here.
I'm curious - what does kubectl delete pod --force --grace-period=0 --all -n kube-system force to happen? Is that a required step to replicate the issue?
I tried forcibly deleting only the GameServer pods, but still could not reproduce the problem. So I checked which kube-system component actually causes this OutOfpods failure; it turns out kube-proxy is causing the problem.
I have updated the confirmed reproduction method.
Also, can you share a kubectl describe of the Pod that has failed please as well?
Sure, here it is.
Name:           simple-game-server-qxtcq-wbs6p
Namespace:      default
Priority:       0
Node:           gke-friday-developme-gameserver-pool--f788a1c2-szxx/
Start Time:     Thu, 28 Jul 2022 03:15:22 +0000
Labels:         agones.dev/gameserver=simple-game-server-qxtcq-wbs6p
                agones.dev/role=gameserver
Annotations:    agones.dev/container: simple-game-server
                agones.dev/sdk-version: 1.20.0
                cluster-autoscaler.kubernetes.io/safe-to-evict: false
Status:         Failed
Reason:         OutOfpods
Message:        Pod Node didn't have enough resource: pods, requested: 1, used: 32, capacity: 32
IP:
IPs:            <none>
Controlled By:  GameServer/simple-game-server-qxtcq-wbs6p
Containers:
  agones-gameserver-sidecar:
    Image:      gcr.io/agones-images/agones-sdk:1.20.0
    Port:       <none>
    Host Port:  <none>
    Args:
      --grpc-port=9357
      --http-port=9358
    Requests:
      cpu:  30m
    Liveness:  http-get http://:8080/healthz delay=3s timeout=1s period=3s #success=1 #failure=3
    Environment:
      GAMESERVER_NAME:  simple-game-server-qxtcq-wbs6p
      POD_NAMESPACE:    default (v1:metadata.namespace)
      FEATURE_GATES:    CustomFasSyncInterval=false&Example=true&NodeExternalDNS=true&PlayerAllocationFilter=false&PlayerTracking=false&SDKGracefulTermination=false&StateAllocationFilter=false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s4rzz (ro)
  simple-game-server:
    Image:      gcr.io/agones-images/simple-game-server:0.13
    Port:       7654/UDP
    Host Port:  7258/UDP
    Requests:
      cpu:     0
      memory:  0
    Liveness:  http-get http://:8080/gshealthz delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:
      AGONES_SDK_GRPC_PORT:  9357
      AGONES_SDK_HTTP_PORT:  9358
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from empty (ro)
Volumes:
  empty:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-s4rzz:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason             Age    From                Message
  ----     ------             ----   ----                -------
  Warning  FailedScheduling   2m44s  default-scheduler   0/1 nodes are available: 1 Too many pods.
  Warning  FailedScheduling   2m42s  default-scheduler   0/1 nodes are available: 1 Too many pods.
  Normal   Scheduled          43s    default-scheduler   Successfully assigned default/simple-game-server-qxtcq-wbs6p to gke-friday-developme-gameserver-pool--f788a1c2-szxx
  Normal   NotTriggerScaleUp  2m44s  cluster-autoscaler  pod didn't trigger scale-up:
  Warning  OutOfpods          44s    kubelet             Node didn't have enough resource: pods, requested: 1, used: 32, capacity: 32
Can you reproduce the issue without actively deleting Kubernetes components?
Today I tried to reproduce it without forcibly deleting kube-proxy, but I couldn't.
First, I deployed a static pod with a spec similar to the kube-proxy pod, then deleted it forcibly, but that didn't work.
Here is the set of specs I tried to align with kube-proxy.
The actual Pod resource is this:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: kube-system
  labels:
    component: nginx
    tier: node
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Node
    name: gke-friday-developme-gameserver-pool--f788a1c2-pdj1
    uid: 37d22a61-6e19-4729-bf3e-86a8823c9215
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: 100m
  nodeName: gke-friday-developme-gameserver-pool--f788a1c2-pdj1
  priorityClassName: system-node-critical
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    operator: Exists
  - effect: NoSchedule
    operator: Exists
Next, I tried setting pods to 0 in a ResourceQuota and forcibly deleting the GameServer, which also did not work.
# deploy to the same namespace where the GameServers exist
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pod-counts
  namespace: default
spec:
  hard:
    pods: "0"
Here is the log for kubelet.
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.860377 14147 kubelet.go:1950] "SyncLoop DELETE" source="api" pods=[kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1]
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.860644 14147 kubelet_pods.go:1520] "Generating pod status" pod="kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.861193 14147 kubelet.go:1668] "Trying to delete pod" pod="kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1" podUID=9c095625-5113-476e-a638-bc8a78a9271b
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.861335 14147 mirror_client.go:125] "Deleting a mirror pod" pod="kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1" podUID=0xc000b4b060
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.872105 14147 config.go:278] "Setting pods for source" source="api"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.873388 14147 kubelet.go:1944] "SyncLoop REMOVE" source="api" pods=[kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1]
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.883964 14147 config.go:278] "Setting pods for source" source="api"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.884289 14147 config.go:383] "Receiving a new pod" pod="default/simple-game-server-psqqb-hnz28"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.887060 14147 volume_manager.go:394] "Waiting for volumes to attach and mount for pod" pod="kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.887324 14147 volume_manager.go:425] "All volumes are attached and mounted for pod" pod="kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.887937 14147 kuberuntime_manager.go:711] "computePodActions got for pod" podActions={KillPod:false CreateSandbox:false SandboxID:2d3ba683c2accef9b8e33e0d44ff5602267cb2b4645f5cce78d4f04ff6c20c2a Attempt:0 NextInitContainerToStart:nil ContainersToStart:[] ContainersToKill:map[] EphemeralContainersToStart:[]} pod="kube-system/kube-proxy-gke-friday-developme-gameserver-pool--f788a1c2-pdj1"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.888478 14147 kubelet.go:1934] "SyncLoop ADD" source="api" pods=[default/simple-game-server-psqqb-hnz28]
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.888680 14147 topology_manager.go:187] "Topology Admit Handler"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.889513 14147 predicate.go:143] "Predicate failed on Pod" pod="simple-game-server-psqqb-hnz28_default(34520514-3a31-4fc2-a039-5727211e7f4b)" err="Node didn't have enough resource: pods, requested: 1, used: 32, capacity: 32"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.890657 14147 event.go:291] "Event occurred" object="default/simple-game-server-psqqb-hnz28" kind="Pod" apiVersion="v1" type="Warning" reason="OutOfpods" message="Node didn't have enough resource: pods, requested: 1, used: 32, capacity: 32"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.922069 14147 status_manager.go:586] "Patch status for pod" pod="default/simple-game-server-psqqb-hnz28" patchBytes="{\"metadata\":{\"uid\":\"34520514-3a31-4fc2-a039-5727211e7f4b\"},\"status\":{\"conditions\":null,\"message\":\"Pod Node didn't have enough resource: pods, requested: 1, used: 32, capacity: 32\",\"phase\":\"Failed\",\"qosClass\":null,\"reason\":\"OutOfpods\",\"startTime\":\"2022-07-29T01:42:42Z\"}}"
Jul 29 01:42:42 gke-friday-developme-gameserver-pool--f788a1c2-pdj1 kubelet[14147]: I0729 01:42:42.922330 14147 status_manager.go:595] "Status for pod updated successfully" pod="default/simple-game-server-psqqb-hnz28" statusVersion=1 status={Phase:Failed Conditions:[] Message:Pod Node didn't have enough resource: pods, requested: 1, used: 32, capacity: 32 Reason:OutOfpods NominatedNodeName: HostIP: PodIP: PodIPs:[] StartTime:2022-07-29 01:42:42 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[] QOSClass: EphemeralContainerStatuses:[]}
Any ideas on how to reproduce it?
I have updated the reproduction steps. I said I deployed a static pod in kube-system, but I actually deployed a bare pod, not a static pod. I created a static pod following this step, and then the issue was reproduced without deleting kube-proxy.
Maybe a silly question but - why would someone add that manifest to a node? (most people I would expect are on cloud providers and either (a) don't have access or (b) will have it overwritten pretty fast)
According to the documentation, Static Pods are meant to let users deploy their own control plane components, but I'm not sure whether there are other cases where users actually use them. Just to be clear, I used Static Pods only to reproduce this issue; I do not use them in a real environment.
We're currently working around this issue by running descheduler to evict Pods that failed with OutOfpods. However, there is a delay of a certain period, since descheduler runs as a cronjob.
So, I would like to submit a PR that solves this issue, but I don't feel I can put the reproduction method into a test case...
Today I encountered the same behavior as this issue when the Pod failed with OutOfcpu.
It seems that when the Pod fails for certain reasons, the GameServer does not recover automatically.
@markmandel
I've been working on this issue, and it seems that it is caused by the insufficient-resource error in the kubelet: https://github.com/kubernetes/kubernetes/blob/v1.21.12/pkg/kubelet/lifecycle/predicate.go#L140
My suggestion for fixing this is to add a condition here that checks whether the Pod failed with an insufficient-resource error: https://github.com/googleforgames/agones/blob/main/pkg/gameservers/health.go#L105
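For illustration, the condition I have in mind would look something like the sketch below; isInsufficientResourceFailure is a hypothetical helper, not the actual shape of health.go. The kubelet predicate linked above sets status.phase to Failed and status.reason to "OutOf" plus the resource name (OutOfpods, OutOfcpu, ...):

package gameservers

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// isInsufficientResourceFailure reports whether the backing Pod was rejected
// by the kubelet admission predicate for lack of node resources. In that case
// the kubelet sets status.phase=Failed and status.reason to "OutOf<resource>"
// (e.g. OutOfpods, OutOfcpu), per the predicate.go code linked above.
func isInsufficientResourceFailure(pod *corev1.Pod) bool {
	return pod.Status.Phase == corev1.PodFailed &&
		strings.HasPrefix(pod.Status.Reason, "OutOf")
}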
If it's ok to apply this fix, I'll submit a PR for this.
What do you think?
I'll be honest, I'm still not understanding what the actual issue here is. It seems like you have to break Kubernetes to make it happen -- which doesn't sound like an actual bug, it sounds like an extreme edge case.
Also, a Pod in pending state is an indicator to the cluster autoscaler that it should expand the cluster - so changing that behaviour says to me that we should leave this alone.
If you can't replicate this issue without actually messing around with the underlying Kubernetes system, I'm not sure we should be considering a fix here?
I too am aware that I have hit an edge case rather than found a bug. We currently downsize the entire cluster after work hours and bring it back to where it was before work hours start. I feel this makes it easier to step on edge cases.
Also, a Pod in pending state is an indicator to the cluster autoscaler that it should expand the cluster - so changing that behaviour says to me that we should leave this alone.
There may be some misunderstanding in this part. The problem is not a Pod in the Pending state, but one in the Failed state. The GameServer does not recreate a backing Pod that is in the Failed state, and the GameServer never transitions out of Scheduled, even if the cluster autoscaler scales out nodes.
This can prevent the FleetAutoscaler from working properly, since the GameServer is stuck in the Scheduled state and the FleetAutoscaler cannot scale until the failed Pod is deleted manually.
If you can't replicate this issue without actually messing around with the underlying Kubernetes system, I'm not sure we should be considering a fix here?
Maybe I should ask the sig-node community for help replicating the issue without killing static pods.
I don't know if it's worth mentioning, but a Pod created from a Deployment does not get stuck in this state; it becomes Running again.
So my thought is that it would be nice if Pods created by a GameServer could recover in the same way.
Gotcha!
So ultimately it sounds like if a Pod status is Failed - we should handle that general case (less of an issue with OutOfPods, but more generally if a Pod is in a Failed state).
I wonder if there is an easy way to just create a Failed Pod somehow, and use that as our test case. It does sound like if a Pod has failed for any reason, it should be moved to Unhealthy anyway.
I had a quick look to see if there was an easy way to make that happen though. Did you have any luck with sig-node?
Thank you for straightening out the issue.
I wonder if there is an easy way to just create a Failed Pod somehow, and use that as our test case. It does sound like if a Pod has failed for any reason, it should be moved to Unhealthy anyway. I had a quick look to see if there was an easy way to make that happen though. Did you have any luck with sig-node?
Sorry, I've been too busy to ask the community for help yet, but I will do so soon.
@markmandel
Sorry it took so long to ask. I posted the question to the k8s community Slack channel, but couldn't get an answer. I've also tried to create a failed pod, but I have no idea how to change a pod's status.phase to Failed.
Have you found any way to create a Failed Pod?
Looking at: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/
Failed: All containers in the Pod have terminated, and at least one container has terminated in failure. That is, the container either exited with non-zero status or was terminated by the system.
If this happens (assuming there is a Fleet in use), this will kick the GameServer into Unhealthy, and it will then get replaced with a new Pod.
...also, if we can't replicate the issue, is it an issue?
@markmandel
I found a manifest to replicate the issue.
We have to set restartPolicy to Never, then exit the containers with a non-zero status.
To exit all containers with a non-zero status, I had to add hostPID: true to be able to see all container process ids.
I also found that containers that exit with a zero status get stuck in phase Succeeded. Should we handle this case, too?
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
name: simple-game-server
namespace: default
spec:
replicas: 1
template:
spec:
ports:
- name: default
containerPort: 7654
template:
spec:
hostPID: true
restartPolicy: Never
containers:
- name: simple-game-server
image: gcr.io/agones-images/simple-game-server:0.13
command:
- sh
- -cx
- |
pgrep -n sdk-server | xargs kill -9
exit 1
Related issue: https://github.com/googleforgames/agones/issues/2361
I would like to chime in here as it seems like it's the same issue.
There is a relatively fresh Kubernetes feature - https://kubernetes.io/docs/concepts/architecture/nodes/#graceful-node-shutdown - and it seems it can lead to pods being transitioned into the Failed state too:
Status: Failed
Reason: Terminated
Message: Pod was terminated in response to imminent node shutdown.
I think the controlling GameServer should indeed be moved to Unhealthy here as the correct reaction.
I don't have a concrete way to reproduce it, unfortunately, as we encountered the issue in production on a loaded cluster.
But I can guess that this will happen if the node's shutdownGracePeriod and shutdownGracePeriodCriticalPods are not 0 (to enable the feature) but are not long enough to actually terminate the containers inside the pod gracefully, due to them having a bigger terminationGracePeriodSeconds and actually using it up.
This is all kinds of fun, because of Pod restarts, so I appreciate people digging in.
One thing I'm trying to work out from the documentation is: if the Pod is in a Failed state, is that the final state? I.e. do we know if a Pod could restart its way out of a Failed state? @roberthbailey @zmerlynn @gongmax @igooch any of you know? The docs don't seem clear to me.
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase
status.phase is a synthetic view really meant for humans. As pointed out in the link: The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. The phase is not intended to be a comprehensive rollup of observations of container or Pod state, nor is it intended to be a comprehensive state machine. Really, you can think of status.phase as aggregating the state of each container and potentially other Pod conditions.
That said, with restartPolicy: Never, I would expect Failed to be terminal (except for possibly some nuance around the state of the SDK sidecar container). It would be useful to have the full kubectl get -oyaml for the Pod in question rather than the describe view, just to see.
Since we run (by default) with restartPolicy: Always on the Pod, we have to assume there is a restart.
@unlightable in your situation, I assume once the node was torn down, the GameServer was replaced on another node?
@unlightable in your situation, I assume once the node was torn down, the GameServer was replaced on another node?
Nope. The node is not torn down, merely rebooted. I'm guessing that removing it from the cluster completely could actually destroy the pod and "fix" everything, but I can't confirm that yet.
Pods do look like that though:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    agones.dev/container: gameserver
    agones.dev/ready-container-id: docker://6b40d60782e000a35405845e48d2daf842c634eff8a90c47171b8d7a114fe50d
    agones.dev/sdk-version: 1.27.0
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
  creationTimestamp: "2023-01-04T04:02:05Z"
  labels:
    agones.dev/gameserver: live-pgwc9-bjs2h
    agones.dev/role: gameserver
    app: DedicatedServer
  name: live-pgwc9-bjs2h
  namespace: live
  ownerReferences:
  - apiVersion: agones.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: GameServer
    name: live-pgwc9-bjs2h
    uid: d5ef5d46-e4d1-4fb5-99ca-7b41c04b06ca
  resourceVersion: "82992068"
  uid: 47922441-67f0-490c-946d-897ba9546ba4
spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              agones.dev/role: gameserver
          topologyKey: kubernetes.io/hostname
        weight: 100
  containers:
  - args:
    - --grpc-port=9357
    - --http-port=9358
    env:
    - name: GAMESERVER_NAME
      value: live-pgwc9-bjs2h
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: FEATURE_GATES
      value: CustomFasSyncInterval=true&Example=true&PlayerAllocationFilter=false&PlayerTracking=false&ResetMetricsOnDelete=false&SDKGracefulTermination=true&StateAllocationFilter=true
    image: gcr.io/agones-images/agones-sdk:1.27.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /healthz
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 3
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: agones-gameserver-sidecar
    resources:
      requests:
        cpu: 30m
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-n95dr
      readOnly: true
  - command:
    - /app/DedicatedServer
    - --null-con
    - --enable-eac
    - --initial-nice
    - "10"
    - +exec
    - /var/xo-ds-config/config.cfg
    env:
    - name: AGONES_SDK_GRPC_PORT
      value: "9357"
    - name: AGONES_SDK_HTTP_PORT
      value: "9358"
    image: cr.yandex/crph7uvg1chcap6rvt9g/xo-gameserver:2.2.10.231067
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/bash
          - /var/xo-ds-config/preStop.sh
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /gshealthz
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 20
      successThreshold: 1
      timeoutSeconds: 1
    name: gameserver
    ports:
    - containerPort: 35000
      hostPort: 35063
      protocol: UDP
    resources:
      limits:
        cpu: "1"
        memory: 800Mi
      requests:
        cpu: 400m
        memory: 400Mi
    securityContext:
      capabilities:
        add:
        - SYS_NICE
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/xo-ds-config
      name: xo-ds-config
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-n95dr
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: cl1drlo5quu5uo06e12s-onul
  nodeSelector:
    yandex.cloud/preemptible: "true"
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: xo-ds
  serviceAccountName: xo-ds
  terminationGracePeriodSeconds: 65
  tolerations:
  - key: preemptible
    value: "true"
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      name: xo-ds-config
    name: xo-ds-config
  - name: kube-api-access-n95dr
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-01-04T04:02:05Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-01-04T04:02:07Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-01-04T04:02:07Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-01-04T04:02:05Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://5a0473a7366838d11cc2acce68b2c2865b80b9a595bc349c9d80ff59c9f5c591
    image: gcr.io/agones-images/agones-sdk:1.27.0
    imageID: docker-pullable://gcr.io/agones-images/agones-sdk@sha256:9e31ebde2abd1410a6e94dcd119b653070a162a27e8056601c5bbbb4f2b3e3e4
    lastState: {}
    name: agones-gameserver-sidecar
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-01-04T04:02:06Z"
  - containerID: docker://6b40d60782e000a35405845e48d2daf842c634eff8a90c47171b8d7a114fe50d
    image: cr.yandex/crph7uvg1chcap6rvt9g/xo-gameserver:2.2.10.231067
    imageID: docker-pullable://cr.yandex/crph7uvg1chcap6rvt9g/xo-gameserver@sha256:386ec73f9a85fcb483ee3a6b8c8e5f16f0366af9894bce2eb7c7ba01e637f38b
    lastState: {}
    name: gameserver
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-01-04T04:02:07Z"
  hostIP: 172.28.132.38
  message: Pod was terminated in response to imminent node shutdown.
  phase: Failed
  podIP: 10.112.227.19
  podIPs:
  - ip: 10.112.227.19
  qosClass: Burstable
  reason: Terminated
  startTime: "2023-01-04T04:02:05Z"
and the associated GameServer:
apiVersion: agones.dev/v1
kind: GameServer
metadata:
  annotations:
    agones.dev/last-allocated: "2023-01-04T04:02:17.689865428Z"
    agones.dev/ready-container-id: docker://6b40d60782e000a35405845e48d2daf842c634eff8a90c47171b8d7a114fe50d
    agones.dev/sdk-version: 1.27.0
  creationTimestamp: "2023-01-04T04:02:05Z"
  finalizers:
  - agones.dev
  generateName: live-pgwc9-
  generation: 7
  labels:
    agones.dev/fleet: live
    agones.dev/gameserverset: live-pgwc9
    version: 2.2.10.231067
  name: live-pgwc9-bjs2h
  namespace: live
  ownerReferences:
  - apiVersion: agones.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: GameServerSet
    name: live-pgwc9
    uid: 5947013d-9da6-44fd-bc65-e05c11901453
  resourceVersion: "82974474"
  uid: d5ef5d46-e4d1-4fb5-99ca-7b41c04b06ca
spec:
  container: gameserver
  health:
    failureThreshold: 3
    initialDelaySeconds: 10
    periodSeconds: 20
  ports:
  - container: gameserver
    containerPort: 35000
    hostPort: 35063
    name: default
    portPolicy: Dynamic
    protocol: UDP
  scheduling: Packed
  sdkServer:
    grpcPort: 9357
    httpPort: 9358
    logLevel: Info
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: DedicatedServer
    spec:
      containers:
      - command:
        - /app/DedicatedServer
        - --null-con
        - --enable-eac
        - --initial-nice
        - "10"
        - +exec
        - /var/xo-ds-config/config.cfg
        image: cr.yandex/crph7uvg1chcap6rvt9g/xo-gameserver:2.2.10.231067
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/bash
              - /var/xo-ds-config/preStop.sh
        name: gameserver
        resources:
          limits:
            cpu: "1"
            memory: 800Mi
          requests:
            cpu: 400m
            memory: 400Mi
        securityContext:
          capabilities:
            add:
            - SYS_NICE
        volumeMounts:
        - mountPath: /var/xo-ds-config
          name: xo-ds-config
      nodeSelector:
        yandex.cloud/preemptible: "true"
      serviceAccountName: xo-ds
      terminationGracePeriodSeconds: 65
      tolerations:
      - key: preemptible
        value: "true"
      volumes:
      - configMap:
          name: xo-ds-config
        name: xo-ds-config
status:
  address: 158.160.13.221
  nodeName: cl1drlo5quu5uo06e12s-onul
  players: null
  ports:
  - name: default
    port: 35063
  reservedUntil: null
  state: Allocated
The node is not torn down, merely rebooted. I'm guessing that removing it from the cluster completely could actually destroy the pod and "fix" everything, but I can't confirm that yet.
Maybe a silly question - but how is a node rebooted without deleting the Pods on it?
Maybe a silly question - but how is a node rebooted without deleting the Pods on it?
Given "Pod was terminated in response to imminent node shutdown.", I would guess using something like the reboot command?
No I mean, from the docs, it reads to me that Pods should all be shutdown as part of the reboot - so they should all go away on that node, taking the GameServers with them in the process.
So why is that not happening in this instance?
Maybe a silly question - but how is a node rebooted without deleting the Pods on it?
So why is that not happening in this instance?
I've assumed a few comments ago that it happens due to graceful shutdown: https://github.com/googleforgames/agones/issues/2683#issuecomment-1367438706
Assuming that reboot needs to happen respecting some deadline, pods that were in shutdown hooks have to be moved into that special state?
Here is the logic doing that (I think): https://github.com/kubernetes/kubernetes/blob/37e73b419e455db34f5fe3e8d815418680ab23df/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go#L377
I even found some related issue through looking into e2e test for the graceful shutdown feature: https://github.com/kubernetes/kubernetes/issues/108594
Assuming that reboot needs to happen respecting some deadline, pods that were in shutdown hooks have to be moved into that special state?
But doesn't the node eventually assume it can't gracefully shut everything down, do a force kill and delete all the Pods that way (and then reboot the node?)
I guess I'm leaning more towards "this is a bug in K8s" rather than an issue in Agones?
But doesn't the node eventually assume it can't gracefully shut everything down, do a force kill and delete all the Pods that way (and then reboot the node?)
It kinda does, but it moves those pods into that failed state instead. Probably to indicate that there was some issue with shutdown? IDK.
I guess I'm leaning more towards "this is a bug in K8s" rather than an issue in Agones?
Well, maybe? But even if it is, don't you think the GameServer should react somehow when the controlled pod ends up in some weird state indicating failure? There could be more logic/bugs in k8s that make it so, as we found out in this issue and related ones.
But doesn't the node eventually assume it can't gracefully shut everything down, do a force kill and delete all the Pods that way (and then reboot the node?)
It kinda does, but it moves those pods into that failed state instead. Probably to indicate that there was some issue with shutdown? IDK.
I guess I'm leaning more towards "this is a bug in K8s" rather than an issue in Agones?
Well, maybe? But even if it is, don't you think the GameServer should react somehow when the controlled pod ends up in some weird state indicating failure? There could be more logic/bugs in k8s that make it so, as we found out in this issue and related ones.
Oh I 100% hear you - but it can be super hard for us to actually know "hey, this is a really bad state that is unrecoverable" vs "this is a transitive state that will restart itself and then go away" - we do loads of hacks for this already because of how we respond to GameServers doing unhealthy things based on what state they are in (Ready etc).
So just to 100% check, did the node ever actually reboot, and if so, what happens to the Pod? Or did this state actually block the Node from restarting entirely?
So just to 100% check, did the node ever actually reboot, and if so, what happens to the Pod? Or did this state actually block the Node from restarting entirely?
Node reboots, pods stay in Failed state, GameServer keeps being Allocated.
No containers are alive afterwards (e.g. you can't kubectl exec ... or kubectl log ...).
We ended up rolling a job that reaps those pods eventually, but would be nice to not have to (:
Node reboots, pods stay in Failed state
Oh weird!
GameServer keeps being Allocated. No containers are alive afterwards (e.g. you can't kubectl exec ... or kubectl log ...).
Eep, yeah, that's bad. I noticed that the Pod doesn't get a deletionTimestamp either, which.... sucks.
We ended up rolling a job that reaps those pods eventually, but would be nice to not have to (:
Yeah, that's fair enough. What criteria are you using specifically?
I wonder how Deployments and StatefulSets manage this, and if there is something we can steal from there. I assume they don't have this issue?
Yeah, that's fair enough. What criteria are you using specifically?
We do no smart things and just look for Failed pods with our fleet labels, as no workload could correctly be in that state for us.
I wonder how Deployments and StatefulSets manage this, and if there is something we can steal from there. I assume they don't have this issue?
I'll try to do some experiments after vacation, but I'm afraid no revelation awaits there. Deployments rely on pod counts and readiness, so they will probably suffer from a similar issue, although there are some cases handled by https://github.com/kubernetes/kubernetes/blob/1d2e8042877c4facd3a45e911857f92474d64797/staging/src/k8s.io/api/extensions/v1beta1/types.go#L1000
StatefulSets even point out in the docs that sometimes human interaction is required to resolve the situation, so I'm not holding my breath there either.
Sod. Well, there go those good ideas.
Maybe another approach is better - something like: if the Pod has been in a Failed state for <health failureThreshold * periodSeconds>, we consider the whole thing defunct and then move it to Unhealthy.
Which makes a lot of sense really, since that's what the health check would likely be doing anyway.
How we work out that it's Failed, and for how long, is a different matter (event stream? maybe we track it ourselves?)
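As a sketch of that window, assuming we simply derive it from the GameServer's existing health spec (field names as in the Agones GameServer API; defunctAfter is a hypothetical helper):

package gameservers

import (
	"time"

	agonesv1 "agones.dev/agones/pkg/apis/agones/v1"
)

// defunctAfter returns the proposed grace window: tolerate a Failed Pod for
// failureThreshold * periodSeconds, the same leeway the regular health checks
// already give a game server, before moving the GameServer to Unhealthy.
func defunctAfter(gs *agonesv1.GameServer) time.Duration {
	h := gs.Spec.Health
	return time.Duration(h.FailureThreshold) * time.Duration(h.PeriodSeconds) * time.Second
}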
It's not an immediate move to Unhealthy, but it does allow the system to eventually self heal. WDYT?
it's not an immediate move to Unhealthy, but it does allow the system to eventually self heal. WDYT?
I like it! As the k8s authors themselves praise level-triggering over edges?
And Agones seems to echo those ways by requiring regular health pushes to the sidecar. So I would expect the GameServer to become unhealthy within some timeframe after those pushes begin to fail, be it due to an actual game server crash or k8s deficiencies!
Excellent! Now we just need to work out how to track this - but I think we can do that (oh yeah, and implement it).
You mean track liveliness through heartbeats? I would have designed it super straightforwardly - the sidecar regularly pings/touches the GameServer and stores the last ping timestamp.
The issue here is that it creates noticeable write load on the k8s object storage, as the GameServer changes frequently even while nothing happens? We can move that load onto the Agones controller by making it the recipient of said pings and tracking when they've stopped.
Another way to deal with it is just a periodic query for the pods belonging to controlled fleets from Agones, judging their liveliness by that dreaded Failed state. Which seems a bit more "special-case" than a general solution.
You mean track liveliness through heartbeats? I would have designed it super straightforwardly - the sidecar regularly pings/touches the GameServer and stores the last ping timestamp.
Ah - but there is no sidecar, because the Pod has failed; we can't guarantee the sidecar will be there.
I think we likely need to do this in the HealthController or create a new specific sub-controller for this.
Probably track pod updates, look for the failed state, and then cache it somewhere to be looked at again in <health failureThreshold * periodSeconds> to see if it's still the same.
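Roughly, such a sub-controller could hang off a Pod informer and use a delaying workqueue as the "look at it again later" cache. A sketch, with all names (failedPodController, moveToUnhealthy) illustrative rather than actual Agones internals:

package gameservers

import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	k8serrors "k8s.io/apimachinery/pkg/api/errors"
	corelisters "k8s.io/client-go/listers/core/v1"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

// failedPodController watches Pods; when one turns Failed it re-checks after a
// grace period and, if the Pod is still Failed, moves the owning GameServer to
// Unhealthy.
type failedPodController struct {
	podLister   corelisters.PodLister
	queue       workqueue.RateLimitingInterface
	gracePeriod time.Duration // e.g. failureThreshold * periodSeconds
}

// onPodUpdate would be registered as the UpdateFunc of a Pod informer.
func (c *failedPodController) onPodUpdate(_, newObj interface{}) {
	pod, ok := newObj.(*corev1.Pod)
	if !ok {
		return
	}
	if pod.Status.Phase == corev1.PodFailed {
		// AddAfter is the "cache it somewhere to be looked at again" part.
		c.queue.AddAfter(pod.Namespace+"/"+pod.Name, c.gracePeriod)
	}
}

// sync is called by the queue workers once the grace period has elapsed.
func (c *failedPodController) sync(ctx context.Context, key string) error {
	namespace, name, err := cache.SplitMetaNamespaceKey(key)
	if err != nil {
		return err
	}
	pod, err := c.podLister.Pods(namespace).Get(name)
	if k8serrors.IsNotFound(err) {
		return nil // the Pod was deleted in the meantime; nothing to do
	}
	if err != nil {
		return err
	}
	if pod.Status.Phase != corev1.PodFailed {
		return nil // it recovered; leave the GameServer alone
	}
	return c.moveToUnhealthy(ctx, pod)
}

// moveToUnhealthy is stubbed here; a real version would look up the owning
// GameServer via the agones.dev/gameserver label and set its state to Unhealthy.
func (c *failedPodController) moveToUnhealthy(ctx context.Context, pod *corev1.Pod) error {
	return nil
}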
@markmandel Can you tell me if it is possible to send the status to some matchmaking service when the game server crashes?
@markmandel Can you tell me if it is possible to send the status to some matchmaking service when the game server crashes?
You would need to check through the k8s API: https://agones.dev/site/docs/guides/access-api/
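A minimal sketch of that, assuming the Agones clientset shown in that guide: watch for GameServers entering Unhealthy, with the matchmaker notification left as a placeholder for your own service call:

package main

import (
	"context"
	"fmt"

	agonesv1 "agones.dev/agones/pkg/apis/agones/v1"
	"agones.dev/agones/pkg/client/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Out-of-cluster config for brevity; in-cluster config works the same way.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	agonesClient, err := versioned.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	watcher, err := agonesClient.AgonesV1().GameServers("default").Watch(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for event := range watcher.ResultChan() {
		gs, ok := event.Object.(*agonesv1.GameServer)
		if !ok {
			continue
		}
		if gs.Status.State == agonesv1.GameServerStateUnhealthy {
			// Replace this with a call to your matchmaking service.
			fmt.Printf("GameServer %s is Unhealthy\n", gs.Name)
		}
	}
}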
What happened:
Agones didn't create a new Pod when a Pod failed with reason OutOfpods, and the GameServer was stuck in the Scheduled state.
What you expected to happen:
GameServer is expected to create a new Pod if a Pod fails with reason OutOfpods.
How to reproduce it (as minimally and precisely as possible):
1. Deploy a static pod manifest to /etc/kubernetes/manifests/static-pod.manifest on the testing node.
2. kubectl delete pod --force --grace-period=0 <static-pod-name> -n kube-system
3. All gameserver pods stuck in state Pending become failed with reason OutOfpods.
Anything else we need to know?:
Here is the Pod status that I reproduced.
I created the Fleet from the official documentation.
Environment:
Kubernetes version (use kubectl version): Client Version: v1.21.0, Server Version: v1.21.12-gke.1500