googleforgames / agones

Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
https://agones.dev
Apache License 2.0

Mutation/Validation Webhooks failing when running Agones in a cluster with etcd v3.2.24 #949

Closed lftakakura closed 5 years ago

lftakakura commented 5 years ago

What happened: Agones resources (fleet, gameserverset, fleetautoscaler) are not working properly. Fleet state does not update, whether through the autoscaler or manually, because of timeouts between the kube-apiserver and the agones-controller.

What you expected to happen: Gameservers should be created when, for example, the fleetautoscaler buffer size is increased. Editing a resource should not get stuck.

How to reproduce it (as minimally and precisely as possible): Install Agones with Kubernetes v1.11.10 and etcd v3.2.24. Create a gameserverset or fleet resource. Try incrementing the number of replicas of this resource a few times (it works the first time, but not on later attempts).
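The repeated replica increments described above could be scripted roughly as follows. This is only a sketch: the fleet name `simple-udp` and the replica counts are hypothetical, and it assumes the fleet can be scaled with `kubectl scale`. The script only prints the commands; it does not contact a cluster.

```python
import shlex


def scale_commands(fleet_name, replica_counts):
    """Return one `kubectl scale` argv list per requested replica count."""
    return [
        ["kubectl", "scale", "fleet", fleet_name, f"--replicas={n}"]
        for n in replica_counts
    ]


if __name__ == "__main__":
    # "simple-udp" and the counts below are placeholders, not values from the report.
    for argv in scale_commands("simple-udp", [2, 3, 4]):
        print(shlex.join(argv))
```

On an affected cluster, each printed command would be run in turn; per the report, the first one is expected to succeed and the later ones to hang.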

Anything else we need to know?:

Environment:

kube-apiserver: "GuaranteedUpdate etcd3: *unstructured.Unstructured" (started: 2019-07-25 18:06:30.662774469 +0000 UTC m=+179134.367163462) (total time: 30.001933868s)
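To check whether other kube-apiserver trace lines show the same pattern, a small parser like the one below could pull out the total time and compare it with a threshold. This is a sketch only: the 30-second threshold is an assumption based on the total time in the line above, the helper names are made up for illustration, and the regular expression handles only durations printed in `s` or `ms`.

```python
import re

# The trace line quoted in the report above.
TRACE_LINE = (
    'kube-apiserver: "GuaranteedUpdate etcd3: *unstructured.Unstructured" '
    "(started: 2019-07-25 18:06:30.662774469 +0000 UTC m=+179134.367163462) "
    "(total time: 30.001933868s)"
)

# Matches "(total time: 30.001933868s)" or "(total time: 12.5ms)".
TOTAL_TIME = re.compile(r"\(total time: ([0-9.]+)(ms|s)\)")


def total_seconds(line):
    """Return the trace's total time in seconds, or None if it is absent."""
    match = TOTAL_TIME.search(line)
    if match is None:
        return None
    value = float(match.group(1))
    return value / 1000.0 if match.group(2) == "ms" else value


def hit_timeout(line, timeout_s=30.0):
    """True if the trace ran at least as long as the assumed timeout."""
    seconds = total_seconds(line)
    return seconds is not None and seconds >= timeout_s


if __name__ == "__main__":
    print(total_seconds(TRACE_LINE), hit_timeout(TRACE_LINE))
```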

markmandel commented 5 years ago

Is this an Agones bug or a Kubernetes bug?

What are the logs from the controller?

markmandel commented 5 years ago

Digging into this more, this seems like a K8s/etcd issue, rather than an Agones issue.

Looking at: https://v1-11.docs.kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#notes-for-etcd-version-2-2-1

I see:

As of Kubernetes version 1.5.1, we are still using etcd from the 2.2.1 release with the v2 API. Also, we have no pre-existing process for updating etcd, as we have never updated etcd by either minor or major version.

Looks like there is an open bug to do a more comprehensive compatibility matrix, but it is not yet complete.

So I'm going to mark this as wontfix for now, and will close it in a few days -- unless there are strong objections.

I would still recommend reporting this to the Kubernetes team, though I wonder if they will say that's not a supported version of etcd 🤷‍♂

markmandel commented 5 years ago

Looks like no strong objections. Closing the issue. Please let us know if there are changes we can make on the Agones side to make this easier 👍