googleforgames / agones

Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
https://agones.dev
Apache License 2.0

Invalid warnings when using multi-cluster allocation #2498

Closed tmokmss closed 1 year ago

tmokmss commented 2 years ago

What happened: When I use the multi-cluster allocation feature in the configuration below, where the Router cluster only dispatches allocation requests to DGS clusters 1 and 2, I get invalid warnings from the agones-allocator pod on the Router cluster. A log entry of this kind appears on every allocation.

Router ┬─> DGS cluster 1
       └─> DGS cluster 2

The printed logs are below. A warning such as gameserver.agones.dev \"dgs-fleet-nf97g-pszwg\" not found appears on every allocation.


{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-nf97g-fn67k","ports":[{"name":"default","port":7605}],"address":"reducted.compute.amazonaws.com","nodeName":"ip-10-0-5-242.ap-northeast-1.compute.internal"},"severity":"info","source":"main","time":"2022-02-28T09:31:05.758738163Z"}
{"message":"allocation request received.","request":{"namespace":"default","multiClusterSetting":{"enabled":true}},"severity":"info","source":"main","time":"2022-02-28T09:31:11.705316761Z"}
{"endpoint":"reducted.elb.ap-northeast-1.amazonaws.com:443","gsaKey":"remote-allocation","message":"forwarding allocation request","request":{"namespace":"default","multiClusterSetting":{"policySelector":{}},"requiredGameServerSelector":{},"metaPatch":{},"metadata":{},"gameServerSelectors":[{}]},"severity":"debug","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:11.706135831Z"}
{"error":"gameserver.agones.dev \"dgs-fleet-nf97g-pszwg\" not found","message":"failed to get gameserver:dgs-fleet-nf97g-pszwg namespace:","severity":"warning","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:11.771535492Z"}
{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-nf97g-pszwg","ports":[{"name":"default","port":7953}],"address":"reducted.compute.amazonaws.com","nodeName":"ip-10-0-5-242.ap-northeast-1.compute.internal"},"severity":"info","source":"main","time":"2022-02-28T09:31:11.7715759Z"}
{"message":"allocation request received.","request":{"namespace":"default","multiClusterSetting":{"enabled":true}},"severity":"info","source":"main","time":"2022-02-28T09:31:17.710855882Z"}
{"endpoint":"reducted.elb.ap-northeast-1.amazonaws.com:443","gsaKey":"remote-allocation","message":"forwarding allocation request","request":{"namespace":"default","multiClusterSetting":{"policySelector":{}},"requiredGameServerSelector":{},"metaPatch":{},"metadata":{},"gameServerSelectors":[{}]},"severity":"debug","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:17.711196102Z"}
{"error":"gameserver.agones.dev \"dgs-fleet-wpdmh-jmn27\" not found","message":"failed to get gameserver:dgs-fleet-wpdmh-jmn27 namespace:","severity":"warning","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:17.77880276Z"}
{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-wpdmh-jmn27","ports":[{"name":"default","port":7056}],"address":"reducted.compute.amazonaws.com","nodeName":"ip-10-0-88-178.ap-northeast-1.compute.internal"},"severity":"info","source":"main","time":"2022-02-28T09:31:17.779472706Z"}
{"message":"allocation request received.","request":{"namespace":"default","multiClusterSetting":{"enabled":true}},"severity":"info","source":"main","time":"2022-02-28T09:31:28.705297459Z"}
{"endpoint":"reducted.elb.ap-northeast-1.amazonaws.com:443","gsaKey":"remote-allocation","message":"forwarding allocation request","request":{"namespace":"default","multiClusterSetting":{"policySelector":{}},"requiredGameServerSelector":{},"metaPatch":{},"metadata":{},"gameServerSelectors":[{}]},"severity":"debug","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:28.705618768Z"}
{"error":"gameserver.agones.dev \"dgs-fleet-wpdmh-7f6z8\" not found","message":"failed to get gameserver:dgs-fleet-wpdmh-7f6z8 namespace:","severity":"warning","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:28.797657118Z"}

It seems the Router cluster is trying to get information about game servers that live on the DGS clusters, which results in the not found errors.
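One way to check this reading of the logs is to correlate the warnings with the allocation responses. The following is a minimal sketch, not part of Agones: the `logEntry` type and the `benignWarnings` function are hypothetical helpers, and the sample lines are trimmed copies of the entries above. It reports every game server name that triggered a "failed to get gameserver" warning and was also returned in a successful allocation response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds only the fields of an allocator log line that this sketch
// needs. It is a hypothetical type, not an Agones type.
type logEntry struct {
	Severity string `json:"severity"`
	Message  string `json:"message"`
	Response *struct {
		GameServerName string `json:"gameServerName"`
	} `json:"response"`
}

// warnPrefix is the start of the warning message seen in the logs above.
const warnPrefix = "failed to get gameserver:"

// benignWarnings returns, in log order, the game server names that were
// warned about and that also appear in an allocation response.
func benignWarnings(lines []string) []string {
	var warned []string
	allocated := map[string]bool{}
	for _, line := range lines {
		var e logEntry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			continue // skip lines that are not JSON
		}
		if e.Severity == "warning" && strings.HasPrefix(e.Message, warnPrefix) {
			// The message has the form "<prefix><name> namespace:<ns>".
			name, _, _ := strings.Cut(strings.TrimPrefix(e.Message, warnPrefix), " ")
			warned = append(warned, name)
		}
		if e.Response != nil && e.Response.GameServerName != "" {
			allocated[e.Response.GameServerName] = true
		}
	}
	var benign []string
	for _, name := range warned {
		if allocated[name] {
			benign = append(benign, name)
		}
	}
	return benign
}

func main() {
	// Trimmed copies of two entries from the logs above.
	lines := []string{
		`{"error":"gameserver.agones.dev \"dgs-fleet-nf97g-pszwg\" not found","message":"failed to get gameserver:dgs-fleet-nf97g-pszwg namespace:","severity":"warning"}`,
		`{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-nf97g-pszwg"},"severity":"info"}`,
	}
	fmt.Println(benignWarnings(lines))
}
```

If every warned name shows up in the result, the warnings all belong to allocations that succeeded on a remote cluster.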

What you expected to happen: No warnings shown.

How to reproduce it (as minimally and precisely as possible): I don't know specifically how to reproduce this. I'm just using multi-cluster allocation as described in the docs, and the warnings are always printed.

Anything else we need to know?:

Environment:

roberthbailey commented 2 years ago

As an aside, your solution looks similar to the example @pooneh-m just sent a PR for (https://github.com/googleforgames/agones/pull/2499) except that you are using an Agones cluster w/ MCA for routing the requests instead of using cloud run.

roberthbailey commented 2 years ago

That warning is coming from this line. It looks like after a successful allocation, the allocator service is trying to look up the game server in the local cluster and failing to find it (because it doesn't exist in the local cluster).

It feels to me like this lookup should be skipped if we know that the allocation came from a remote cluster. The function applyMultiClusterAllocation doesn't currently return whether the allocation was done locally or not, so the call site that invokes setResponse for the metrics can't tell whether the game server should exist in the local cluster.

The good news is that this warning is benign. But it will take a bit of refactoring in the code to be able to skip the lookup when it shouldn't be done.