googleforgames / agones

Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
https://agones.dev
Apache License 2.0
6.04k stars 800 forks source link

Beta: UpdateList using Agones 1.41 rest API doesn't update the data in the selected list #3870

Closed Tsm012 closed 2 weeks ago

Tsm012 commented 3 months ago

What happened: Attempting to use the Lists features in Agones 1.41. I have a fleet deployed in agones and am attempting to use the:

curl -d '{"capacity": "120", "values": ["player3", "player4"]}' -H "Content-Type: application/json" -X PATCH http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

call as mentioned in the rest documentation. The list is declared on our "kind: Fleet" resource and declares a list players

lists:
  players:
    capacity: 20
    values:
    - player1
    - player2
    - player3

Once the fleet is deployed, I can run the command:

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

and will get the correct response.

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

When I go to update that same list using the command:

curl -d '{"capacity": "20", "values": ["player3", "player4"]}' -H "Content-Type: application/json" -X PATCH http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

I get the response:

{"name":"","capacity":"0","values":[]}

I then check again to see if my list was updated:

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

but the reply seems to indicate nothing has changed.

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

What you expected to happen:

I expected the list up update to reflect the data payload of

{"capacity": "20", "values": ["player3", "player4"]}

How to reproduce it (as minimally and precisely as possible):

Create a Fleet resource in agones and deploy it with a list declared as:

lists:
  players:
    capacity: 20
    values:
    - player1
    - player2
    - player3

shell into a game server in the fleet just deployed and verify that your initial values are in the list

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

Attempt to update the list with new values

curl -d '{"capacity": "20", "values": ["player3", "player4"]}' -H "Content-Type: application/json" -X PATCH http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

after running the update list call, confirm that the list has not changed

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

Anything else we need to know?:

Environment:

markmandel commented 3 months ago

Can you share the logs from the sdk-server sidecar?

If you have a gameserver or fleet.yaml you could share with a setup that easy to replicate this, would also be handy.

markmandel commented 3 months ago

FYI: @igooch just to make sure you see this.

Tsm012 commented 3 months ago

Here is a simplified version of a fleet resource that you can deploy. This is happening in both AWS EKS and locally in minikube. If you shell into the alpine container, you can run the curl commands to replicate.

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-game-server-fleet
spec:
  replicas: 1
  template:
    spec:
      sdkServer:
        logLevel: Debug
      lists:
        players:
          capacity: 3
          values:
          - player1
          - player2
          - player3
      container: simple-game-server
      template:
        spec:
          containers:
            - name: simple-game-server
              image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.31
            - name: alpine   
              image: alpine/curl 
              command: ["sleep"]         # Override the container entrypoint with a simple sleep command
              args: ["1000000"]  

here is the tail from the agones sidecar, setting the log info to debug doesn't seem to add any additional info even though doing a describe on the pod seems to say the LOG_LEVEL environment variable is set to Debug

{"error":"not a valid logrus Level: \"\"","message":"Invalid LOG_LEVEL value. Defaulting to 'info'.","severity":"warning","time":"2024-06-13T23:37:24.349150707Z"}

{"ctlConf":{"GameServerName":"simple-game-server-fleet-kb5lx-lwd84","PodNamespace":"default","Address":"localhost","IsLocal":false,"LocalFile":"","Delay":0,"Timeout":0,"Test":"","TestSdkName":"","KubeConfig":"","GracefulTermination":true,"GRPCPort":9357,"HTTPPort":9358,"LogLevel":""},"featureGates":"AutopilotPassthroughPort=false\u0026CountsAndLists=true\u0026DisableResyncOnSDKServer=true\u0026Example=false\u0026GKEAutopilotExtendedDurationPods=false\u0026PlayerAllocationFilter=false\u0026PlayerTracking=false\u0026PortPolicyNone=false\u0026PortRanges=false\u0026RollingUpdateFix=false","message":"Starting sdk sidecar","severity":"info","source":"main","time":"2024-06-13T23:37:24.349276227Z","version":"1.41.0"}

{"gsKey":"default/simple-game-server-fleet-kb5lx-lwd84","message":"Created GameServer sidecar","severity":"info","source":"*sdkserver.SDKServer","time":"2024-06-13T23:37:24.350165213Z"}

{"gsKey":"default/simple-game-server-fleet-kb5lx-lwd84","message":"Connection to Kubernetes service established","severity":"info","source":"*sdkserver.SDKServer","time":"2024-06-13T23:37:24.359130873Z","try":0}             

{"grpcEndpoint":"localhost:9357","message":"Starting SDKServer grpc service...","severity":"info","source":"main","time":"2024-06-13T23:37:24.359724909Z"}

{"gsKey":"default/simple-game-server-fleet-kb5lx-lwd84","message":"Starting workers...","queue":"agones.dev.default.simple-game-server-fleet-kb5lx-lwd84","severity":"info","source":"*sdkserver.SDKServer","time":"2024-06-13T23:37:24.460172777Z","workers":1}

{"httpEndpoint":"localhost:9358","message":"Starting SDKServer grpc-gateway...","severity":"info","source":"main","time":"2024-06-13T23:37:25.361356656Z"}
igooch commented 3 months ago

Looks like I implemented the UpdateList to only update the capacity, since all the other SDKs only use it for changing the capacity. The List values can be changed by addValue or removeValue. @Tsm012 do the addValue and removeValue work for your use case, or do you need to overwrite the value list?

@markmandel we should update the documentation to reflect that as-is it only updates the capacity.

https://github.com/googleforgames/agones/blob/6bc20a138c30a7c16f5c88e847c5487b6506d674/pkg/sdkserver/sdkserver.go#L1056-L1094

Tsm012 commented 3 months ago

We could make it work using the add/remove value calls. The source of truth for this data is on the gameserver so one call to update all the data in the list based on data from the game server would be really nice.

markmandel commented 3 months ago

We could make it work using the add/remove value calls. The source of truth for this data is on the gameserver so one call to update all the data in the list based on data from the game server would be really nice.

I'm curious your use case for overwriting a total list? Usually we see people wanting to add/remove, but not overwrite. Do you have one in mind.

@igooch - 100% agree the docs need to be updated.

Tsm012 commented 3 months ago

Our game server maintains a list of data needed to connect to it that is used by our match maker. This data is constantly changing as players join and leave different servers and changes in the data can happen very frequently. This list of data is always authoritative. Instead of having to constantly check what has changed in the list maintained by the server and what was the last state of the list in Agones and adding and removing entries one at a time using multiple rest calls, It would be nice to be able to take a snapshot of what the current authoritative list is on the server and replace the list in Agones with one call.

markmandel commented 3 months ago

Our game server maintains a list of data needed to connect to it that is used by our match maker.

More for curiosities sake - is the matchmaker communicating this list somehow? or is the game server process maintaining this list within itself? or something else?

Tsm012 commented 3 months ago

is the matchmaker communicating this list somehow?

Yes, our match maker is using the agones sdk to pull list information for game servers to use in the match making process.

markmandel commented 3 months ago

Yes, our match maker is using the agones sdk

... okay now I'm really curious. The Agones SDK can only talk to a single GameServer - unless your match maker talks to the game server binary (somehow?) and the server does the work against it's own GameServer record through the SDK?

igooch commented 3 months ago

@markmandel I took a look at the docs, and they were tested against the LocalSDKServer which does overwrite both the capacity and the values. I'll make a separate issue for the documentation updates. (An aside that UpdateCounter is working as expected, although it also returns an empty object.)

Do we want to update the SDK Server to also overwrite the values?

appleturnoverload commented 3 months ago

Yes, our match maker is using the agones sdk

... okay now I'm really curious. The Agones SDK can only talk to a single GameServer - unless your match maker talks to the game server binary (somehow?) and the server does the work against it's own GameServer record through the SDK?

Just to clarify this, we are currently using the k8s API here https://agones.dev/site/docs/guides/access-api/ to get all the info from the game servers in one go rather than using the SDK itself to get data about individual game servers.

So the way it's working is that the server is responsible for maintaining some of the lists, which the matchmaking system reads from using the k8s API to make informed decisions (things like current players, player capacity). Then on allocate, the matchmaking system will add some values to another list to record the fact that players are expecting to join the game server and the server will receive this change by listening to changes through the SDK.

We are expecting to have a slightly less heavy handed way of getting server state into the matchmaking system by streaming changes through the k8s API rather than just getting all the game servers in bulk.

markmandel commented 3 months ago

Sounds like this is actually a K8s API thing you are doing, and not part of the SDK -- so we can close this issue now I believe.

chrisfoster121 commented 2 months ago

Hello, I have been following this and would like to try to update the UpdateList SDK code to support changing list values. I have some changes locally but am unable to follow the Minikube testing procedures. Specifically, I am getting stuck during the make minikube-shell step when running kubectl get pods. I am seeing the following output:

root@docker-desktop:/go/src/agones.dev/agones# kubectl get pods
E0705 23:20:59.463501      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.463762      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.465400      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.465761      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.467207      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
The connection to the server 127.0.0.1:52935 was refused - did you specify the right host or port?
root@docker-desktop:/go/src/agones.dev/agones#

It seems as though the docker container is starting in the Docker Desktop WSL distro that was created when installing docker desktop. Is this the expected behavior? Additionally, I noticed that I was not immediately switched to agones context when running make minikube-test-cluster.

My development environment is:

Should this be a separate issue? Am I missing something here?

igooch commented 2 months ago

@chrisfoster121 could you repost on the Slack channel to see if anyone there has run into this issue that could help?

chrisfoster121 commented 2 months ago

@igooch That link looks to have expired. It says "This link is no longer active". Additionally, I have been able to work around this by doing all of my work in a Ubuntu VM on Hyper-V. How would I be able to get my SDK code running as the Agones sidecar? I have been able to push and install the controller, extensions, and ping images but that does not seem to impact the sidecar that gets run.

chrisfoster121 commented 2 months ago

@Tsm012 @markmandel @igooch I believe I have a fix for this that I have tested. I am going to look at cleaning this up and see about getting this in (following the contribution guidelines). It would be incredible if we could get this into the next stable release coming next Tuesday.

igooch commented 2 months ago

Ah so it is, updating the link in #3896.

Assuming you have the Agones repo cloned locally and are making changes there, go into the agones/build directory and run make gen-sdk-grpc SDK_FOLDER=restapi or make gen-all-sdk-grpc. From there the running-a-test-minikube-cluster instructions on building, pushing and installing for minikube should work, although noted in that same readme it hasn't been testing on minikube on Windows.