googleforgames / agones

Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
https://agones.dev
Apache License 2.0

Add GameServer state duration metric #1013

Closed (cyriltovena closed this issue 4 years ago)

cyriltovena commented 5 years ago

Is your feature request related to a problem? Please describe.

Knowing how long a GameServer stays in a specific state can be very useful when troubleshooting Agones performance. It would let users see how long a GameServer stays unallocated, or how long it takes to become ready. We could also tell whether game sessions are running correctly based on the average session time.

Describe the solution you'd like

We could watch for state changes via the Kubernetes Event API and, on each change, record how long the GameServer spent in the previous state.

Describe alternatives you've considered

We could also simply add a new timestamp property to the GameServer CRD and record the duration when the state changes.

This might rely on the watch API of the GameServer CRD and Events. We should verify how using the watch API affects the measured duration.

Additional context

https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#-strong-read-operations-strong--289

aLekSer commented 4 years ago

Relates to (duplicates) https://github.com/googleforgames/agones/issues/831

aLekSer commented 4 years ago

I have a working solution that does not add timestamps to the GS Status; it injects code only on State change. We can keep track of each GameServer's last state change timestamp in a map or similar structure (a retention period or LRU cache could be added) and record the previous state's duration metric at that moment. The branch with the solution is at https://github.com/aLekSer/agones/tree/metrics/state-duration and still needs a proper Grafana chart added to the Dashboard configs.

aLekSer commented 4 years ago

@markmandel Do you think we could close this issue as well?

markmandel commented 4 years ago

I think so! If there are new requirements, let's open a new issue. :raised_hands: