googleforgames / agones

Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
https://agones.dev
Apache License 2.0

In-place Agones Upgrades: Testing #3795

Open zmerlynn opened 5 months ago

zmerlynn commented 5 months ago

[!NOTE] Milestone of #3766, which we are seeking feedback on. We will move forward with pieces that seem non-contentious, though.

Testing

To vet In-place Agones Upgrades, we need a combination of unit tests (which are assumed in all PRs) and end-to-end (e2e) tests. The problem with our current e2e tests is that they assume a particular configuration and test against it. That's good for what they're testing, but we need a different style of test for upgrades.

From past experience, the best way I have seen to test upgrades is to keep the system doing real work and upgrade it in place. I propose a system where we keep a cluster under fairly active load and mutate its configuration continuously. As a starting point, imagine three subsystems: a Wanderer, a Producer, and a Monitor.

If we do this right, we could even set this up in a couple of different modes: one with a fast wanderer (e.g. 30 minutes) to make sure we cover as much of the configuration space as possible, and one with a slow wanderer (e.g. a day or a week) to soak test more. The Producer and Monitor for each will look rather similar; only the rate of change will differ.
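To make the fast/slow idea concrete, here is a minimal sketch in Go, not a proposed implementation. Everything in it is an assumption for illustration: the `Config` shape (a version string plus feature gates), the `Space`, `Step`, and `Wander` names, and the placeholder version and gate names. None of these are Agones APIs. The point it shows is that both modes can share one loop and differ only in the `interval` argument.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Config is a hypothetical point in configuration space. The fields are
// illustrative and not taken from Agones.
type Config struct {
	Version      string
	FeatureGates map[string]bool
}

// Space lists the hypothetical values the wanderer may move between.
type Space struct {
	Versions []string
	Gates    []string
}

// Step returns a copy of cur with at most one dimension changed: either the
// version is re-drawn from s.Versions or a single feature gate is flipped.
func (s Space) Step(r *rand.Rand, cur Config) Config {
	next := Config{Version: cur.Version, FeatureGates: make(map[string]bool, len(cur.FeatureGates))}
	for g, on := range cur.FeatureGates {
		next.FeatureGates[g] = on
	}
	changeVersion := len(s.Gates) == 0 || r.Intn(2) == 0
	if changeVersion && len(s.Versions) > 0 {
		next.Version = s.Versions[r.Intn(len(s.Versions))]
	} else if len(s.Gates) > 0 {
		g := s.Gates[r.Intn(len(s.Gates))]
		next.FeatureGates[g] = !next.FeatureGates[g]
	}
	return next
}

// Wander takes `steps` random steps from start, calling apply for each new
// configuration and sleeping for interval between steps. It stops at the
// first apply error and returns the configuration in effect at that point.
func Wander(seed int64, s Space, start Config, interval time.Duration, steps int, apply func(Config) error) (Config, error) {
	r := rand.New(rand.NewSource(seed))
	cur := start
	for i := 0; i < steps; i++ {
		next := s.Step(r, cur)
		if err := apply(next); err != nil {
			return cur, fmt.Errorf("step %d: %w", i, err)
		}
		cur = next
		time.Sleep(interval)
	}
	return cur, nil
}

func main() {
	space := Space{
		Versions: []string{"version-a", "version-b"}, // placeholder names
		Gates:    []string{"GateX", "GateY"},         // placeholder names
	}
	start := Config{Version: "version-a", FeatureGates: map[string]bool{}}

	// Fast mode uses a short interval; a slow soak run would pass e.g. 24*time.Hour.
	last, err := Wander(1, space, start, 10*time.Millisecond, 5, func(c Config) error {
		fmt.Printf("applying %+v\n", c)
		return nil
	})
	if err != nil {
		fmt.Println("wander failed:", err)
		return
	}
	fmt.Printf("final configuration: %+v\n", last)
}
```

Passing a fixed seed keeps a run reproducible, which may help when a particular walk through the configuration space exposes a failure. In a real setup the `apply` callback would perform the actual upgrade or reconfiguration; here it only prints.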

1804devs commented 5 months ago

Some guidance would be helpful, but I've started coding the Wanderer component. I'm not sure if I'm heading in the right direction.

1804devs commented 5 months ago

    // Wanderer component
    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "log"
        "net/http"
        "time"
    )

    // Config describes one point in configuration space.
    type Config struct {
        // TODO: define configuration fields.
    }

    func main() {
        // Placeholder Agones API endpoint.
        agonesEndpoint := "http://agones-api.example.com/config"

        // Trigger one configuration change per interval.
        for {
            if err := applyConfig(agonesEndpoint, generateRandomConfig()); err != nil {
                log.Println("Error applying configuration:", err)
            }
            // Sleep before the next configuration change, even after a
            // failure, so that errors do not turn this into a busy loop.
            time.Sleep(30 * time.Minute)
        }
    }

    // applyConfig marshals config to JSON and POSTs it to endpoint.
    func applyConfig(endpoint string, config Config) error {
        configJSON, err := json.Marshal(config)
        if err != nil {
            return fmt.Errorf("marshalling configuration: %w", err)
        }
        resp, err := http.Post(endpoint, "application/json", bytes.NewReader(configJSON))
        if err != nil {
            return fmt.Errorf("sending configuration request: %w", err)
        }
        defer resp.Body.Close()
        if resp.StatusCode >= 300 {
            return fmt.Errorf("configuration request failed: %s", resp.Status)
        }
        log.Printf("Configuration change successful: %+v", config)
        return nil
    }

    // generateRandomConfig returns a random configuration change.
    func generateRandomConfig() Config {
        // TODO: implement random configuration generation.
        return Config{}
    }

    // Producer component
    package main

    import (
        "context"
        "log"
        "time"

        "agones.dev/agones/pkg/client/clientset/versioned"
        "k8s.io/client-go/rest"
    )

    func main() {
        // Build an Agones clientset. This assumes the Producer runs inside
        // the cluster it is exercising.
        cfg, err := rest.InClusterConfig()
        if err != nil {
            log.Fatal("Error loading in-cluster config: ", err)
        }
        client, err := versioned.NewForConfig(cfg)
        if err != nil {
            log.Fatal("Error initializing Agones client: ", err)
        }
        ctx := context.Background()

        // Continuously scale Fleets and allocate GameServers.
        for {
            if err := scaleFleets(ctx, client); err != nil {
                log.Println("Error scaling Fleets:", err)
            }
            if err := allocateGameServers(ctx, client); err != nil {
                log.Println("Error allocating GameServers:", err)
            }
            // Sleep before the next round of actions.
            time.Sleep(1 * time.Minute)
        }
    }

    // scaleFleets scales Fleets up or down.
    func scaleFleets(ctx context.Context, client versioned.Interface) error {
        // TODO: implement Fleet scaling.
        return nil
    }

    // allocateGameServers allocates GameServers from the Fleets.
    func allocateGameServers(ctx context.Context, client versioned.Interface) error {
        // TODO: implement GameServer allocation.
        return nil
    }

    // Monitor component
    package main

    import (
        "log"
        "net/http"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    func main() {
        // Register Prometheus metrics.
        requestDuration := prometheus.NewSummaryVec(
            prometheus.SummaryOpts{
                Name: "http_request_duration_seconds",
                Help: "HTTP request duration in seconds.",
            },
            []string{"handler", "method"},
        )
        prometheus.MustRegister(requestDuration)

        // Define HTTP handler to expose metrics.
        http.Handle("/metrics", promhttp.Handler())

        // Start HTTP server to expose metrics; ListenAndServe only returns on error.
        log.Fatal(http.ListenAndServe(":8080", nil))
    }