googleforgames / agones

Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
https://agones.dev
Apache License 2.0

Failed installation from helm on 1.34 #3357

Closed nonokangwei closed 1 year ago

nonokangwei commented 1 year ago

What happened:

{"error":"Could not listen on :8081: open /home/agones/certs/server.crt: no such file or directory","message":"could not start runner: *https.Server","severity":"fatal","source":"main","time":"2023-08-31T04:17:28.672025994Z"}

What you expected to happen: The controller and extensions start successfully after a Helm installation.

How to reproduce it (as minimally and precisely as possible): Install using the latest Agones Helm chart, 1.34 (agones-1.34.0).

Anything else we need to know?: n/a

Environment:

ashutosji commented 1 year ago

Hi @nonokangwei, correct me if I am wrong, but are you using a custom certificate, as described at https://agones.dev/site/docs/installation/install-agones/helm/#cert-manager ? This error generally occurs when the volume for the certificate cannot be mounted. Could you go through the documentation again? It should work. A few quick questions: Is this a fresh installation? How are you setting the values, through the command line or a values.yaml file? You can also refer to this issue: https://github.com/googleforgames/agones/issues/3323

nonokangwei commented 1 year ago

@ashutosji This is a fresh installation using Helm with a values.yaml file. It does not use a custom certificate (everything is generated automatically). It seems the generated certificate (a Kubernetes Secret) is mounted at the wrong path. The change I made was to edit the deployment templates of the extensions and controller, setting the certs mount path from /certs to /home/agones/certs (https://github.com/googleforgames/agones/blob/main/install/helm/agones/templates/controller.yaml#L193). The values.yaml is below:

agones:
  featureGates: ""
  metrics:
    prometheusEnabled: true
    prometheusServiceDiscovery: true
    stackdriverEnabled: false
    stackdriverProjectID: ""
    stackdriverLabels: ""
    serviceMonitor:
      enabled: false
      interval: 30s
  rbacEnabled: true
  registerServiceAccounts: true
  registerWebhooks: true
  registerApiService: true
  crds:
    install: true
    cleanupOnDelete: true
  serviceaccount:
    allocator:
      name: agones-allocator
      annotations: {}
    controller:
      name: agones-controller
      annotations: {}
    sdk:
      name: agones-sdk
      annotations: {}
  createPriorityClass: true
  priorityClassName: agones-system
  cloudProduct: "auto"
  controller: &controllerValues
    resources: {}
      # requests:
      #   cpu: 1
      #   memory: 256Mi
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
    annotations: {}
    tolerations:
    - key: "agones.dev/agones-system"
      operator: "Equal"
      value: "true"
      effect: "NoExecute"
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: agones.dev/agones-system
                operator: Exists
    generateTLS: true
    tlsCert: ""
    tlsKey: ""
    disableSecret: false
    allocationApiService:
      annotations: {}
      disableCaBundle: false
    validatingWebhook:
      annotations: {}
      disableCaBundle: false
    mutatingWebhook:
      annotations: {}
      disableCaBundle: false
    customCertSecretPath: {}
    safeToEvict: false
    persistentLogs: true
    persistentLogsSizeLimitMB: 10000
    logLevel: info
    numWorkers: 100
    apiServerQPS: 400
    apiServerQPSBurst: 500
    http:
      port: 8080
    healthCheck:
      initialDelaySeconds: 3
      periodSeconds: 3
      failureThreshold: 3
      timeoutSeconds: 1
    allocationBatchWaitTime: 500ms
  extensions:
    <<: *controllerValues
    pdb:
      minAvailable: 1
    replicas: 2
  ping:
    install: true
    pdb:
      enabled: false
      minAvailable: 1
    updateStrategy: {}
    resources: {}
      # requests:
      #   cpu: 1
      #   memory: 256Mi
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
    annotations: {}
    tolerations:
    - key: "agones.dev/agones-system"
      operator: "Equal"
      value: "true"
      effect: "NoExecute"
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: agones.dev/agones-system
                operator: Exists
    replicas: 2
    http:
      expose: true
      response: ok
      port: 80
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
    udp:
      expose: true
      rateLimit: 20
      port: 50000
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
    healthCheck:
      initialDelaySeconds: 3
      periodSeconds: 3
      failureThreshold: 3
      timeoutSeconds: 1
  allocator:
    install: true
    pdb:
      enabled: false
      minAvailable: 1
    updateStrategy: {}
    apiServerQPS: 400
    apiServerQPSBurst: 500
    logLevel: info
    annotations: {}
    resources: {}
      # requests:
      #   cpu: 1
      #   memory: 256Mi
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
    healthCheck:
      initialDelaySeconds: 3
      periodSeconds: 3
      failureThreshold: 3
      timeoutSeconds: 1
    tolerations:
    - key: "agones.dev/agones-system"
      operator: "Equal"
      value: "true"
      effect: "NoExecute"
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: agones.dev/agones-system
                operator: Exists
    replicas: 3
    service:
      name: agones-allocator
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
      http:
        enabled: true
        port: 443
        portName: https
        targetPort: 8443
        nodePort: 0 # nodePort will be used if the serviceType is set to NodePort
      grpc:
        enabled: true
        port: 443
        portName: grpc
        targetPort: 8443
        nodePort: 0 # nodePort will be used if the serviceType is set to NodePort
    serviceMetrics:
      name: agones-allocator-metrics-service
      annotations: {}
      http:
        enabled: true
        port: 8080
        portName: http
    disableSecretCreation: false
    generateTLS: true
    tlsCert: ""
    tlsKey: ""
    generateClientTLS: true
    clientCAs: {}
    disableMTLS: true
    disableTLS: true
    remoteAllocationTimeout: 10s
    totalRemoteAllocationTimeout: 30s
    allocationBatchWaitTime: 500ms
  image:
    registry: us-docker.pkg.dev/agones-images/release
    tag: 1.30.0
    controller:
      name: agones-controller
      pullPolicy: IfNotPresent
    # extensions settings ignored unless SplitControllerAndExtensions feature gate is enabled
    extensions:
      name: agones-extensions
      pullPolicy: IfNotPresent
    sdk:
      name: agones-sdk
      cpuRequest: 30m
      cpuLimit: 0
      memoryRequest: 0
      memoryLimit: 0
      alwaysPull: false
    ping:
      name: agones-ping
      pullPolicy: IfNotPresent
    allocator:
      name: agones-allocator
      pullPolicy: IfNotPresent

gameservers:
  namespaces:

helm:
  installTests: false

ashutosji commented 1 year ago

Thanks for the info. I followed https://agones.dev/site/docs/installation/install-agones/helm/#installing-the-chart and it works for me, so I am not able to reproduce this issue. Do you mean that changing /certs to /home/agones/certs worked for you?

ashutosji commented 1 year ago

Hi @nonokangwei, The problem is here:

registry: us-docker.pkg.dev/agones-images/release
tag: 1.30.0

You are forcing an older version of the Agones controller image into a newer release. Note: the default values for the Helm configuration can change between releases, so I would suggest changing only the values you actually need in the values.yaml file.
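As a rough sketch of that advice, a values file for the 1.34 chart could carry only the settings that were changed on purpose and leave agones.image.tag unset, so the chart's own default image version is used. The file name is hypothetical, and the choice of which keys count as intentional overrides (here the GKE node pool selector and the allocator TLS flags from the file posted above) is an assumption for illustration, not a verified configuration.

```yaml
# minimal-values.yaml (hypothetical file name)
# Only deliberate overrides are listed; everything else falls back to the
# defaults shipped with the chart version being installed.
# agones.image.tag is intentionally omitted so the images match the chart.
agones:
  controller:
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
  extensions:
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
  ping:
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
  allocator:
    nodeSelector:
      cloud.google.com/gke-nodepool: default-pool
    disableMTLS: true
    disableTLS: true
```

Keeping the override file this small means that a chart upgrade picks up new defaults, such as changed mount paths or image tags, without a stale copy of an older values.yaml masking them.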

markmandel commented 1 year ago

I'm going to close this as answered, as I believe @ashutosji got to the root of the issue. Let us know if you run into anything else.

nonokangwei commented 1 year ago

Thanks, yep, the values.yaml template I am using is from version 1.30.