percona / percona-xtradb-cluster-operator

Percona Operator for MySQL based on Percona XtraDB Cluster
https://www.percona.com/doc/kubernetes-operator-for-pxc/index.html
Apache License 2.0

Percona HAProxy failing to start due to missing kubernetes resolver #1763

Closed BoatPartyJesus closed 1 month ago

BoatPartyJesus commented 1 month ago

Report

After I applied the backup section to the cluster configuration, HAProxy restarted and now fails to start because of a missing resolver:

pxc-monit [ALERT]    (56) : config : backend 'galera-nodes', server 'percona-cluster-pxc-0': unable to find required resolvers 'kubernetes'
pxc-monit [ALERT]    (56) : config : backend 'galera-nodes', server 'percona-cluster-pxc-2': unable to find required resolvers 'kubernetes'
pxc-monit [ALERT]    (56) : config : backend 'galera-nodes', server 'percona-cluster-pxc-1': unable to find required resolvers 'kubernetes'

More about the problem

haproxy-0:haproxy logs

+ '[' haproxy = haproxy ']'
+ '[' '!' -f /etc/haproxy/pxc/haproxy.cfg ']'
+ cp /etc/haproxy/haproxy.cfg /etc/haproxy/pxc
+ custom_conf=/etc/haproxy-custom/haproxy-global.cfg
+ '[' -f /etc/haproxy-custom/haproxy-global.cfg ']'
+ haproxy_opt='-W -db '
+ '[' -f /etc/haproxy-custom/haproxy-global.cfg -a -z '' ']'
+ haproxy_opt+='-f /etc/haproxy/haproxy-global.cfg '
+ haproxy_opt+='-f /etc/haproxy/pxc/haproxy.cfg -p /etc/haproxy/pxc/haproxy.pid -S /etc/haproxy/pxc/haproxy-main.sock '
+ test -e /opt/percona/hookscript/hook.sh
+ exec haproxy -W -db -f /etc/haproxy/haproxy-global.cfg -f /etc/haproxy/pxc/haproxy.cfg -p /etc/haproxy/pxc/haproxy.pid -S /etc/haproxy/pxc/haproxy-main.sock
[NOTICE]   (1) : New worker (9) forked
[NOTICE]   (1) : Loading success.

cluster-haproxy-0:haproxy-init logs

++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /haproxy_check_pxc.sh /opt/percona/haproxy_check_pxc.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /haproxy_add_pxc_nodes.sh /opt/percona/haproxy_add_pxc_nodes.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /haproxy_readiness_check.sh /opt/percona/haproxy_readiness_check.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /haproxy_liveness_check.sh /opt/percona/haproxy_liveness_check.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /haproxy.cfg /opt/percona/haproxy.cfg
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /haproxy-global.cfg /opt/percona/haproxy-global.cfg
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /peer-list /opt/percona/peer-list
Stream closed EOF for platform/percona-cluster-haproxy-0 (haproxy-init)

cluster-haproxy-0:pxc-init logs

++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /pxc-entrypoint.sh /var/lib/mysql/pxc-entrypoint.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /unsafe-bootstrap.sh /var/lib/mysql/unsafe-bootstrap.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /pxc-configure-pxc.sh /var/lib/mysql/pxc-configure-pxc.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /liveness-check.sh /var/lib/mysql/liveness-check.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /readiness-check.sh /var/lib/mysql/readiness-check.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /peer-list /var/lib/mysql/peer-list
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /get-pxc-state /var/lib/mysql/get-pxc-state
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /pmm-prerun.sh /var/lib/mysql/pmm-prerun.sh
Stream closed EOF for platform/percona-cluster-haproxy-0 (pxc-init)

cluster-haproxy-0:pxc-monit logs

+ '[' /opt/percona/peer-list = haproxy ']'
+ test -e /opt/percona/hookscript/hook.sh
+ exec /opt/percona/peer-list -on-change=/opt/percona/haproxy_add_pxc_nodes.sh -service=percona-cluster-pxc
2024/07/18 10:53:17 Peer finder enter
2024/07/18 10:53:17 Determined Domain to be platform.svc.cluster.local
2024/07/18 10:53:17 No on-start supplied, on-change /opt/percona/haproxy_add_pxc_nodes.sh will be applied on start.
2024/07/18 10:53:17 Peer list updated
was []
now [percona-cluster-pxc-0.percona-cluster-pxc.platform.svc.cluster.local percona-cluster-pxc-1.percona-cluster-pxc.platform.svc.cluster.local percona-cluster-pxc-2.percona-cluster-pxc.platform.svc.cluster.local]
2024/07/18 10:53:17 execing: /opt/percona/haproxy_add_pxc_nodes.sh with stdin: percona-cluster-pxc-0.percona-cluster-pxc.platform.svc.cluster.local
percona-cluster-pxc-1.percona-cluster-pxc.platform.svc.cluster.local
percona-cluster-pxc-2.percona-cluster-pxc.platform.svc.cluster.local
2024/07/18 10:53:17 Failed to execute /opt/percona/haproxy_add_pxc_nodes.sh: {"time":"18/Jul/2024:10:53:17.009", "message": "Running /opt/percona/haproxy_add_pxc_nodes.sh"}
{"time":"18/Jul/2024:10:53:17.043", "message": "number of available nodes are 3"}
[NOTICE]   (56) : haproxy version is 2.6.12
[NOTICE]   (56) : path to executable is /usr/sbin/haproxy
[ALERT]    (56) : config : backend 'galera-nodes', server 'percona-cluster-pxc-0': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-nodes', server 'percona-cluster-pxc-2': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-nodes', server 'percona-cluster-pxc-1': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-admin-nodes', server 'percona-cluster-pxc-0': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-admin-nodes', server 'percona-cluster-pxc-2': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-admin-nodes', server 'percona-cluster-pxc-1': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-replica-nodes', server 'percona-cluster-pxc-0': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-replica-nodes', server 'percona-cluster-pxc-1': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-replica-nodes', server 'percona-cluster-pxc-2': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-mysqlx-nodes', server 'percona-cluster-pxc-0': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-mysqlx-nodes', server 'percona-cluster-pxc-2': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : backend 'galera-mysqlx-nodes', server 'percona-cluster-pxc-1': unable to find required resolvers 'kubernetes'
[ALERT]    (56) : config : Fatal errors found in configuration.
, err: exit status 1
Stream closed EOF for platform/percona-cluster-haproxy-0 (pxc-monit)
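
The ALERT lines above mean that `server` entries in the generated configuration carry a `resolvers kubernetes` option, while none of the configuration files passed to HAProxy defines a `resolvers kubernetes` section. One plausible reading of the logs, not confirmed in this thread, is that the node-adding script installed by the `main` init container emits that option while the global config shipped in the 1.13.0 HAProxy image does not define the section. The following is a minimal sketch for spotting such dangling references in HAProxy config text; the function name and the sample config are illustrative assumptions, not part of the operator.

```python
def undefined_resolvers(config_text):
    """Return resolver names referenced by server lines but never defined.

    Simplified parser: it only understands top-level 'resolvers <name>'
    sections and the 'resolvers <name>' option on server-like lines.
    """
    defined = set()
    referenced = set()
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        words = line.split()
        if words[0] == "resolvers" and len(words) >= 2:
            # Section header: 'resolvers kubernetes'
            defined.add(words[1])
        elif words[0] in ("server", "server-template", "default-server"):
            # Option on a server line: '... resolvers kubernetes'
            for i, word in enumerate(words[:-1]):
                if word == "resolvers":
                    referenced.add(words[i + 1])
    return referenced - defined


# Hypothetical sample resembling the failing backend (not the real generated file).
sample = """
backend galera-nodes
  mode tcp
  server percona-cluster-pxc-0 percona-cluster-pxc-0.percona-cluster-pxc:3306 check resolvers kubernetes
"""
print(sorted(undefined_resolvers(sample)))
```

Running the check over the concatenation of every file passed with `-f` (here `/etc/haproxy/haproxy-global.cfg` and `/etc/haproxy/pxc/haproxy.cfg`) mirrors how HAProxy itself resolves section names across files.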

cluster-haproxy describe

Name:               percona-cluster-haproxy
Namespace:          platform
CreationTimestamp:  Thu, 18 Jul 2024 11:27:51 +0100
Selector:           app.kubernetes.io/component=haproxy,app.kubernetes.io/instance=percona-cluster,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster
Labels:             <none>
Replicas:           3 desired | 1 total
Update Strategy:    RollingUpdate
  Partition:        0
Pods Status:        1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app.kubernetes.io/component=haproxy
                    app.kubernetes.io/instance=percona-cluster
                    app.kubernetes.io/managed-by=percona-xtradb-cluster-operator
                    app.kubernetes.io/name=percona-xtradb-cluster
                    app.kubernetes.io/part-of=percona-xtradb-cluster
  Annotations:      percona.com/configuration-hash: d41d8cd98f00b204e9800998ecf8427e
  Service Account:  default
  Init Containers:
   pxc-init:
    Image:      perconalab/percona-xtradb-cluster-operator:main
    Port:       <none>
    Host Port:  <none>
    Command:
      /pxc-init-entrypoint.sh
    Limits:
      cpu:        50m
      memory:     50M
    Environment:  <none>
    Mounts:
      /var/lib/mysql from bin (rw)
   haproxy-init:
    Image:      perconalab/percona-xtradb-cluster-operator:main
    Port:       <none>
    Host Port:  <none>
    Command:
      /haproxy-init-entrypoint.sh
    Environment:  <none>
    Mounts:
      /opt/percona from bin (rw)
  Containers:
   haproxy:
    Image:       percona/percona-xtradb-cluster-operator:1.13.0-haproxy
    Ports:       3306/TCP, 3307/TCP, 3309/TCP, 33062/TCP, 33060/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Liveness:    exec [/opt/percona/haproxy_liveness_check.sh] delay=60s timeout=5s period=30s #success=1 #failure=4
    Readiness:   exec [/opt/percona/haproxy_readiness_check.sh] delay=15s timeout=1s period=5s #success=1 #failure=3
    Environment Variables from:
      percona-cluster-env-vars-haproxy  Secret  Optional: true
    Environment:
      PXC_SERVICE:              percona-cluster-pxc
      LIVENESS_CHECK_TIMEOUT:   5
      READINESS_CHECK_TIMEOUT:  1
    Mounts:
      /etc/haproxy-custom/ from haproxy-custom (rw)
      /etc/haproxy/pxc from haproxy-auto (rw)
      /etc/mysql/haproxy-env-secret from percona-cluster-env-vars-haproxy (rw)
      /etc/mysql/mysql-users-secret from mysql-users-secret-file (rw)
      /opt/percona from bin (rw)
   pxc-monit:
    Image:      percona/percona-xtradb-cluster-operator:1.13.0-haproxy
    Port:       <none>
    Host Port:  <none>
    Args:
      /opt/percona/peer-list
      -on-change=/opt/percona/haproxy_add_pxc_nodes.sh
      -service=$(PXC_SERVICE)
    Environment Variables from:
      percona-cluster-env-vars-haproxy  Secret  Optional: true
    Environment:
      PXC_SERVICE:                percona-cluster-pxc
      REPLICAS_SVC_ONLY_READERS:  false
    Mounts:
      /etc/haproxy-custom/ from haproxy-custom (rw)
      /etc/haproxy/pxc from haproxy-auto (rw)
      /etc/mysql/haproxy-env-secret from percona-cluster-env-vars-haproxy (rw)
      /etc/mysql/mysql-users-secret from mysql-users-secret-file (rw)
      /opt/percona from bin (rw)
  Volumes:
   haproxy-custom:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      percona-cluster-haproxy
    Optional:  true
   haproxy-auto:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
   mysql-users-secret-file:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  internal-percona-cluster
    Optional:    false
   percona-cluster-env-vars-haproxy:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  percona-cluster-env-vars-haproxy
    Optional:    true
   bin:
    Type:          EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:        
    SizeLimit:     <unset>
  Node-Selectors:  <none>
  Tolerations:     <none>
Volume Claims:     <none>
Events:
  Type     Reason               Age                   From                    Message
  ----     ------               ----                  ----                    -------
  Normal   SuccessfulCreate     14m (x2 over 26m)     statefulset-controller  create Pod percona-cluster-haproxy-0 in StatefulSet percona-cluster-haproxy successful
  Warning  FailedDelete         14m                   statefulset-controller  delete Pod percona-cluster-haproxy-0 in StatefulSet percona-cluster-haproxy failed error: pods "percona-cluster-haproxy-0" not found
  Normal   SuccessfulDelete     7m54s (x11 over 14m)  statefulset-controller  delete Pod percona-cluster-haproxy-0 in StatefulSet percona-cluster-haproxy successful
  Warning  RecreatingFailedPod  3m48s (x16 over 14m)  statefulset-controller  StatefulSet platform/percona-cluster-haproxy is recreating failed Pod percona-cluster-haproxy-0

cluster-operator describe

Name:                   percona-xtradb-cluster-operator
Namespace:              pxc-operator
CreationTimestamp:      Thu, 11 Apr 2024 00:36:32 +0100
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 9
Selector:               app.kubernetes.io/component=operator,app.kubernetes.io/instance=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster-operator,app.kubernetes.io/part-of=percona-xtradb-cluster-operator
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  1 max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/component=operator
                    app.kubernetes.io/instance=percona-xtradb-cluster-operator
                    app.kubernetes.io/name=percona-xtradb-cluster-operator
                    app.kubernetes.io/part-of=percona-xtradb-cluster-operator
  Annotations:      kubectl.kubernetes.io/restartedAt: 2024-07-18T10:42:50+01:00
  Service Account:  percona-xtradb-cluster-operator
  Containers:
   percona-xtradb-cluster-operator:
    Image:      perconalab/percona-xtradb-cluster-operator:main
    Port:       8080/TCP
    Host Port:  0/TCP
    Command:
      percona-xtradb-cluster-operator
    Limits:
      cpu:     200m
      memory:  500Mi
    Requests:
      cpu:     100m
      memory:  20Mi
    Liveness:  http-get http://:metrics/metrics delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      LOG_STRUCTURED:     false
      LOG_LEVEL:          INFO
      WATCH_NAMESPACE:    
      POD_NAME:            (v1:metadata.name)
      OPERATOR_NAME:      percona-xtradb-cluster-operator
      DISABLE_TELEMETRY:  false
    Mounts:               <none>
  Volumes:                <none>
  Node-Selectors:         <none>
  Tolerations:            <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  percona-xtradb-cluster-operator-776f5b4d58 (0/0 replicas created), percona-xtradb-cluster-operator-57b596fd67 (0/0 replicas created), percona-xtradb-cluster-operator-9cd45d5 (0/0 replicas created), percona-xtradb-cluster-operator-5764d5966f (0/0 replicas created), percona-xtradb-cluster-operator-5fd7cb9b68 (0/0 replicas created), percona-xtradb-cluster-operator-cd64f5997 (0/0 replicas created), percona-xtradb-cluster-operator-7fbb8cdbfc (0/0 replicas created), percona-xtradb-cluster-operator-5f7c55df66 (0/0 replicas created)
NewReplicaSet:   percona-xtradb-cluster-operator-589cbddf8d (1/1 replicas created)
Events:          <none>

Steps to reproduce

  1. create cluster
    apiVersion: pxc.percona.com/v1
    kind: PerconaXtraDBCluster
    metadata:
      creationTimestamp: "2024-04-10T23:38:25Z"
      generation: 17
      labels:
        app.kubernetes.io/instance: uat
        k8slens-edit-resource-version: v1
      name: percona-cluster
      namespace: platform
    spec:
      backup:
        image: perconalab/percona-xtradb-cluster-operator:main-pxc5.7-backup
        storages:
          s3-backup:
            s3:
              bucket: backups/database
              credentialsSecret: cluster-backup-credenetials
              region: us-east-1
            type: s3
            verifyTLS: true
      crVersion: 1.15.0
      haproxy:
        enabled: true
        image: percona/percona-xtradb-cluster-operator:1.13.0-haproxy
        size: 3
      pause: false
      pxc:
        autoRecovery: true
        image: percona/percona-xtradb-cluster:5.7.44
        size: 3
        volumeSpec:
          persistentVolumeClaim:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
      updateStrategy: SmartUpdate
      upgradeOptions:
        apply: disabled
        schedule: 0 10 * * MON
        versionServiceEndpoint: https://check.percona.com
  2. Observe HAProxy does not start

Versions

  1. Kubernetes: v1.28.0 - talos
  2. Operator: v1.14 - perconalab/percona-xtradb-cluster-operator:main
  3. Database: percona/percona-xtradb-cluster:5.7.44
  4. Haproxy: percona/percona-xtradb-cluster-operator:1.13.0-haproxy

Anything else?

No response

hors commented 1 month ago

@BoatPartyJesus, I noticed you are using crVersion: 1.15.0 but the HAProxy image from 1.13.0 (percona/percona-xtradb-cluster-operator:1.13.0-haproxy). In this case, you should use perconalab/percona-xtradb-cluster-operator:main-haproxy.

P.S. Version 1.15.0 has not been released yet. It is under development right now.
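
For anyone who wants to catch this kind of mismatch before applying a manifest, here is a rough sketch of a check that compares `crVersion` with the release number embedded in the HAProxy image tag. It assumes tags of the form `<x.y.z>-haproxy` for releases, as seen in this issue, and treats tags without a release number (such as `main-haproxy`) as not comparable. The helper names are hypothetical and this is not something the operator provides.

```python
import re


def image_tag(image):
    """Return the tag of a container image reference, or '' if it has none."""
    # Look only after the last '/', so a registry port such as host:5000/img
    # is not mistaken for a tag.
    last = image.rsplit("/", 1)[-1]
    return last.split(":", 1)[1] if ":" in last else ""


def check_haproxy_image(cr_version, haproxy_image):
    """Return a warning string on a version mismatch, otherwise None."""
    match = re.match(r"(\d+\.\d+\.\d+)", image_tag(haproxy_image))
    if match is None:
        # No release number in the tag (e.g. 'main-haproxy'): nothing to compare.
        return None
    if match.group(1) != cr_version:
        return (
            f"HAProxy image is tagged for {match.group(1)} "
            f"but crVersion is {cr_version}"
        )
    return None


# Values taken from the manifest in this issue.
print(check_haproxy_image(
    "1.15.0", "percona/percona-xtradb-cluster-operator:1.13.0-haproxy"))
```

With the image suggested above, `perconalab/percona-xtradb-cluster-operator:main-haproxy`, the function returns `None` because the tag carries no release number to compare against.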

BoatPartyJesus commented 1 month ago

Thank you for your help. I'm all sorted now!