colinmollenhour / mariadb-galera-swarm

MariaDb Galera Cluster container based on official mariadb image which can auto-bootstrap and recover cluster state.
https://hub.docker.com/r/colinmollenhour/mariadb-galera-swarm
Apache License 2.0

Kubernetes socat wrong number of parameters #72

Closed davidschrooten closed 2 years ago

davidschrooten commented 5 years ago
2019/05/08 11:46:21 socat[1546] E getaddrinfo("--wsrep-on=ON", "NULL", {1,0,1,6}, {}): Name or service not known
2019/05/08 11:46:21 socat[1547] E getaddrinfo("--wsrep-sst-method=mariabackup", "NULL", {1,0,1,6}, {}): Name or service not known
2019/05/08 11:46:21 socat[1548] E getaddrinfo("--wsrep-sst-method=mariabackup", "NULL", {1,0,1,6}, {}): Name or service not known
2019/05/08 11:46:21 socat[1549] E getaddrinfo("--wsrep_cluster_name=galera-testing", "NULL", {1,0,1,6}, {}): Name or service not known
2019/05/08 11:46:21 socat[1550] E getaddrinfo("--wsrep_cluster_name=galera-testing", "NULL", {1,0,1,6}, {}): Name or service not known
2019/05/08 11:46:21 socat[1551] E TCP: wrong number of parameters (3 instead of 2)
2019/05/08 11:46:21 socat[1552] E TCP: wrong number of parameters (3 instead of 2)
2019/05/08 11:46:21 socat[1553] E TCP: wrong number of parameters (3 instead of 2)
2019/05/08 11:46:21 socat[1554] E TCP: wrong number of parameters (3 instead of 2)
2019/05/08 11:46:21 socat[1555] E TCP: wrong number of parameters (3 instead of 2)
2019/05/08 11:46:21 socat[1556] E TCP: wrong number of parameters (3 instead of 2)

Hello,

I am currently creating a StatefulSet of this image for use on Kubernetes. However, I am stuck on the error above. If you're willing to help out, I can share a working configuration.

My configuration is as follows:

job.yml

kind: Job
metadata:
  name: mariadb-galera-seed
  namespace: {{ app_namespace }}
spec:
  template:
    metadata:
      labels:
        app: mariadb-galera-seed
    spec:
      volumes:
        - name: galera-secrets
          secret:
            secretName: mariadb-galera
      containers:
        - name: '{{app_name}}'
          image: colinmollenhour/mariadb-galera-swarm
          args:
            - seed
          ports:
            - containerPort: 3306
              name: mysql
            - containerPort: 3309
              name: recovery
            - containerPort: 4444
              name: sst
            - containerPort: 4567
              name: gcom
            - containerPort: 4568
              name: gcom2
            - containerPort: 8080
              name: hup
            - containerPort: 8081
              name: hboot
          env:
            - name: CLUSTER_NAME
              value: "galera-testing"
            - name: DEFAULT_TIME_ZONE
              value: "+01:00"
            - name: NODE_ADDRESS
              value: "^10.233.*.*"
            - name: MYSQL_ROOT_HOST
              value: "%"
            - name: MYSQL_ROOT_SOCKET_AUTH
              value: "0"
            - name: MYSQL_DATABASE
              value: portal
            - name: MYSQL_PASSWORD_FILE
              value: /etc/secrets/MYSQL_PASSWORD
            - name: MYSQL_ROOT_PASSWORD_FILE
              value: /etc/secrets/MYSQL_ROOT_PASSWORD
            - name: MYSQL_USER
              value: user
            - name: XTRABACKUP_PASSWORD_FILE
              value: /etc/secrets/XTRABACKUP_PASSWORD
            - name: SYSTEM_PASSWORD_FILE
              value: /etc/secrets/SYSTEM_PASSWORD
          volumeMounts:
            - name: galera-secrets
              mountPath: "/etc/secrets"
              readOnly: true
      restartPolicy: Never
  backoffLimit: 1

statefulset.yml

metadata:
  name: mariadb-galera
  namespace: {{ app_namespace }}
spec:
  serviceName: "mariadb-galera"
  replicas: 3
  selector:
    matchLabels:
      app: mariadb-galera-node
  template:
    metadata:
      labels:
        app: mariadb-galera-node
    spec:
      volumes:
        - name: galera-secrets
          secret:
            secretName: mariadb-galera
      containers:
      - name: mariadb-galera
        image: colinmollenhour/mariadb-galera-swarm
        env:
          - name: CLUSTER_NAME
            value: "galera-testing"
          - name: DEFAULT_TIME_ZONE
            value: "+01:00"
          - name: NODE_ADDRESS
            value: "^10.233.*.*"
          - name: MYSQL_ROOT_SOCKET_AUTH
            value: "0"
          - name: MYSQL_PASSWORD_FILE
            value: /etc/secrets/MYSQL_PASSWORD
          - name: MYSQL_ROOT_PASSWORD_FILE
            value: /etc/secrets/MYSQL_ROOT_PASSWORD
          - name: MYSQL_USER
            value: user
          - name: XTRABACKUP_PASSWORD_FILE
            value: /etc/secrets/XTRABACKUP_PASSWORD
          - name: SYSTEM_PASSWORD_FILE
            value: /etc/secrets/SYSTEM_PASSWORD
        command: [ "/bin/bash", "-c", "--" ]
        # delete lost+found on dynamic provisioned openebs volumes
        args: [ "rm -rf /var/lib/mysql/lost+found; touch /var/lib/mysql/force-cluster-bootstrapping; /usr/local/bin/start.sh node seed,node" ]
        ports:
        - containerPort: 3306
          name: mysql
        - containerPort: 3309
          name: recovery
        - containerPort: 4444
          name: sst
        - containerPort: 4567
          name: gcom
        - containerPort: 4568
          name: gcom2
        - containerPort: 8080
          name: hup
        - containerPort: 8081
          name: hboot
#        readinessProbe:
#          tcpSocket:
#            port: 7000
#          initialDelaySeconds: 15
#          timeoutSeconds: 5
#          successThreshold: 2
#        livenessProbe:
#          tcpSocket:
#            port: 6000
#          initialDelaySeconds: 120
#          periodSeconds: 15
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/mysql
        - name: galera-secrets
          mountPath: "/etc/secrets"
          readOnly: true
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "openebs-cstor-ssd"
      resources:
        requests:
          storage: 20Gi

service.yml

metadata:
  name: mariadb-galera
  namespace: {{ app_namespace }}
  labels:
    app: mariadb-galera
spec:
  ports:
  - port: 3306
    name: mysql
  type: LoadBalancer
  loadBalancerIP: 172.20.0.70
  selector:
    app: mariadb-galera-node

colinmollenhour commented 5 years ago

Hi David,

The entrypoint script is expecting the "command" to be "node" and then a comma-separated list of DNS names or IP addresses but you have command: [ "/bin/bash", "-c", "--" ] in your statefulset.yml.

I haven't learned Kubernetes yet, so I'm not sure exactly what you would use for the second argument after "node", but I believe this explains why you're seeing the nonsensical arguments passed to socat. You want the second argument after "node" to be something that resolves to the IP of the seed, if it exists, and to the IPs of all the other nodes as well. E.g. in docker-compose this can simply be "seed,node", but I don't know what this would be for Kubernetes. I think the "node" Pods can be resolved with the service name, but I don't know how to resolve the "seed", since you have it set up as a Job instead of a Service, so I don't know if it can even be discovered via DNS. Of course, this all assumes your cluster has a DNS service.
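
For illustration, a minimal sketch of the container spec with the arguments in the shape described above. In Kubernetes, `command` replaces the image's entrypoint and `args` replaces its default command, so leaving `command` unset hands "node" and the host list straight to the entrypoint script. The two host names are hypothetical; they stand for whatever DNS names resolve to the seed and node Pods in a given cluster.

```yaml
# Sketch only: container fragment for the StatefulSet's Pod template.
# No `command` is set, so the image's own entrypoint runs and receives `args`.
containers:
  - name: mariadb-galera
    image: colinmollenhour/mariadb-galera-swarm
    args:
      - node
      # Hypothetical names: one that resolves to the seed Pod and one
      # that resolves to the node Pods. Adjust to the cluster's DNS.
      - mariadb-galera-seed,mariadb-galera
```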

davidschrooten commented 5 years ago

Ah cool, I suppose we can create a Service of type "ClusterIP" and select the seed Pod with it. That will give us an internal DNS name that we can then use for the StatefulSet.
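
A rough sketch of such a Service, assuming the `app: mariadb-galera-seed` label from the Job's Pod template in job.yml. The Service name is hypothetical, and the port list is simply copied from job.yml; which of those ports the nodes actually need is not established in this thread.

```yaml
# Sketch only: gives the seed Job's Pod a stable in-cluster DNS name.
# Create it in the same namespace as the Job.
apiVersion: v1
kind: Service
metadata:
  name: mariadb-galera-seed    # hypothetical name
spec:
  type: ClusterIP
  selector:
    app: mariadb-galera-seed   # label set in job.yml
  ports:                       # copied from the containerPorts in job.yml
    - name: mysql
      port: 3306
    - name: recovery
      port: 3309
    - name: sst
      port: 4444
    - name: gcom
      port: 4567
    - name: gcom2
      port: 4568
    - name: hup
      port: 8080
    - name: hboot
      port: 8081
```

Note that a ClusterIP Service name resolves to the Service's virtual IP rather than the Pod's own IP. A headless Service (`clusterIP: None`) resolves directly to the Pod IPs, which may matter if the nodes need the seed's real address.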

The reason I use the [ "/bin/bash", "-c", "--" ] command is that it is a small hack to execute multiple commands sequentially. The storage provider I use, OpenEBS, like many others, creates a "lost+found" directory in the mounted volume. I just use the hack to run rm -rf on that directory before initializing the container.

Anyway, thanks for the nice container. I will post a working configuration set once I have it working.

davidschrooten commented 5 years ago

It seems to hang on SST:

2019-05-08 15:57:50 140672250140416 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2019-05-08 15:57:50 140674653099328 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2019-05-08 15:57:50 140674653099328 [ERROR] Aborting
/usr/local/bin/start.sh: line 416:    42 Aborted                 gosu mysql mysqld.sh --console $MYSQL_MODE_ARGS --wsrep_cluster_name=$CLUSTER_NAME --wsrep_cluster_address=gcomm://$GCOMM --wsrep_node_address=$NODE_ADDRESS:4567 --default-time-zone=$DEFAULT_TIME_ZONE "$@" 2>&1
MariaDB exited with return code (0)

davidschrooten commented 5 years ago

I switched the SST method to rsync, and it seems to be working perfectly now. I still need to verify the readiness and liveness probes, but I will do that on Friday. When I am done, I will submit a PR so it can be added to the examples.

colinmollenhour commented 5 years ago

Sweet, I'm really excited to see a working Kubernetes example! Thanks for sharing!

Regarding the command hack, there is this feature that may work as an alternative:

If the file /usr/local/lib/startup.sh exists it will be sourced in the start.sh script.
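
As a sketch of how that could replace the bash wrapper on Kubernetes, the cleanup step could be shipped in a ConfigMap and mounted at that path. The ConfigMap and volume names are hypothetical, and whether start.sh sources the file early enough for the lost+found cleanup is not confirmed in this thread.

```yaml
# Sketch only: provide the cleanup step as /usr/local/lib/startup.sh.
apiVersion: v1
kind: ConfigMap
metadata:
  name: mariadb-galera-startup   # hypothetical name
data:
  startup.sh: |
    # Sourced by start.sh. Removes the lost+found directory that the
    # storage provider creates on a freshly provisioned volume.
    rm -rf /var/lib/mysql/lost+found
---
# Pod template fragment: merge into the existing StatefulSet spec.
volumes:
  - name: startup-script         # hypothetical volume name
    configMap:
      name: mariadb-galera-startup
containers:
  - name: mariadb-galera
    volumeMounts:
      - name: startup-script
        mountPath: /usr/local/lib/startup.sh
        subPath: startup.sh      # mount the single file, not a directory
```

With this in place the container could keep the image's entrypoint and pass only "node" and the host list as `args`.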

davidschrooten commented 5 years ago

See https://github.com/colinmollenhour/mariadb-galera-swarm/pull/74

davidschrooten commented 5 years ago

I need to add a cron job for backups. Any good ideas on how to implement this quickly? Or do I need to create a sidecar container that brings down the first node of the Galera cluster, mounts the persistent volume and then runs xtrabackup?

colinmollenhour commented 5 years ago

Awesome!

One way to do a backup could be to spin up an extra "node", wait for Galera SST to complete, then stop the server and transfer the data before fully stopping the container. This way no nodes are taken down and Galera handles choosing the donor node automatically.

davidschrooten commented 5 years ago

https://github.com/davidq2q/k8s-galera-hook

If you have anything to add, please do so. I will integrate it into a Helm chart.

colinmollenhour commented 5 years ago

Cool script, looking forward to seeing the helm chart!

colinmollenhour commented 5 years ago

Any chance you've done a helm chart yet?

oliverlj commented 4 years ago

Same issue on Swarm; the seed step was successful.

version: '3.8'

networks:
  back:
    driver: overlay
    driver_opts:
      encrypted: ''
  infra-secrets:
    external: true

secrets:
  xtrabackup_password:
    file: secrets/xtrabackup_password
    name: xtrabackup-password-{{ lookup('file', 'deployment/secrets/xtrabackup_password') | hash('sha1') }}

services:
  {{ service }}:
    image: "colinmollenhour/mariadb-galera-swarm:{{ galera_version }}"
    command: node tasks.{{ service }}
    stop_grace_period: 30s
    environment:
      - XTRABACKUP_PASSWORD_FILE=/run/secrets/xtrabackup_password
    healthcheck:
      start_period: 5m
    networks:
      - back
      - infra-secrets
    volumes:
      - {{ ha_dir }}/{{ stack_name }}-{{ service }}-{% raw %}{{.Task.Slot}}{% endraw %}-data:/var/lib/mysql
    deploy:
      replicas: {{ nodes_replicas }}
      update_config:
        delay: 1m
    secrets:
      - xtrabackup_password

oliverlj commented 4 years ago

I ran the seed step again, and now it is OK.