Closed davidschrooten closed 2 years ago
Hi David,
The entrypoint script expects the "command" to be "node" followed by a comma-separated list of DNS names or IP addresses, but you have command: [ "/bin/bash", "-c", "--" ] in your statefulset.yml.
I haven't learned Kubernetes yet, so I'm not sure exactly what you would use for the second argument after "node", but I believe this explains why you're seeing the nonsensical arguments passed to socat. You want the second argument after "node" to be something that resolves to the IP of the seed, if it exists, and to the IPs of all the other nodes as well. E.g. in docker-compose this can simply be "seed,node", but I don't know what this would be for Kubernetes. I think the "node" Pods can be resolved with the service name, but I don't know how to resolve the "seed": you have it set up as a Job instead of a Service, so I don't know if it can even be discovered via DNS. Of course, this all assumes your cluster has a DNS service.
Ah cool, I suppose we can create a Service of type "ClusterIP" and select the seed pod with that. It will give us an internal DNS name that we can then use for the StatefulSet.
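A minimal sketch of such a Service, for illustration only. The Service name `galera-seed` and the `app: galera-seed` label are hypothetical and must match whatever labels the seed Job puts on its pod; the ports are the standard MySQL and Galera ports (3306 client, 4567 replication, 4568 IST, 4444 SST).

```yaml
# Sketch only: the name and label below are assumptions, not taken from the thread.
apiVersion: v1
kind: Service
metadata:
  name: galera-seed            # resolvable in-cluster as "galera-seed"
spec:
  type: ClusterIP
  selector:
    app: galera-seed           # must match a label on the seed Job's pod template
  ports:
    - name: mysql
      port: 3306
    - name: galera
      port: 4567
    - name: ist
      port: 4568
    - name: sst
      port: 4444
```

With something like this in place, the node containers could be given "node" followed by "galera-seed,<name of the node Service>" as their arguments. A headless Service (`clusterIP: None`) is a possible variant if the name should resolve directly to the pod IP rather than to a virtual IP.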
The reason I use the [ "/bin/bash", "-c", "--" ] command is a little hack to execute multiple commands sequentially. The storage provider I use, OpenEBS, like many others, creates a "lost+found" directory in the mounted storage. I just use it to do an rm -rf on that directory before initializing the container.
Anyway, thanks for the nice container. I will post a working configuration set once I get it working.
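For reference, a sketch of how the hack could be kept while still handing the expected "node" arguments to the entrypoint. In Kubernetes, `command` replaces the image's ENTRYPOINT and `args` replaces its CMD, so overriding `command` with bash means the entrypoint script has to be invoked explicitly. This assumes the entrypoint is the `/usr/local/bin/start.sh` that appears in the log output; the container name and the `galera-seed,galera` address list are hypothetical.

```yaml
# Fragment of a StatefulSet pod template (spec.template.spec); names are assumptions.
containers:
  - name: galera
    image: colinmollenhour/mariadb-galera-swarm
    command: ["/bin/bash", "-c", "--"]
    args:
      - |
        # Remove the directory created by the storage provider, then
        # hand control to the image's entrypoint with the usual arguments.
        rm -rf /var/lib/mysql/lost+found
        exec /usr/local/bin/start.sh node galera-seed,galera
```

Using `exec` lets the entrypoint replace the shell, so signals sent to the container reach it directly.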
Seems to hang on SST:
```
2019-05-08 15:57:50 140672250140416 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2019-05-08 15:57:50 140674653099328 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2019-05-08 15:57:50 140674653099328 [ERROR] Aborting
/usr/local/bin/start.sh: line 416: 42 Aborted gosu mysql mysqld.sh --console $MYSQL_MODE_ARGS --wsrep_cluster_name=$CLUSTER_NAME --wsrep_cluster_address=gcomm://$GCOMM --wsrep_node_address=$NODE_ADDRESS:4567 --default-time-zone=$DEFAULT_TIME_ZONE "$@" 2>&1
MariaDB exited with return code (0)
```
Switched the SST method to rsync. It seems to be working perfectly now. I still need to verify the readiness and liveness probes, but I will do that on Friday. I will submit a PR when I am done, to be added to the examples.
Sweet, I'm really excited to see a working Kubernetes example! Thanks for sharing!
Regarding the command hack, there is this feature that may work as an alternative:
If the file /usr/local/lib/startup.sh exists it will be sourced in the start.sh script.
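One way this hook might be used on Kubernetes is to mount the script from a ConfigMap. This is only a sketch: the ConfigMap name, volume name, and container name are hypothetical, and the thread does not confirm whether the hook runs early enough in start.sh for the "lost+found" cleanup to take effect.

```yaml
# Sketch only: all names below are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: galera-startup
data:
  startup.sh: |
    # This file is sourced by start.sh, so avoid calling `exit` here.
    rm -rf /var/lib/mysql/lost+found
---
# Fragment of the StatefulSet pod template (spec.template.spec), not a complete manifest.
containers:
  - name: galera
    image: colinmollenhour/mariadb-galera-swarm
    volumeMounts:
      - name: startup
        mountPath: /usr/local/lib/startup.sh
        subPath: startup.sh    # mount the single file rather than a directory
volumes:
  - name: startup
    configMap:
      name: galera-startup
```

With this approach the `command` override is no longer needed, so the "node" arguments can be passed to the image's entrypoint in the normal way.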
I need to add a cron job for backups. Any good ideas on how to implement this quickly? Or do I need to create a sidecar container that brings down the first node of the Galera cluster, mounts the persistent volume, and then runs xtrabackup?
Awesome!
One way to do a backup could be to spin up an extra "node", wait for Galera SST to complete, then stop the server and transfer the data before fully stopping the container. This way no nodes are taken down and Galera handles choosing the donor node automatically.
https://github.com/davidq2q/k8s-galera-hook
If you have anything to add, please do so. I will integrate it into a Helm chart.
Cool script, looking forward to seeing the helm chart!
Any chance you've done a helm chart yet?
Same issue on Swarm; the seed step was successful.
version: '3.8'
networks:
  back:
    driver: overlay
    driver_opts:
      encrypted: ''
  infra-secrets:
    external: true
secrets:
  xtrabackup_password:
    file: secrets/xtrabackup_password
    name: xtrabackup-password-{{ lookup('file', 'deployment/secrets/xtrabackup_password') | hash('sha1') }}
services:
  {{ service }}:
    image: "colinmollenhour/mariadb-galera-swarm:{{ galera_version }}"
    command: node tasks.{{ service }}
    stop_grace_period: 30s
    environment:
      - XTRABACKUP_PASSWORD_FILE=/run/secrets/xtrabackup_password
    healthcheck:
      start_period: 5m
    networks:
      - back
      - infra-secrets
    volumes:
      - {{ ha_dir }}/{{ stack_name }}-{{ service }}-{% raw %}{{.Task.Slot}}{% endraw %}-data:/var/lib/mysql
    deploy:
      replicas: {{ nodes_replicas }}
      update_config:
        delay: 1m
    secrets:
      - xtrabackup_password
I ran the seed step again, and now it is OK.
Hello,
I am currently creating a StatefulSet of this image for use on Kubernetes. However, I am stuck with the above error. If you're willing to help out, I can share a working configuration.
My configuration is as follows:
job.yml
statefulset.yml
service.yml