FrederikNJS opened 8 years ago
Would Galera cover that for you (https://github.com/docker-library/mariadb/pull/24) or are you looking at the autoconfiguration that is in their script (https://github.com/bitnami/bitnami-docker-mariadb/blob/master/rootfs/bitnami-utils-custom.sh)?
I am on the same path
mariadb has this built in from 10.1, and after https://github.com/docker-library/mariadb/issues/29 is resolved it would be nice to implement some sort of env to bootstrap only the first time the container is launched, using `mysqld --wsrep-new-cluster`, so that any additional container can join the cluster with `mysqld --wsrep_cluster_address=gcomm://container_name`
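A minimal sketch of how such an env-driven bootstrap could look (the `BOOTSTRAP_NEW_CLUSTER` and `CLUSTER_PEER` variable names are hypothetical, not features of the image):

```shell
#!/bin/sh
# Hypothetical helper: pick the wsrep flag for this container from an
# environment variable. Only the very first container ever started should
# pass --wsrep-new-cluster; every later one joins the existing cluster.
wsrep_args() {
    if [ "${BOOTSTRAP_NEW_CLUSTER:-0}" = "1" ]; then
        echo "--wsrep-new-cluster"
    else
        echo "--wsrep_cluster_address=gcomm://${CLUSTER_PEER:?peer hostname required}"
    fi
}
# In an entrypoint this would end with:
# exec mysqld $(wsrep_args)
```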
If anyone is interested - this is not very straightforward, but doable. You can find a scalable MariaDB image that requires no configuration here: https://github.com/nazar-pc/docker-webserver Please read the readme and advanced documentation carefully; it really works well. I'm working on a GlusterFS image, so that literally everything will scale with zero configuration, but MariaDB works fine already. Compared to Bitnami's images, my images look a bit simpler and use this official MariaDB image as the base image. Take a look and let me know what you think.
BTW, I'm not sure that PR with Galera support is really as good as it could be. I also have a feeling that the official image should not provide a Galera setup; it is tricky and will not be flexible enough eventually. The current image is good enough and contains most of the necessary building blocks to build a Galera image yourself while nicely reusing what is done here already.
Thanks, I will check it.
I would not advise you to go for GlusterFS, especially for web apps. I haven't tested it myself, but the feedback is that it is quite slow when accessing many small files.
I have set up 2x lsyncd containers and so far they sync really well. lsyncd is kind of an rsync daemon: it watches for changes and fires rsync.
The documentation says it is not fit for 2-way sync, but in my setup it works very, very well.
Here is the repo. It is not universal, but I don't have time right now to improve it: https://github.com/vipconsult/dockerfiles/tree/master/lsyncd
I start it with the same compose file on both hosts:
```yaml
lsyncd:
  privileged: true
  # without net host it cannot bind to the internal ip
  net: host
  volumes:
    # share the same keys on all containers so that no ssh config is needed
    - ../lsyncd/.ssh:/root/.ssh
    # the actual folders that will be synced
    - /folder1ToShare:/sync/f1
    - /folder2ToShare:/sync/.f2
```
It needs 2 env variables: `$INTERNAL_IP` - it binds to this ip; `$LB_SERVER` - it connects to this ip on port 222.

* on host 1: `$INTERNAL_IP` = 172.0.0.10, `$LB_SERVER` = 172.0.0.20
* on host 2: `$INTERNAL_IP` = 172.0.0.20, `$LB_SERVER` = 172.0.0.10
If you're going to support Galera, please be sure to include galera-arbitrator-3 (aka garbd) in future builds.
The current 10.x builds already have the Galera plugin. Doesn't that work? I have been trying to debug that for quite a while now and the second node after the bootstrap node crashes saying MySQL init process failed.
Here is an SO post for anybody who knows about the issue: http://stackoverflow.com/questions/39744949/unable-to-create-mariadb-galera-cluster.
I have been using MariaDB 10.1 with Galera for a long time (3+ months) and it works perfectly. Before that I used a home-crafted image with MariaDB Galera (before it was merged into regular MariaDB). I start it as:
```shell
/usr/bin/docker run \
  --name mariadb-galera \
  --rm \
  -p 3306:3306 \
  -p ${COREOS_PRIVATE_IPV4}:4444:4444 \
  -p ${COREOS_PRIVATE_IPV4}:4567:4567/udp \
  -p ${COREOS_PRIVATE_IPV4}:4567-4568:4567-4568 \
  --dns 172.17.42.1 \
  -v /var/lib/mysql:/var/lib/mysql \
  -v /var/log/mysql:/var/log/mysql \
  mariadb:10.1 \
  --log-bin=mysqld-bin \
  --log-slave-updates \
  --binlog-format=row \
  --binlog-annotate-row-events \
  --innodb-autoinc-lock-mode=2 \
  --innodb-flush-log-at-trx-commit=0 \
  --slow-query-log \
  --wsrep-on="ON" \
  --wsrep-log-conflicts \
  --wsrep-slave-threads=4 \
  --wsrep-provider="/usr/lib/libgalera_smm.so" \
  --wsrep-cluster-address="gcomm://mysql.skydns.local,mysql.skydns.local" \
  --wsrep-node-address="${COREOS_PRIVATE_IPV4}" \
  --wsrep-node-name="%H" \
  --wsrep-sst-method="xtrabackup-v2" \
  --wsrep-sst-auth="${SST_AUTH}"
```
I run it on a CoreOS cluster with SkyDNS, and the server registers itself with SkyDNS/etcd so it is available at "mysql.skydns.local".
@klausenbusk Would you mind quickly reviewing the configs present at the SO link I have provided? http://stackoverflow.com/questions/39744949/unable-to-create-mariadb-galera-cluster
First, you should use 10.1; secondly, it doesn't seem like you expose any ports?; thirdly, you need to set --wsrep-node-address.. :)
@klausenbusk I am using inter-container networking so I don't think port exposure is needed. I am trying to run it on my local machine, so I can worry about multi-host communication later. Also, is `--wsrep-node-address` necessary? According to the logs of the second node, it looks like the SST was completed successfully. Thanks! :)
@activatedgeek Oh yes, I think you're correct :) Do you have a full log of the secondary node?
@klausenbusk Have a look here: http://pastebin.com/3exPpqvc. If you have a look at lines 215-217 you can see that the SST was completed successfully, but the node crashed due to failed init. Also line 220 shows port: 0, which is rather odd as on the bootstrap node it shows port: 3306.
The issue is that the image gets stuck in the initializing phase (which is run with `--skip-networking`, which explains port 0).
That is the reason why I run `mkdir -p /var/lib/mysql/mysql` before I start MariaDB, so it doesn't run the initialization code.
So try something like:
```shell
mkdir -p /var/lib/mysql/mysql
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  -v /var/lib/mysql:/var/lib/mysql \
  activatedgeek/mariadb:devel \
  --wsrep-cluster-name=test_cluster \
  --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4
```
@klausenbusk So essentially, you mean that I should start all nodes such that the folder /var/lib/mysql (or the data dir) contains a mysql folder? This could either be done within the container or the mounted volume. Is that right?
Correct, except for the bootstrap node. Also, you should consider using xtrabackup for SST.
@klausenbusk Perfect. This is working great now. I'll create a wrapper MariaDB image which takes an extra environment flag to create the dir for non-bootstrap nodes. Thank you so much for your inputs!
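A hedged sketch of what such a wrapper could do (the `GALERA_JOINER` flag and `prepare_datadir` helper are made-up names for illustration, not part of the official image):

```shell
#!/bin/sh
# Hypothetical wrapper logic: for non-bootstrap (joiner) nodes, pre-create
# the mysql system directory so the stock docker-entrypoint.sh skips its
# --skip-networking initialization phase; SST then populates the data dir.
prepare_datadir() {
    if [ "${GALERA_JOINER:-0}" = "1" ]; then
        mkdir -p "$1/mysql"
    fi
}
# prepare_datadir /var/lib/mysql
# exec /usr/local/bin/docker-entrypoint.sh "$@"
```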
I believe tanji had a working setup with this compose file: https://github.com/docker-library/mariadb/pull/57#issuecomment-226739810.
Not sure if this relates to this issue, but I get this warning on new database initialization:
[Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos' doesn't exist
@strarsis Unrelated, please open a different issue for that
I think supporting replication out of the box is probably a bit too ambitious for this image (as noted by other folks above) -- there are too many edge cases and environmental configuration for us to do that really well in such a way that would satisfy not only existing replication users but also cover the use cases of new users.
Severalnines have an example for docker-swarm in severalnines/galera-docker-mariadb. They also have an example for kubernetes there, but I had problems with etcd (there is only an incubator chart for it and it seems broken), so I rewrote it to use labels in kubernetes instead, and used this image.
Notable differences are that they are still using xtrabackup-v2, while this image (only) has mariabackup, and a different path to the galera_smm library. Version 10.2 also changes the state variables, so the healthcheck scripts need to be rewritten, as severalnines are still on 10.1.
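Since the state variables changed, any healthcheck has to be adapted. A minimal sketch of such a check (the actual query is commented out; `is_synced` is an illustrative helper, not an official script):

```shell
#!/bin/sh
# Parse the output of SHOW STATUS LIKE 'wsrep_local_state_comment' and
# report healthy only when the node is fully Synced.
is_synced() {
    # expects a line like: "wsrep_local_state_comment<TAB>Synced"
    echo "$1" | awk '{print $2}' | grep -qx "Synced"
}
# status="$(mysql -N -e "SHOW STATUS LIKE 'wsrep_local_state_comment'")"
# is_synced "$status" || exit 1
```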
I used this image unmodified, just injecting a script wrapping the default entrypoint from the kubernetes manifest. It is kubernetes-specific, so an appropriate place to host it would be helm/charts.
I had one issue with the docker-entrypoint.sh though: I prefixed it with a bit that derives the appropriate `--wsrep` options, but passing them to the `docker-entrypoint.sh` as-is does not work, because they must not be passed to `mysql_install_db`. I hacked around it, but it would be better to have a way to either distinguish the options in the default script, or to tell it to just initialise the database without the final exec, so the wrapper script would do that.
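One way to "distinguish the options" could look like this (a sketch only; `split_wsrep_args` is not an existing helper in the entrypoint):

```shell
#!/bin/sh
# Hypothetical sketch: withhold all --wsrep-* options from the initial
# mysql_install_db run and re-attach them only for the final mysqld exec.
split_wsrep_args() {
    INIT_ARGS=""
    WSREP_ARGS=""
    for arg in "$@"; do
        case "$arg" in
            --wsrep*) WSREP_ARGS="$WSREP_ARGS $arg" ;;
            *)        INIT_ARGS="$INIT_ARGS $arg" ;;
        esac
    done
}
# split_wsrep_args "$@"
# mysql_install_db $INIT_ARGS           # initialization: no Galera options
# exec mysqld $INIT_ARGS $WSREP_ARGS    # final server: everything
```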
Most discussion here seems galera related.
MDEV-25667 was recently raised about getting better support for Galera in the server itself, which will hopefully provide the necessary functionality with minimal entrypoint changes.
galera-arbitrator should be its own image rather than bloating this one.
I have some little success in this area. My experiences:

* It is possible to have a fully automatic bootstrap for a new Galera cluster in a multi-container scenario without any manual action or interaction. In my case, I'm checking for the presence of `/var/lib/mysql/gvwstate.dat`. To automatically determine the node used for bootstrapping I check the container host name against the provided list of Galera cluster nodes (hostnames).
* It is also possible to have a fully automatic node join during Galera cluster creation. This requires two steps. The first step is the bootstrapping phase; the second step is the node joining order, which can be achieved by looping through the provided Galera cluster nodes (hostnames), checking the Galera status for each node and waiting for the synced state of each previous node in order (like chaining, `node1 -> node2 -> ... -> nodeN`) before running `mysqld` (joining the cluster).
* The default `docker-entrypoint.sh` script used in the official MariaDB image is not perfect/not suitable for cluster replication. The main issues came from... `mysqld` itself. For example, if someone provides a custom `galera.cnf` MariaDB configuration file, Galera options will be used during invocation of the `mysql_install_db` command (it creates a temporary `mysqld` instance) and during the pre-start MariaDB configuration/setup (it also creates a temporary `mysqld` instance) called by the default `docker-entrypoint.sh` script. It can result in strange behaviors. To play safe, I have created my own version of the `docker-entrypoint.sh` script.

Working results can be found here: https://gitlab.com/tymonx/docker-mariadb. I have a simple Docker Compose example with automatic bootstrapping and automatic cluster node join. This should work out-of-the-box without requiring a third party (especially an additional cluster controller/manager). There are some TODOs that can be done, like:

* Automatically detecting the Galera cluster `Initialized` state for all nodes, determining the most advanced node and executing an automatic bootstrap.
* Automatically detecting the `Initialized` state for node(s) when quorum is still valid, and kicking the node back into the cluster.
* My custom `mysqld-entrypoint` script should have the same features/capabilities as the default `docker-entrypoint.sh` script, or be merged into it.
* Not fully tested, but results look promising.

> I have some little success in this area. My experiences:
> * It is possible to have a full automatic bootstrap for new Galera cluster in multi containers scenario without any manual action or interaction. In my case, I'm checking presents of the `/var/lib/mysql/gvwstate.dat`. To automatically determine the node used for bootstrapping I check container host name with provided list of Galera cluster nodes (hostnames).
node determination probably should use https://galeracluster.com/library/documentation/system-tables.html (galera-4 (mariadb-10.4+) only). I think at this late stage of adding support, starting with 10.4 to keep the scripting simpler is a good move.
> * It is also possible to have a full automatic node join during Galera cluster creation. This require two steps. First step is bootstrapping phase, second step is nodes joining order that can be achieved by looping through provided Galera cluster nodes (hostnames), checking Galera status for each node and waiting for synced state for each previous nodes in order (like chaining, `node1 -> node2 -> ... -> nodeN`) before running `mysqld` (joining to cluster).
This sounds like too much work, a JOINING node will just need to start and then the SST will happen once a donor is available. Does this not happen? There's probably some timeout value needed here.
> * The default `docker-entrypoint.sh` script used in the official MariaDB image is not perfect/not suitable in cluster replication. Main issues came from the... `mysqld` itself. For example, if someone will provide a custom `galera.cnf` MariaDB configuration file, Galera options will be used during invocation of the `mysql_install_db` command (it creates a temporary `mysqld` instance)
Not quite: mysql_install_db runs mysqld called with --bootstrap which disables wsrep (ref)
> and pre MariaDB configuration/setup (it also creates a temporary `mysqld` instance) called by the default `docker-entrypoint.sh` script.
very recently fixed with #358 (though no release yet)
> It can result to strange behaviors. To play safe, I have created my own version of the `docker-entrypoint.sh` script.
Fair call, keeps it simpler.
> Working results can be found here: https://gitlab.com/tymonx/docker-mariadb. I have a simple Docker Compose example with automatic bootstrapping and automatic cluster nodes join. This should work out-of-box without requiring a third party (especially an additional cluster controller/manager). There are some TODOs that can be done like:
> * Automatically detecting Galera cluster `Initialized` state for all nodes, determine the most advanced node
This is the tricky bit right, which by latest wsrep_recover sequence number (alternate gtid, innodb lsn)? Is this in the system tables? If two nodes both have the latest sequence number is the entrypoint setting the pc.bootstrap=true sufficient? Or do they need to co-ordinate?
> and execute automatic bootstrap.
Take a look through https://galeracluster.com/library/documentation/crash-recovery.html and https://galeracluster.com/library/documentation/system-tables.html
Shouldn't this be something that uses wsrep_recover / wsrep_start_position same as the galera_new_cluster on the most advanced node?
> * Automatically detecting `Initialized` state for node(s) when quorum is still valid and kick node again to cluster.
I'm also thinking this should be there already.
> * My custom `mysqld-entrypoint` script should have the same features/capabilities like the default `docker-entrypoint.sh` script or be merged into it.
At the moment I'm seeing a need for an env flag GALERA_AUTO_BOOTSTRAP, and maybe GALERA_AUTO_RECOVER?
> * Not fully tested but results look promising.
Note the case: https://galeracluster.com/library/documentation/crash-recovery.html?highlight=power%20failure
Documentation:
https://github.com/docker-library/docs/tree/master/mariadb
Test cases:
I've got some work-in-progress test cases, but please consider what would be needed to test these:
https://github.com/grooverdan/official-images/commits/mariadb-tests
Hi @grooverdan. Thanks for sharing, it will help me. I recently started my adventure with Galera cluster replication.
> node determination probably should use https://galeracluster.com/library/documentation/system-tables.html (galera-4 (mariadb-10.4+) only). I think at this late stage of adding support, starting with 10.4 to keep the scripting simpler is a good move.
This looks nice and very helpful. It seems that I must start a temporary `mysqld` instance (with `--no-defaults --skip-networking --wsrep-on=off`), query these tables and decide whether the node was already initialized and joined, to skip adding the `--wsrep-new-cluster` flag and to skip the joining-order procedure the first time.
> This sounds like too much work, a JOINING node will just need to start and then the SST will happen once a donor is available. Does this not happen? There's probably some timeout value needed here.
The first time I tried to run N containers with MariaDB replication (after adding the automatic one-shot bootstrapping feature for one of the nodes) without any logic and control, I noticed strange behaviors over many tries (dozens). At first I thought Galera could handle it itself (auto-join of N asynchronously and parallel-started containers at first start). But no. Sometimes it works; sometimes a node was always restarting, or stuck forever in the `Initialized` state, or all nodes went to the `Initialized` state and the cluster was not operational at all. Every attempt was from a fresh start using Docker Compose and the currently newest MariaDB version (`10.6`). Some articles I found pointed out that the initialization order procedure is very important: first start bootstrapping on the first node; after success, go to the next node to start it (first join). Only when that finishes successfully, go to the next node to start the next join, and repeat for the remaining nodes. It sounded to me like the order matters. After implementing this logic in my custom entrypoint script (the `cluster_node_join` function keeps the sequence of first joins), I got 100% success on every attempt. In my experience it seems that Galera could not handle it properly (asynchronous and parallel joining of many nodes at first time).
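The ordering idea described above could be sketched like this (assumptions: `node_state` stands in for a real status query such as `mysql -h <peer> -N -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"`, and the function name is illustrative, not the actual `cluster_node_join`):

```shell
#!/bin/sh
# Wait until every node listed *before* this one reports the Synced state,
# so joins happen strictly in order: node1 -> node2 -> ... -> nodeN.
wait_for_predecessors() {
    me="$1"; shift
    for peer in "$@"; do
        [ "$peer" = "$me" ] && break    # stop at our own position in the list
        until [ "$(node_state "$peer")" = "Synced" ]; do
            sleep 2                     # poll until the predecessor is synced
        done
    done
}
# node_state() { ...query the peer over mysql...; }
# wait_for_predecessors "$(hostname)" node1 node2 node3
```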
> Not quite: mysql_install_db runs mysqld called with --bootstrap which disables wsrep (ref)
Yes, you are right. My bad :( The logs were misleading me. They came only from the temporary `mysqld` instance created inside the default `docker-entrypoint.sh`.
> very recently fixed with #358 (though no release yet)
Good information :)
> This is the tricky bit right, which by latest wsrep_recover sequence number (alternate gtid, innodb lsn)? Is this in the system tables? If two nodes both have the latest sequence number is the entrypoint setting the pc.bootstrap=true sufficient? Or do they need to co-ordinate?
Thanks for pointing this out. Still, I would try to implement it... somehow :) As an optional feature that can be enabled or disabled. I like out-of-the-box solutions: fire and forget. I was thinking about creating a small semi-intelligent service in Go and running it in parallel with the `mysqld` daemon.
Your information helps me a lot. Thanks! :)
Does wsrep_notify_cmd provide the interface needed to autonomously recover from the states?
This would make it user opt-in - the user would be setting the mariadb config option wsrep_notify_cmd (and other galera bits), which they'd be doing anyway in their configuration. Given automation may already exist on the mariadb container, I'd rather not surprise people.
It will definitely help.
> This would make it user opt-in - the user would be setting the mariadb config option wsrep_notify_cmd (and other galera bits), which they'd be doing anyway in their configuration.
This can be solved by detecting this and creating a script wrapper that executes both (or more) scripts: the current one (always forked) and the custom one provided by the user. It is similar to the observer pattern from software, where you register N custom callbacks and fire them in a loop for every event/notification. The same idea shows up in the classic signal (C) or trap (POSIX scripts) mechanisms in applications and scripts, and using the observer pattern helps resolve this case too.
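A rough sketch of such a fan-out wrapper (assuming a colon-separated `NOTIFY_SCRIPTS` variable; this is not an existing wsrep feature, just the observer idea in shell):

```shell
#!/bin/sh
# Invoke every registered notify script with the same event arguments,
# in parallel, then reap them all - the observer pattern in shell form.
notify_all() {
    for script in $(printf '%s' "$NOTIFY_SCRIPTS" | tr ':' ' '); do
        "$script" "$@" &     # fire one observer
    done
    wait                     # reap all observers before returning
}
# The wrapper itself would then be configured as the single
# wsrep_notify_cmd entry, e.g. wsrep_notify_cmd=/usr/local/bin/notify-fanout.sh
```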
After some thinking, I have a nice proposition for a `wsrep_notify_cmd` improvement. This parameter should accept a list of scripts separated with a comma `,` :) Or multiple invocations of the `--wsrep-notify-command` command line parameter. As I understand it, the current implementation of the `wsrep_notify_cmd` parameter handles only a single command/script.
yes, single command (implementation of execution)
I think you're over-engineering it (MDEV-25742). If a user wants their own notification script as well as what you develop, they can wrap it in a script themselves. If really needed, it's a rather simple shell script to fan out and reap; let's not burden the server with the added complexity.
If you think this feature will not help or make something easier, I can close it. At the beginning the idea sounded good :)
I have prepared a simple test. When using the official `10.6` image, initial auto-joining doesn't work, as I mentioned before. But after replacing the default `docker-entrypoint.sh` from the official `mariadb:10.6` image with the newest `docker-entrypoint.sh` from the `master` branch, it started working. For now I have run both (original and updated) several times. I suspect that fix #358 helped: the temporarily created `mysqld` instances read and tried to apply the Galera configuration settings and messed up the joining sequence. I think I can now drop my silly idea of keeping the initial sequence order of nodes when auto-creating a cluster :)
My tests can be found here: https://gitlab.com/tymonx/docker-mariadb/-/tree/dev-mariadb-docker-initial-auto-join-failure
If you want to look into the CI logs:

* `docker-test 1/2` runs the official `mariadb:10.6` image with the original `docker-entrypoint.sh` script, and some nodes fail randomly: https://gitlab.com/tymonx/docker-mariadb/-/jobs/1279739700#L1015 (line 1015)
* `docker-test 2/2` runs the official `mariadb:10.6` image with the updated `docker-entrypoint.sh` script from the `master` branch: https://gitlab.com/tymonx/docker-mariadb/-/jobs/1279739702

I think if you finish implementing this feature (MDEV-25667) and add support for the `GALERA_AUTO_BOOTSTRAP` and maybe optionally the `GALERA_AUTO_RECOVER` environment variables, it should be quite enough :)
Good news! After updating the `docker-entrypoint.sh` script with the newest one, everything works beautifully :)
I have added only the auto bootstrap feature.
My working replication support is at https://gitlab.com/tymonx/docker-mariadb:

* Based on the official `10.6` image with the updated `docker-entrypoint.sh` script
* All changes are in the `docker-entrypoint.sh` script. @grooverdan I can prepare a proper Pull Request for that
* Detects whether the `wsrep_on` parameter was enabled, based on `mysqld` configuration files or provided command line arguments
* Reads the `wsrep_cluster_address` parameter from `mysqld` configuration files or provided command line arguments, or from the new environment variable `CLUSTER_ADDRESS`, to automatically determine the node used for cluster bootstrapping from the `wsrep_cluster_address` parameter. This can be disabled with the `CLUSTER_AUTO_BOOTSTRAP=<0|NO|OFF|FALSE|DISABLE>` environment variable. It is also possible to select another node with the `CLUSTER_BOOTSTRAP_ADDRESS=<hostname|ip>` environment variable.

quick note: 6f5d272ca053105901c23fcdc44884907aa4d11d is the cause of the timezone initialization failures in your CI. A 10.6.1 release is imminent. You can work around it with `MARIADB_INITDB_SKIP_TZINFO=1`. Looking at other comments now.
Chat available on https://mariadb.zulipchat.com
Thanks! I have recently tested with Docker Compose and Docker Swarm, using the current mysqld-entrypoint implementation.
Docker Compose scaling:

```shell
CLUSTER_ADDRESS="gcomm://db_node_1,db_node_2,db_node_3,db_node_4,db_node_5"
docker-compose --project-name db up --scale node="$(echo "${CLUSTER_ADDRESS}" | tr ',' ' ' | wc -w)"
```

Scaling using an external `mysql` configuration file (first it needs to be added to the `volumes:` section):

```shell
docker-compose --project-name db up --scale node="$(grep -i wsrep_cluster_address <name>.cnf | tr -d ' ' | tr ',' ' ' | wc -w)"
```
ok. Looking forward to a PR.
Notes based on the current entrypoint:

* Use `WSREP_{AUTO_BOOTSTRAP,BOOTSTRAP_ADDRESS,*}` if needed, rather than `GALERA`/`CLUSTER`.
* `[ -n "$MARIADB_PASSWORD" ]` is used for "enabled", so let's keep it simple there rather than a proliferation of accepted options (`"0\|no\|off\|false\|disable"`) that only apply to these new variables.
* I haven't seen much `${@:+$@}` use before. My bash knowledge isn't perfect, however `"$@"` is sufficient I suspect.
* `is_new_cluster`'s simplicity makes it look like it's not needed.
* I'm still worried about auto-enable messing with an existing user that has something working. Maybe a `WSREP_AUTOBOOTSTRAP={nodename}` to identify the node to `wsrep-new-cluster` on.

Sure :) I will fork this project and prepare a proper PR including your suggestions.
> I haven't seen much ${@:+$@} use before. My bash knowledge isn't perfect however "$@" is sufficient I suspect.
It is related to the SC2068 warning. The `${@:+$@}` construct only silences the ShellCheck linter warning (or error) for `$@`, rather than putting `# shellcheck disable=SC2068` everywhere. You are right, in this case the `"$@"` construct is sufficient.
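For reference, the difference the quoting makes can be demonstrated directly (a toy example, not taken from the entrypoint):

```shell
#!/bin/sh
# "$@" preserves argument boundaries; bare $@ (what SC2068 warns about)
# re-splits any argument that contains whitespace.
count_args() { echo "$#"; }

pass_unquoted() { count_args $@; }     # ShellCheck would flag SC2068 here
pass_quoted()   { count_args "$@"; }   # safe: boundaries preserved
```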
@grooverdan done. Ready for review :)
> I'm still worried about auto-enable messing with an existing user that has something working. Maybe a WSREP_AUTOBOOTSTRAP={nodename} to identify the node to wsrep-new-cluster on.
I have added two useful environment variables: `WSREP_SKIP_AUTO_BOOTSTRAP` and `WSREP_AUTO_BOOTSTRAP_ADDRESS=<ip|hostname>`.

I can reverse the logic and disable the auto bootstrap feature by default. Then I can replace the `WSREP_SKIP_AUTO_BOOTSTRAP` environment variable with `WSREP_AUTO_BOOTSTRAP`.

If a user provides a Galera configuration and sets the `wsrep_cluster_address` parameter (or uses `--wsrep-cluster-address` or `WSREP_CLUSTER_ADDRESS`), they must also explicitly set `WSREP_AUTO_BOOTSTRAP=1`.
The existing image supports Galera. The Galera library is included in the image. I have made it work for my NextCloud. I even have maxscale running in front of these cluster nodes.
I am using k8s (on RPI4s :-) ) so I will just highlight the changes needed, and I believe you can easily replicate them for docker and so on.
You need to run the container with the following command. This is the syntax for YAML; it should be very similar for docker compose. Use the following to start the master node. Launch your slave nodes using the standard approach as you always do. It is best to wait for your master node to be ready before launching the slave nodes.
```yaml
command: ["mariadbd"]
args: ["--user=mysql", "--wsrep-new-cluster"]
```
The following is the config required in `mariadb.cnf`, on top of whatever you have now. This should be the same for all your MariaDB instances.
```ini
[galera]
# Mandatory settings
wsrep_on=ON
# this is the correct path for the library. I am using image 10.7.3
wsrep_provider=/usr/lib/galera/libgalera_smm.so
# add your node ips here
# make sure these are IPs or resolvable DNS/domain names
wsrep_cluster_address="gcomm://mariadb1,mariadb2,mariadb3"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
# Cluster name
wsrep_cluster_name="nc-mariadb_cluster"
# Allow server to accept connections on all interfaces.
bind-address=0.0.0.0
# this server ip, change for each server
# this server name, change for each server
wsrep_node_name="mariadb1"
wsrep_sst_method=rsync
innodb_doublewrite=1
```
@chengkuangan But then you have to edit the config every time you want to add a new node, no? How did you make server1 and server3 slaves? Is this just a multi-master galera with a maxscale readwrite split to create a master-slave setup?
If you stop the cluster, how do you restart it again? For me it will always complain, since the cluster has already been initialized, and `--wsrep-new-cluster` is not really relevant here, right?
Thanks for taking the time to share this with us :)
@skjnldsv Galera configures a master-master cluster. The master-slave you see on the maxscale is because of the readwrite split (which is required by NextCloud in my case).
From what I read, if we stop the cluster completely, it is considered terminated and we will need to bootstrap the cluster again, which is mentioned in the doc. It will be a manual step.
Thanks for pointing out removing the `--wsrep-new-cluster`. It is only needed for the first node bootstrap.
To be frank, I am not an expert here... This is my first time setting up a MariaDB cluster and maxscale, and it is an RPI4 playground... so no guarantee I am 100% correct. I am just sharing what I have learned, and it is subject to mistakes. :-)
I forgot to mention: I had to initialize all the PODs without galera the first time. When a pod is ready, I change it to enable galera and restart it.
The reason is that if galera is enabled the first time, the instance will fail with errors complaining that the mysql tables and other files are not available. I guess this may be because the `docker-entrypoint.sh` is not designed for galera on first boot, and the database server is not yet initialized.
Oh, another thing. Maybe I have not looked hard enough... so far I can't find the `grastate.dat` file mentioned in the doc, in order to change `safe_to_bootstrap=1`. So I am doomed if my cluster crashes or completely shuts down.
Any idea?
There's some notes started on grastate.dat and documentation references on MDEV-25855. Insights and corrections welcome.
@grooverdan To verify ... `grastate.dat` will not be created if the cluster is gracefully shut down, am I correct?
> @grooverdan To verify ... `grastate.dat` will not be created if the cluster is gracefully shut down, am I correct?
You can generate it with `--wsrep-recover`, then edit the `grastate.dat` and start again without `--wsrep-recover`. I'm still learning the mechanics of the mechanisms available.
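The edit step can be scripted; a hedged sketch (GNU `sed`; `mark_safe_to_bootstrap` is an illustrative helper, and the file normally lives at `/var/lib/mysql/grastate.dat`):

```shell
#!/bin/sh
# After `mysqld --wsrep-recover` has regenerated grastate.dat on the most
# advanced node, flip its safe_to_bootstrap flag so that node may bootstrap.
mark_safe_to_bootstrap() {
    sed -i 's/^safe_to_bootstrap: *0/safe_to_bootstrap: 1/' "$1"
}
# mark_safe_to_bootstrap /var/lib/mysql/grastate.dat
# ...then start the node again without --wsrep-recover
```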
To use this docker image in production, it would be very nice to be able to run it with replication, to maintain some redundancy.
The bitnami/mariadb image already does this, but they don't have the 10.x versions.