leginon-org / leginon-redmine-archive

1 stars 0 forks source link

Create a MariaDB Galera multi-master Cluster using Docker #5951

Open leginonbot opened 5 months ago

leginonbot commented 5 months ago

Author Name: Neil Voss (@vosslab) Original Redmine Issue: 5951, https://emg.nysbc.org/redmine/issues/5951 Original Date: 2018-08-16 Original Assignee: Neil Voss


MariaDB Galera Cluster is a synchronous multi-master cluster for MariaDB. It is available on Linux only, and only supports the XtraDB/InnoDB storage engines (although there is experimental support for MyISAM - see the wsrep_replicate_myisam system variable).

https://mariadb.com/kb/en/library/what-is-mariadb-galera-cluster/

leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-16T17:23:21Z


trying this tutorial to start:

http://galeracluster.com/documentation-webpages/docker.html

leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-16T18:31:53Z


For the Galera setup, it is recommended to use at least version 10 of MariaDB, which is not native to CentOS 7 (or 6), so it is not the best choice if we stick to CentOS, unless we wait for CentOS 8 to come out, anticipated in Summer 2019. I probably going to test with Debian 9 Stretch for the time being. As we will probably not implement this by the next year anyway.

Starting with MariaDB 10.1, the wsrep API for Galera Cluster is included by default. This is available as a separate download for MariaDB 10.0 and MariaDB 5.5.

Pretty much every guide to setpu Galera on CentOS 7, requires deleting the native MariaDB and installing a different repo. Always concerning with broken updates, etc.

leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-16T20:30:26Z


Some interesting notes:

Gaps in auto-increment values:

Do not rely on auto-increment values to be sequential. Galera uses a mechanism based on autoincrement increment to produce unique non-conflicting sequences, so on every single node the sequence will have gaps. See http://codership.blogspot.com/2009/02/managing-auto-increments-with-multi.html

Load balancing: http://galeracluster.com/documentation-webpages/loadbalancing.html

Commit delays:

Synchronous replication works fine over the WAN network. There will be a delay, which is proportional to the network round trip time (RTT), but it only affects the commit operation.

Not sure what this means: https://mariadb.com/kb/en/library/tips-on-converting-to-galera/

There is one latency hit per COMMIT or autocommit. So, combining statements will decrease those hits. On the other hand, it is unwise (for other reasons) to make huge transactions, such as inserting/modifying millions of rows in a single transaction.

Hmm, this is interesting. If I make the same query twice, does it slow down?

Query cache cannot be used in a Galera context.

Query cache is available on newer versions: https://mariadb.com/kb/en/library/query-cache/

The query cache also needs to be disabled for MariaDB Galera cluster versions prior to "5.5.40-galera", "10.0.14-galera" and "10.1.2".

Prior to MariaDB Galera Cluster versions 5.5.40-galera and 10.0.14-galera, the query cache needed to be disabled.

Adding Another Node to a Cluster: https://mariadb.com/kb/en/library/getting-started-with-mariadb-galera-cluster/

Once you have a cluster running and you want to add/reconnect another node to it, you must supply an address of one of the cluster members in the cluster address URL. The new node only needs to connect to one of the existing members. It will automatically retrieve the cluster map and reconnect to the rest of the nodes, of course it's better to list all nodes of the cluster so that any node can join a cluster connecting to any other node, even if one or more are down.

Once all members agree on the membership, state exchange will be initiated during which the new node will be informed of the cluster state. If its state is different from that of the cluster (which is normally the case) it will request a snapshot of the state from the cluster[1]) and install it before becoming ready for use.

Performance: https://mariadb.com/kb/en/library/mariadb-galera-cluster-known-limitations/

by design performance of the cluster cannot be higher than performance of the slowest node; however, even if you have only one node, its performance can be considerably lower comparing to running the same server in a standalone mode (without wsrep provider). It is particularly true for big enough transactions.

leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-16T20:37:55Z


Run a test case, modified from: http://galeracluster.com/2015/05/getting-started-galera-with-docker-part-1/

docker pull erkules/galera

docker run --detach=true --name node1 -h node1 erkules/galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://

docker run --detach=true --name node2 -h node2 --link node1:node1 erkules/galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1

docker run --detach=true --name node3 -h node3 --link node1:node1 erkules/galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1

docker exec -ti node1 mysql -e "SHOW STATUS LIKE 'wsrep_cluster_%'"

I managed to log into node1:

docker exec -t -i node1 bash

apt update
apt install wget

wget -c \
'http://emg.nysbc.org/redmine/projects/appion/repository/revisions/trunk/raw/appion/docker/centos7/docker-innodb.sql'

mysql -u root < docker-innodb.sql

exit;

log into node2 and check databases:

docker exec -ti node2 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'

docker exec -ti node3 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'
leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-17T15:15:52Z


Run a test case, modified from: http://galeracluster.com/2015/05/getting-started-galera-with-docker-part-1/

docker build -t galera -f galera.Dockerfile .

docker network create my-net

docker run --detach=true --name node1 -h node1 --network my-net galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://

for i in {2..5}; do \
docker run --detach=true --name node${i} -h node${i} --network my-net galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1 ; \
sleep 1; done

#wait a few seconds for boot up
docker exec -ti node1 mysql -e "SHOW STATUS LIKE 'wsrep_cluster_%'"
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_size       | 5                                    |
+--------------------------+--------------------------------------+

I managed to log into node1:

docker exec -t -i node1 bash

time mysql -u root < /docker-innodb.sql

exit;

log into node2 and check databases:

docker exec -ti node2 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'

docker exec -ti node3 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'
leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-17T18:45:10Z


database creation/insertion time:

see full timing in attached image. Database schema creation takes the most time.

leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2018-08-17T20:32:45Z


I am confident, I could get this working. We do need to implement the InnoDB style of table and make sure that FOREIGN KEY for REF|* columns are setup correctly.

leginonbot commented 5 months ago

Original Redmine Comment Author Name: Neil Voss (@vosslab) Original Date: 2019-08-15T20:07:30Z


Note: This distributed database, requires at least MariaDB 5.5, which is available on CentOS 7. You cannot set this up easily on CentOS 6.