colinmollenhour / mariadb-galera-swarm

MariaDb Galera Cluster container based on official mariadb image which can auto-bootstrap and recover cluster state.
https://hub.docker.com/r/colinmollenhour/mariadb-galera-swarm
Apache License 2.0
217 stars, 102 forks

Very weird issue execute set_wsrep_myisam using @toggle; #25

Closed ghost closed 6 years ago

ghost commented 6 years ago

I followed the instructions completely. I added an external network. But when I startup I get the following for the seed:

2017-09-12 11:36:38 139693735663552 [Note] mysqld (mysqld 10.1.26-MariaDB-1~jessie) starting as process 26 ...
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: The InnoDB memory heap is disabled
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Compressed tables use zlib 1.2.8
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Using Linux native AIO
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Using SSE crc32 instructions
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Initializing buffer pool, size = 256.0M
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Completed initialization of buffer pool
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: 128 rollback segment(s) are active.
2017-09-12 11:36:39 139693735663552 [Note] InnoDB: Waiting for purge to start
2017-09-12 11:36:39 139693735663552 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.36-82.1 started; log sequence number 1616819
2017-09-12 11:36:39 139692913719040 [Note] InnoDB: Dumping buffer pool(s) not yet started
2017-09-12 11:36:39 139693735663552 [Note] Plugin 'FEEDBACK' is disabled.
2017-09-12 11:36:39 139693735663552 [Note] Server socket created on IP: '::'.
2017-09-12 11:36:39 139693735663552 [Warning] 'user' entry 'root@3bc3a4b14a30' ignored in --skip-name-resolve mode.
2017-09-12 11:36:39 139693735663552 [Warning] 'proxies_priv' entry '@% root@3bc3a4b14a30' ignored in --skip-name-resolve mode.
2017-09-12 11:36:39 139693735663552 [Note] WSREP: Read nil XID from storage engines, skipping position init
2017-09-12 11:36:39 139693735663552 [Note] WSREP: wsrep_load(): loading provider library 'none'
2017-09-12 11:36:39 139693734796032 [Warning] 'user' entry 'root@3bc3a4b14a30' ignored in --skip-name-resolve mode.
2017-09-12 11:36:39 139693734796032 [Warning] 'proxies_priv' entry '@% root@3bc3a4b14a30' ignored in --skip-name-resolve mode.
ERROR: 1064  You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'execute set_wsrep_myisam using @toggle;' at line 1
2017-09-12 11:36:39 139693735663552 [Note] mysqld: ready for connections.
Version: '10.1.26-MariaDB-1~jessie'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
colinmollenhour commented 6 years ago

Googling this brings up that it comes from mysql_tzinfo_to_sql: https://jira.mariadb.org/browse/MDEV-11653

This is used in the "bootstrap" step, but I'm not sure why this is suddenly an error. Perhaps the init should happen before the cluster starts, but I'm hesitant to make any major changes. Commenting out the line with mysql_tzinfo_to_sql in start.sh should bypass it for you, but then I think you won't have the timezone info.

ghost commented 6 years ago

I also found some info in this post: https://withblue.ink/2016/03/09/galera-cluster-mariadb-coreos-and-docker-part-1.html

They use it too, and they say they do it because MyISAM isn't supported. So I was wondering whether it is true that MyISAM isn't supported.

ghost commented 6 years ago

On hub.docker.com the image colinmollenhour/mariadb-galera-swarm was last pushed 4 days ago, but the latest GitHub commit was on May 30. Is there a difference between the versions? If I use the hub.docker.com image it works, but if I clone the GitHub repo and build it myself the seed never becomes healthy.

colinmollenhour commented 6 years ago

Yes, MyISAM is definitely not replicated so perhaps the tz tables loading should be moved to the do_install_db function in mysqld.sh...

Regarding the hub.docker.com versions, I think the images can be rebuilt automatically when an upstream image is rebuilt, for example if debian:jessie is updated. This is good because it allows you to easily get security patches, but in this case it seems like there is a regression. My "latest" tag is based on mariadb:10.1, which was recently updated to 10.1.26. I also have specific versions tagged that don't get auto-updated, but as you can see I don't tag these often (e.g. the most recent one is 10.1.24-2017-06-12).

However, between the hub.docker.com version and building your own there really shouldn't be any difference unless you are building with an old version of mariadb:10.1.

colinmollenhour commented 6 years ago

Started on a fix in branch "fix-tzinfo-error" but have not tested yet. If you get a chance to test it please let me know if it fixes the problem.

colinmollenhour commented 6 years ago

Are you by any chance using the /docker-entrypoint-initdb.d/* mount point to run additional scripts?

ghost commented 6 years ago

The reason it was not starting for me was the permissions of the .cnf files: because I am using Vagrant, all these files end up with world-writable permissions, which mysqld does not accept. So I added RUN chmod -R 755 /etc/mysql/conf.d to the Dockerfile.
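
The permission problem described here can be checked before the container starts. The following is a minimal sketch, assuming the cause is the world-writable bit on the config files; the function names are hypothetical and are not part of this repository. Unlike chmod -R 755, it only clears the world-writable bit and leaves the other permissions alone.

```shell
#!/bin/sh
# Hypothetical helpers, not part of mariadb-galera-swarm.

# Print every regular file under directory $1 that is world-writable.
list_world_writable() {
  find "$1" -type f -perm -002
}

# Clear the world-writable bit on those files, leaving other bits untouched.
fix_world_writable() {
  find "$1" -type f -perm -002 -exec chmod o-w {} +
}

# Example (commented out so that sourcing this file changes nothing):
#   list_world_writable /etc/mysql/conf.d
#   fix_world_writable /etc/mysql/conf.d
```

In a Dockerfile the second helper would correspond to a RUN step over /etc/mysql/conf.d, in the same place as the chmod mentioned above.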

The fix you gave for the tz_info gives me the following error:

+ local pid=145
+ mysqld --skip-networking --skip-grant-tables --socket=/tmp/mysql.sock
+ mysql --protocol=socket -uroot -hlocalhost --socket=/tmp/mysql.sock mysql
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2 "No such file or directory")
+ mysql_tzinfo_to_sql /usr/share/zoneinfo
+ kill -s TERM 145
+ wait 145
/usr/local/bin/mysqld.sh: line 23:   145 Terminated              mysqld --skip-networking --skip-grant-tables --socket=/tmp/mysql.sock
+ echo '===|mysqld.sh|===: Loading tzinfo failed.'
===|mysqld.sh|===: Loading tzinfo failed.
+ exit 1
+ true
+ RC=0
+ echo 'MariaDB exited with return code (0)'
MariaDB exited with return code (0)

When I run mysqld --skip-networking --skip-grant-tables --socket=/tmp/mysql.sock & by hand, without the tzinfo fix, I get the following error:

root@afbd9c163a85:/# mysqld --skip-networking --skip-grant-tables --socket=/tmp/mysql.sock &
[1] 208
root@afbd9c163a85:/# 2017-09-13 12:59:23 139852663388096 [Note] mysqld (mysqld 10.1.26-MariaDB-1~jessie) starting as process 208 ...
mysqld: Please consult the Knowledge Base to find out how to run mysqld as root!
2017-09-13 12:59:23 139852663388096 [ERROR] Aborting

There's another thing that is bugging me. I have a multi-node swarm running. When I use:

version: '3.3'
services:
  ...
  database-seed:
    ...
    volumes:
    - database-data:/var/lib/mysql
    ...
volumes:
  database-data:
    driver: local
  ...

With this it also won't start up. If I remove the volumes from database-seed, it works. How do I attach storage? What is the best approach to using volumes for a multi-node Galera cluster?

ghost commented 6 years ago

In your fix, mysqld --user=root --skip-networking --skip-grant-tables --socket=/tmp/mysql.sock & does not seem to be ready yet when mysql --protocol=socket -uroot -hlocalhost --socket=/tmp/mysql.sock mysql is executed. There should be a wait of some kind in between.

colinmollenhour commented 6 years ago

Ahh, of course. I wanted to use mysql_embedded, but it doesn't seem to be present in the container for some reason. Probably just a 3 second sleep would be sufficient, since it only happens on an empty database.

If you want to test the mysqld command, you need to put gosu mysql before it. The entire mysqld.sh script is run with gosu, so within the script gosu doesn't need to be run again, but if you use docker run or docker exec you will be root and need to use gosu.

E.g.: gosu mysql mysqld --skip-networking --skip-grant-tables --socket=/tmp/mysql.sock &

I've not used Docker Swarm in a while, so I don't know what the issue with volumes is. You do want the node and the seed to use the same volume though. You could try a mount point like '/var/lib/mysql:/var/lib/mysql', but if that works and the local volume doesn't, I don't know why offhand.

I pushed another commit fixing the file permissions and adding a 3 second delay before loading tzinfo.
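
A polling loop is one alternative to a fixed delay. The sketch below is not what the commit does (the commit uses the 3 second delay mentioned above); wait_until is a hypothetical helper that retries a command once per second until it succeeds or a timeout in seconds is reached.

```shell
#!/bin/sh
# Hypothetical helper, not part of mysqld.sh.
# Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if it still fails after the timeout.
wait_until() {
  wu_timeout="$1"
  shift
  wu_elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$wu_elapsed" -ge "$wu_timeout" ]; then
      return 1
    fi
    sleep 1
    wu_elapsed=$((wu_elapsed + 1))
  done
  return 0
}

# Example (commented out; assumes the temporary server from this thread
# is listening on /tmp/mysql.sock):
#   wait_until 30 mysqladmin --socket=/tmp/mysql.sock ping || exit 1
#   mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql --protocol=socket -uroot --socket=/tmp/mysql.sock mysql
```

Polling returns as soon as the server answers, so a fast start costs no extra time and a slow start is still covered up to the timeout.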

ghost commented 6 years ago

If I have more than 2 physical hosts how can I give them access to the same volume?

It's still not completely clear to me how it should all work together. In the swarm example, the seed service is started first with one volume mounted at /var/lib/mysql. After this seed node is healthy, the node service is scaled to 2 replicas. Once these are up, the seed service is shut down.

You state that the node and seed service need to have access to the same volume, but why is that?

Is this example suited for a multi host setup?

colinmollenhour commented 6 years ago

Yes, this is for a multi-node setup. For example during initial bootstrap:

Node1 -> seed
Node2 -> node
Node3 -> node

After bootstrap is complete shut down "seed" and start "node" on Node1:

Node1 -> node
Node2 -> node
Node3 -> node

So "seed" and "node" should only share the volume on Node1, but they should not both be run at the same time on the same node.
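
The sequence above could be scripted roughly as follows. This is only a sketch under stated assumptions: the stack name passed as the first argument (for example "galera") and the resulting service names <stack>_seed and <stack>_node are hypothetical, and the script does not wait for health checks between steps, which you would need to do yourself. Which host each task lands on depends on your placement constraints.

```shell
#!/bin/sh
# Hypothetical wrapper around "docker service scale".
# Set DOCKER=echo for a dry run that only prints the arguments.
DOCKER="${DOCKER:-docker}"

# Step 1: run only the seed (Node1 -> seed).
start_seed() {
  "$DOCKER" service scale "${1}_seed=1"
}

# Step 2: once the seed is healthy, add two nodes (Node2, Node3 -> node).
add_nodes() {
  "$DOCKER" service scale "${1}_node=2"
}

# Step 3: once both nodes have joined, stop the seed and start a third
# node, so that "seed" and "node" never run at the same time on Node1.
retire_seed() {
  "$DOCKER" service scale "${1}_seed=0"
  "$DOCKER" service scale "${1}_node=3"
}
```

Scaling the seed to 0 before scaling the node service to 3 keeps the two services from using the Node1 volume at the same time.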

javashop commented 5 years ago

I had the same problem, and I solved it like this:

  seed:
    image: colinmollenhour/mariadb-galera-swarm
    volumes:
      - mysql-data:/var/lib/mysql
  node:
    image: colinmollenhour/mariadb-galera-swarm
    volumes:
      - mysql-data:/var/lib/mysql
volumes:
  mysql-data:
    name: '{{.Service.Name}}-{{.Task.Slot}}-data'
    driver: local