codership / mysql-wsrep

wsrep API patch for MySQL server

MySQL wsrep 8.0.23-26.6 on Ubuntu 20.04 - Permission denied #391

Open shinguz opened 3 years ago

shinguz commented 3 years ago

During a MySQL Galera 8.0 Cluster training today, we found that 8.0.23-26.6 throws errors on Ubuntu 20.04, and we were not able to run the Galera Cluster at all beyond the bootstrapped first node. The errors are below.

An alternative test on CentOS 7 with the same MySQL wsrep version worked as expected. So we strongly believe that this is really an Ubuntu 20.04 MySQL wsrep DEB issue!

It was easily reproducible. I will keep the test systems for a while in case you want more information.

Three excerpts with different SST methods:

2021-06-23T13:29:06.477960Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.23-26.6) MySQL Wsrep Server - GPL.
2021-06-23T13:29:06.897047Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.23-26.6) starting as process 65818
2021-06-23T13:29:06.942983Z 0 [Warning] [MY-000000] [WSREP] P: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2021-06-23T13:29:07.955008Z 0 [ERROR] [MY-000000] [WSREP] posix_spawnp(wsrep_sst_rsync --role 'joiner' --address '172.31.4.200' --datadir '/var/lib/mysql/' --parent '65818' --mysqld-version '8.0.23-26.6' '' '') failed: 13 (Permission denied)
2021-06-23T13:29:07.955433Z 0 [ERROR] [MY-000000] [WSREP] Failed to execute: wsrep_sst_rsync --role 'joiner' --address '172.31.4.200' --datadir '/var/lib/mysql/' --parent '65818' --mysqld-version '8.0.23-26.6' '' '' : 13 (Permission denied)
2021-06-23T13:29:07.955492Z 1 [ERROR] [MY-000000] [WSREP] Failed to prepare for 'rsync' SST. Unrecoverable.
2021-06-23T13:29:07.955638Z 1 [ERROR] [MY-000000] [WSREP] P: SST request callback failed. This is unrecoverable, restart required.
2021-06-23T13:29:09.649909Z 0 [Note] [MY-010949] [Server] Basedir set to /usr/.
2021-06-23T13:29:09.649924Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.23-26.6) starting as process 65878


2021-06-23T13:35:21.241106Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.23-26.6) MySQL Wsrep Server - GPL.
2021-06-23T13:35:21.643452Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.23-26.6) starting as process 73780
2021-06-23T13:35:21.677464Z 0 [Warning] [MY-000000] [WSREP] P: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2021-06-23T13:35:22.683489Z 0 [ERROR] [MY-000000] [WSREP] posix_spawnp(wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.31.4.200' --datadir '/var/lib/mysql/' --parent '73780' --mysqld-version '8.0.23-26.6' '' '') failed: 13 (Permission denied)
2021-06-23T13:35:22.683810Z 0 [ERROR] [MY-000000] [WSREP] Failed to execute: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.31.4.200' --datadir '/var/lib/mysql/' --parent '73780' --mysqld-version '8.0.23-26.6' '' '' : 13 (Permission denied)
2021-06-23T13:35:22.683866Z 1 [ERROR] [MY-000000] [WSREP] Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
2021-06-23T13:35:22.683969Z 1 [ERROR] [MY-000000] [WSREP] P: SST request callback failed. This is unrecoverable, restart required.
2021-06-23T13:35:24.150918Z 0 [Note] [MY-010949] [Server] Basedir set to /usr/.


2021-06-23T13:51:21.102639Z 7 [Note] [MY-000000] [WSREP] Running: 'wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 '
2021-06-23T13:51:21.103299Z 7 [ERROR] [MY-000000] [WSREP] posix_spawnp(wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 ) failed: 13 (Permission denied)
2021-06-23T13:51:21.103332Z 7 [ERROR] [MY-000000] [WSREP] Try 1/3: 'wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 ' failed: 13 (Permission denied)
2021-06-23T13:51:21.103419Z 0 [Note] [MY-000000] [WSREP] P: IST sender 116 -> 116
2021-06-23T13:51:22.104020Z 7 [ERROR] [MY-000000] [WSREP] posix_spawnp(wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 ) failed: 13 (Permission denied)
2021-06-23T13:51:22.104081Z 7 [ERROR] [MY-000000] [WSREP] Try 2/3: 'wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 ' failed: 13 (Permission denied)
2021-06-23T13:51:23.104694Z 7 [ERROR] [MY-000000] [WSREP] posix_spawnp(wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 ) failed: 13 (Permission denied)
2021-06-23T13:51:23.104746Z 7 [ERROR] [MY-000000] [WSREP] Try 3/3: 'wsrep_sst_mysqldump --address '172.31.15.153:3306' --port '3306' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:116' --local-gtid 'a71a177e-d424-11eb-a575-9f1dcfe4c232:0' --server-id 1 ' failed: 13 (Permission denied)
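
In a long error log, the failing helper script and the error code can be hard to spot. The following is only a sketch of a filter that pulls them out; the function name `failed_sst_spawns` is made up for illustration, and it assumes one log entry per line in the `posix_spawnp(...) failed: N (...)` format seen in the excerpts above.

```shell
#!/bin/sh
# failed_sst_spawns (hypothetical helper): read a mysqld error log on stdin
# and print "<helper> <errno>" once for every distinct failed posix_spawnp().
failed_sst_spawns() {
    # In basic regular expressions, unescaped parentheses match literally;
    # \( ... \) captures the helper name and the numeric error code.
    sed -n 's/.*posix_spawnp(\([^ ]*\) .*) failed: \([0-9]*\) (.*/\1 \2/p' | sort -u
}

# Usage (the log path is an assumption; check your log_error setting):
#   failed_sst_spawns < /var/log/mysql/error.log
```

Error code 13 is EACCES, which is what the log renders as "Permission denied".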

shinguz commented 3 years ago

I forgot to mention: we also tried the newest 5.7 and had the same issue. We also disabled AppArmor and rebooted the machines.

simon-schneider commented 3 years ago

The error doesn't seem to be Ubuntu-specific but rather a general problem with Debian-based systems. For each combination of operating system and software version, I bootstrapped one node and then tried to join a second node to it. Reliably and repeatably, this works on CentOS but fails on Ubuntu and Debian.

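| Operating system            | galera 4 / mysql-wsrep 8.0 | galera 3 / mysql-wsrep 5.7 |
|-----------------------------|----------------------------|----------------------------|
| CentOS 8                    | no problem                 | no problem                 |
| Debian 11 "bullseye"        | failure                    | failure                    |
| Ubuntu 20.04 "Focal Fossil" | failure                    | failure                    |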

Here, failure means that the second node won't join the cluster. mysqld repeatedly starts, only to shut down right away and print the above messages. The node contacts the cluster, negotiates SST and then fails to spawn a process. The restart loop is probably due to systemd.

As of now, I suspect that the error is independent of:

Attached as demonstration.zip are a Vagrantfile and scripts to spin up the virtual machines I used: make your choice of operating system and software version, and Vagrant creates a cluster of several identical hosts for you. Every node comes with ports opened in the firewall, SELinux/AppArmor disabled, and galera and mysql-wsrep installed and configured but not running.


Also, I looked for messages containing "posix_spawnp(...) failed", "Failed to execute: ...", and "Failed to prepare ... for SST" in the source code. The following table is a complete list.

| message           | file:line              | function               |
|-------------------|------------------------|------------------------|
| posix_spawnp      | sql/wsrep_utils.cc:307 | wsp::process::process  |
| failed to execute | sql/wsrep_sst.cc:421   | sst_joiner_thread      |
| failed to execute | sql/wsrep_sst.cc:1002  | sst_donor_thread       |
| failed to prepare | sql/wsrep_sst.cc:711   | wsrep_sst_prepare      |

temeo commented 3 years ago

Hi @simon-schneider !

Thanks for the Vagrant scripts, they were very helpful in reproducing the issue.

It appears that this is an already known issue, reported here: https://github.com/codership/mysql-wsrep/issues/367. With the following change in provisioning, AppArmor is properly disabled for mysqld:

diff --git a/Vagrantfile b/Vagrantfile
index 9e60024..73da0d8 100644
--- a/Vagrantfile
+++ b/Vagrantfile
@@ -38,9 +38,11 @@ debian_script = <<~EOF

   # Disable apparmor to make sure that it doesn't interfere with the execution
   # of mysql-wsrep.
-  systemctl stop apparmor
-  systemctl disable apparmor
-
+  if ! [ -f /etc/apparmor.d/disable/usr.sbin.mysqld ]
+  then
+    ln -s /etc/apparmor.d/usr.sbin.mysqld /etc/apparmor.d/disable/usr.sbin.mysqld
+    apparmor_parser -R /etc/apparmor.d/usr.sbin.mysqld
+  fi
   # Open ports for mysql, galera, rsync
   for tcp_port in 3306 4567 4568 4444; do
     iptables --append INPUT --protocol tcp --match tcp --dport $tcp_port --source 10.0.0.0/24 --jump ACCEPT
simon-schneider commented 3 years ago

Thank you for the advice. With the apparmor profile disabled, the error no longer appears. I needed to move your inserted lines after calling apt-get install, but that only makes sense.
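
The ordering matters presumably because the profile file is only present once the package has been installed. A minimal sketch of that step, using the profile path from the diff above; the function name `disable_mysqld_profile` and its optional directory argument are illustrative assumptions, not part of the actual Vagrantfile:

```shell
#!/bin/sh
# disable_mysqld_profile (hypothetical helper): disable the mysqld AppArmor
# profile. Intended to run after the mysql-wsrep package has been installed.
# The optional argument (an assumption, mainly useful for dry runs) overrides
# the AppArmor configuration directory.
disable_mysqld_profile() {
    dir="${1:-/etc/apparmor.d}"
    profile="$dir/usr.sbin.mysqld"
    link="$dir/disable/usr.sbin.mysqld"

    if [ ! -f "$profile" ]; then
        echo "profile $profile not found; is the package installed?" >&2
        return 1
    fi
    mkdir -p "$dir/disable"
    # Mark the profile as disabled so it is not loaded again later.
    [ -e "$link" ] || [ -L "$link" ] || ln -s "$profile" "$link"
    # Unload the profile from the running kernel, if the tool is available.
    if command -v apparmor_parser >/dev/null 2>&1; then
        apparmor_parser -R "$profile" || echo "could not unload $profile" >&2
    fi
}

# Usage in a provisioning script, as root, after the apt-get install step:
#   disable_mysqld_profile
```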

One exception: galera-3/mysql-wsrep-5.7 on Debian requires a manual installation of lsof on the joiner node. There is bug #57 addressing the same issue for mysql-wsrep-5.6, but it seems to have reappeared in 5.7.

It's a bit confusing that systemctl stop/disable apparmor doesn't have the desired effect. Instead (and as mentioned in the Debian wiki), you have to supply parameters on the kernel command line. We had, in fact, suspected AppArmor, but did not anticipate systemctl being the wrong tool to take it out of the equation.
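
One way to check whether a running process is still confined is to read its security label from procfs. This is only a sketch, assuming AppArmor is the active security module so that `/proc/<pid>/attr/current` holds either a profile name with its mode or `unconfined`; the function name `confinement_of` and its second argument are made up for illustration.

```shell
#!/bin/sh
# confinement_of (hypothetical helper): print the security label of a process.
# $1 is the PID; $2 (an assumption, handy for dry runs) overrides the procfs root.
confinement_of() {
    attr="${2:-/proc}/$1/attr/current"
    if [ -r "$attr" ]; then
        cat "$attr"
    else
        # The file is missing or unreadable, so nothing can be said.
        echo "unknown"
    fi
}

# Usage (assumes a single running mysqld process):
#   confinement_of "$(pidof mysqld)"
```

If this still reports a mysqld profile after stopping the apparmor service, the profile is evidently still loaded.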

The error was originally observed on machines in the Amazon cloud, to which I currently don't have access. To be thorough, the fix should be confirmed in the original setup. Pending that, I'm happy to consider this a duplicate of #367.

shinguz commented 3 years ago

What is also confusing is that we never had this issue in the past on Debian or Ubuntu systems. So is there something new in Debian/Ubuntu, or in systemd? Or were we just lucky that AppArmor was always disabled in the past? That would then be the new thing...

srikanthjeeva commented 1 year ago

I faced the same issue.

Following temeo's advice, I executed the following lines to solve it:

sudo ln -s /etc/apparmor.d/usr.sbin.mysqld /etc/apparmor.d/disable/usr.sbin.mysqld
ls -l /etc/apparmor.d/disable/usr.sbin.mysqld # make sure link is established
sudo /etc/init.d/apparmor restart
apparmor_parser -R /etc/apparmor.d/usr.sbin.mysqld

salmanjunaidc commented 7 months ago

Thanks bro. I was about to give up.