BCDevOps / backup-container

A simple container for a simple backup strategy.
Apache License 2.0
39 stars 56 forks source link

MySQL verification fails - unable to start server #82

Open esune opened 3 years ago

esune commented 3 years ago

When trying to verify a MySQL backup:

sh-5.0$ ./backup.sh -v mariadb=matomo-db:3306/matomo

Verifying backup ...

Settings:
- Database: mariadb=matomo-db:3306/matomo
- Backup file: /backups/daily/2021-07-27/matomo-db-matomo_2021-07-27_05-40-37.sql.gz

Would you like to continue?  Press Ctrl-C to exit, or any other key to continue ...

Cleaning up ...

waiting for server to start..............................................................................................................................................................................................................................................................................................................................................................
The server failed to start within 0h:6m:0s.

The server started in 0h:6m:0s.

Warning: World-writable config file '/etc/my.cnf.d/40-paas.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/50-my-tuning.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/base.cnf' is ignored
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/var/lib/mysql/mysql.sock' exists!
Already clean ...

[!!ERROR!!] - Backup verification failed: /backups/daily/2021-07-27/matomo-db-matomo_2021-07-27_05-40-37.sql.gz

Elapsed time: 0h:6m:0s - Status Code: 1

This used to work at the time of the release, so it might be a change in the underlying MySQL image.

wadeking98 commented 3 years ago

hmm, I can confirm that the sock file doesn't exist. Might be an issue with the installation?

esune commented 3 years ago

I don't know, honestly. When we released version 2.3.0 this used to work - I tested it at that time (see here).

wadeking98 commented 3 years ago

https://stackoverflow.com/questions/4448467/cant-connect-to-local-mysql-server-through-socket-var-lib-mysql-mysql-sock localhost seems to be the issue. When localhost is used mysql tries to connect through the daemon sock file (which doesn't exist). Changing it to 127.0.0.1 may resolve the issue

WadeBarnes commented 3 years ago

I find it strange that the database can't even start using the startup scripts supplied with the container. Is an unmodified version of the mariadb container able to startup in OpenShift?

In our case matomo-db is using registry.access.redhat.com/rhscl/mariadb-102-rhel7:latest, while the backup container is using registry.fedoraproject.org/f31/mariadb, so it would be interesting to know whether or not registry.fedoraproject.org/f31/mariadb runs into the same startup issues we're seeing in the backup container if it's just deployed as a database server in OCP.

WadeBarnes commented 3 years ago

I pulled the latest registry.fedoraproject.org/f31/mariadb and it starts up fine in docker. That image is 16 months old.

WadeBarnes commented 3 years ago

My theory is registry.fedoraproject.org/f31/mariadb has never been able to start in OCP4. That should be the next thing to verify. As a test you could modify the matomo-db configurations to create a new instance that uses registry.fedoraproject.org/f31/mariadb rather than registry.access.redhat.com/rhscl/mariadb-102-rhel7:latest, and see if it can even start.

WadeBarnes commented 3 years ago

https://stackoverflow.com/questions/4448467/cant-connect-to-local-mysql-server-through-socket-var-lib-mysql-mysql-sock localhost seems to be the issue. When localhost is used mysql tries to connect through the daemon sock file (which doesn't exist). Changing it to 127.0.0.1 may resolve the issue

In this case the error:

mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)'

Is more of a result of the server not starting in the first place The server failed to start within 0h:6m:0s., and then the verification being attempted anyway.

WadeBarnes commented 3 years ago

The underlying issue is the fact the server is not starting in the first place:

sh-5.0$ rm -rf /var/lib/mysql/data/*
sh-5.0$ /usr/libexec/s2i/run
=> sourcing 20-validate-variables.sh ...
=> sourcing 25-validate-replication-variables.sh ...
=> sourcing 30-base-config.sh ...
---> 06:56:32     Processing basic MySQL configuration files ...
=> sourcing 60-replication-config.sh ...
=> sourcing 70-s2i-config.sh ...
---> 06:56:32     Processing additional arbitrary  MySQL configuration provided by s2i ...
=> sourcing 40-paas.cnf ...
=> sourcing 50-my-tuning.cnf ...
---> 06:56:32     Initializing database ...
---> 06:56:32     Running mysql_install_db ...
Warning: World-writable config file '/etc/my.cnf.d/40-paas.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/50-my-tuning.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/base.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/40-paas.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/50-my-tuning.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/base.cnf' is ignored
2021-07-28  6:56:32 0 [Note] Using unique option prefix 'ignore-db-dir' is error-prone and can break in the future. Please use the full name 'ignore_db_dirs' instead.
2021-07-28  6:56:32 0 [Note] Using unique option prefix 'ignore-db-dir' is error-prone and can break in the future. Please use the full name 'ignore_db_dirs' instead.

PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER !
To do so, start the server, then issue the following commands:

'/usr/bin/mysqladmin' -u root password 'new-password'
'/usr/bin/mysqladmin' -u root -h backup-mariadb-32-z4w2v password 'new-password'

Alternatively you can run:
'/usr/bin/mysql_secure_installation'

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the MariaDB Knowledgebase at http://mariadb.com/kb or the
MySQL manual for more instructions.

Please report any problems at http://mariadb.org/jira

The latest information about MariaDB is available at http://mariadb.org/.
You can find additional information about the MySQL part at:
http://dev.mysql.com
Consider joining MariaDB's strong and vibrant community:
https://mariadb.org/get-involved/

---> 06:56:34     Starting MySQL server with disabled networking ...
---> 06:56:35     Waiting for MySQL to start ...
Warning: World-writable config file '/etc/my.cnf.d/40-paas.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/50-my-tuning.cnf' is ignored
Warning: World-writable config file '/etc/my.cnf.d/base.cnf' is ignored
2021-07-28  6:56:35 0 [Note] Using unique option prefix 'ignore-db-dir' is error-prone and can break in the future. Please use the full name 'ignore_db_dirs' instead.
2021-07-28  6:56:35 0 [Note] Using unique option prefix 'ignore-db-dir' is error-prone and can break in the future. Please use the full name 'ignore_db_dirs' instead.
2021-07-28  6:56:35 0 [Note] /usr/libexec/mysqld (mysqld 10.3.22-MariaDB) starting as process 233 ...
2021-07-28  6:56:35 0 [Note] InnoDB: Using Linux native AIO
2021-07-28  6:56:35 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-07-28  6:56:35 0 [Note] InnoDB: Uses event mutexes
2021-07-28  6:56:35 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-07-28  6:56:35 0 [Note] InnoDB: Number of pools: 1
2021-07-28  6:56:35 0 [Note] InnoDB: Using SSE2 crc32 instructions
2021-07-28  6:56:35 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2021-07-28  6:56:35 0 [Note] InnoDB: Completed initialization of buffer pool
2021-07-28  6:56:35 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2021-07-28  6:56:35 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=48580
2021-07-28  6:56:35 0 [Note] InnoDB: Starting final batch to recover 13 pages from redo log.
2021-07-28  6:56:35 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
2021-07-28  6:56:35 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2021-07-28  6:56:35 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-07-28  6:56:35 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-07-28  6:56:35 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-07-28  6:56:35 0 [Note] InnoDB: 10.3.22 started; log sequence number 1601328; transaction id 9
2021-07-28  6:56:35 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2021-07-28  6:56:35 0 [Note] InnoDB: Cannot open '/var/lib/mysql/ib_buffer_pool' for reading: No such file or directory
2021-07-28  6:56:35 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-07-28  6:56:35 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
2021-07-28  6:56:35 0 [Note] Recovering after a crash using tc.log
2021-07-28  6:56:35 0 [Note] Starting crash recovery...
2021-07-28  6:56:35 0 [Note] Crash recovery finished.
2021-07-28  6:56:35 0 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
2021-07-28  6:56:35 0 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.user' doesn't exist
WadeBarnes commented 3 years ago

Other container options:

wadeking98 commented 2 years ago

Can this issue be closed now?

esune commented 2 years ago

I don't think we ever got around investigating further, and fixing, this.

WadeBarnes commented 1 year ago

Some further investigation into this:

The underlying issue is the permissions of the files in the /etc/my.cnf.d/ folder, leading to the Warning: World-writable config file '/etc/my.cnf.d/<filename>' is ignored messages. The fact that these files are being ignored causes the startup failures. No files exist in this folder ahead of time, they are all created dynamically by the container's run-mysqld scripts.

The registry.fedoraproject.org/f31/mariadb container starts up on it's own in OCP just fine. In fact if you add the required MYSQL_* environment variables to a backup-container deployment and comment out the CMD sh /backup.sh line at the end of the Dockerfile_MariaDB file, you can then rebuild the container and deploy it as a functional DB server.

In this case the file permissions look like this:

sh-5.0$ cd /etc/my.cnf.d/
sh-5.0$ ls -last
total 12
4 -rw-r--r--. 1 1004450000 root 587 Dec 29 09:01 50-my-tuning.cnf
0 drwxrwxr-x. 1 mysql      root  65 Dec 29 09:01 .
4 -rw-r--r--. 1 1004450000 root 887 Dec 29 09:01 40-paas.cnf
4 -rw-r--r--. 1 1004450000 root  92 Dec 29 09:01 base.cnf
0 drwxr-xr-x. 1 root       root  22 Dec 29 08:59 ..

However, when the base image is used to run the backup-container scripts at startup the behavior of the run-mysqld scripts changes and the files in the /etc/my.cnf.d/ folder are created with different permissions. When the run-mysqld scripts are executed from the terminal directly or under the control of the ./backup.sh -v command the file permissions end up liooking like this (writable by all):

bash-5.0$ cd /etc/my.cnf.d/
bash-5.0$ ls -last
total 12
0 drwxrwxr-x. 1 mysql      root  65 Dec 29 09:16 .
4 -rw-rw-rw-. 1 1004450000 root 587 Dec 29 09:16 50-my-tuning.cnf
4 -rw-rw-rw-. 1 1004450000 root 887 Dec 29 09:16 40-paas.cnf
4 -rw-rw-rw-. 1 1004450000 root  92 Dec 29 09:16 base.cnf
0 drwxr-xr-x. 1 root       root  22 Dec 28 08:06 ..

If the permissions are changed to remove write from o (chmod o-w *.cnf), the files are NOT ignored. However it's too late at this point as there is an initialization step that runs first with a different set of *.cnf files which has already failed to execute.

It appears that the first command that executes in the container executes with different permissions than all subsequent commands. If the run-mysqld scripts could be run with the same permissions as the initial commands the files would be written with the correct permissions and the DB server would be able to startup under the control of the backup container.

WadeBarnes commented 1 year ago

Source for the container is here; https://github.com/sclorg/mariadb-container