bitpoke / mysql-operator

Asynchronous MySQL Replication on Kubernetes using Percona Server and Openark's Orchestrator.
https://www.bitpoke.io/docs/mysql-operator/getting-started/
Apache License 2.0
1.03k stars 275 forks source link

[MySQL 8.0] Init Container issue #363

Closed imriss closed 3 years ago

imriss commented 5 years ago

I am trying to deploy a MySQL 8.0 cluster using the MySQL Operator. However, I am getting these errors in the logs of the Mysql container:

kubectl logs -f mysql-cluster-test-mysql-0 mysql -n test

mysqld: Table 'mysql.plugin' doesn't exist
2019-07-03T18:16:51.902943Z 0 [Warning] [MY-011068] [Server] The syntax 'expire-logs-days' is deprecated and will be removed in a future release. Please use binlog_expire_logs_seconds instead.
2019-07-03T18:16:51.904587Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.15-6) starting as process 1
2019-07-03T18:16:52.461051Z 1 [Warning] [MY-010005] [Server] Skip re-populating collations and character sets tables in read-only mode.
2019-07-03T18:16:52.463437Z 0 [ERROR] [MY-010735] [Server] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
2019-07-03T18:16:52.562848Z 0 [Warning] [MY-010015] [Repl] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2019-07-03T18:16:52.565228Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2019-07-03T18:16:52.567938Z 0 [Warning] [MY-010441] [Server] Failed to open optimizer cost constant tables
2019-07-03T18:16:52.568079Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-001146 - Table 'mysql.component' doesn't exist
2019-07-03T18:16:52.568092Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-003543 - The mysql.component table is missing or has an incorrect definition.
2019-07-03T18:16:52.568181Z 0 [ERROR] [MY-000067] [Server] unknown variable 'query-cache-size=0'.
2019-07-03T18:16:52.568190Z 0 [Warning] [MY-010952] [Server] The privilege system failed to initialize correctly. If you have upgraded your server, make sure you're executing mysql_upgrade to correct the issue.
2019-07-03T18:16:52.568549Z 0 [ERROR] [MY-010119] [Server] Aborting
2019-07-03T18:16:54.635552Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.15-6)  Percona Server (GPL), Release 6, Revision 63abd08.

The version of MySQL is set to 8.0.15.

Thanks

imriss commented 5 years ago

The init init container logs:

kubectl logs -f mysql-cluster-test-mysql-0 init -n test

Create rclone.conf file.
Create google-credentials.json file.
2019-07-03T19:26:29.393Z        INFO    sidecar environment is not set  {"key": "INIT_BUCKET_URI"}
2019-07-03T19:26:29.394Z        INFO    sidecar cloning command {"host": "mysql-cluster-test-mysql-0"}
2019-07-03T19:26:29.395Z        INFO    sidecar nothing to clone or init from
2019-07-03T19:26:29.395Z        INFO    sidecar configuring server      {"host": "mysql-cluster-test-mysql-0"}
2019-07-03T19:26:29.395Z        INFO    sidecar error while reading PURGE GTID from xtrabackup info file        {"error": "open /var/lib/mysql/xtrabackup_binlog_info: no such file or directory"}
AMecea commented 5 years ago

Hello @imriss ,

What version of the operator do you use? The operator version v0.2.x is known to not work with MySQL 8.0. You have to use version v0.3.0 which is still in alpha but is more compatible with MySQL 8.0.

The init container seems to be ok. The last line informs you that there is not PURGE_GTID to set because is not initializer from a backup.

imriss commented 5 years ago

@AMecea , thanks for the info. I am using the master bracnh to build an image that has the percona for MySQL 8.0.15-6 added to the list of the version in pkg/util/constants/constants.go file. What is the branch name for v0.3.x? Thanks

imriss commented 5 years ago

The issue was resolved by removing these two lines from the pkg/controller/mysqlcluster/internal/syncer/config_map.go file:

    "query-cache-type": "0",
    "query-cache-size": "0",

However, now, the second pod (mysql-cluster-test-mysql-1) does not initiate showing this error in its init container's logs:

kubectl logs -f mysql-cluster-test-mysql-1 init -n test

Create rclone.conf file.
Create google-credentials.json file.
2019-07-04T13:56:12.581Z        INFO    sidecar environment is not set  {"key": "INIT_BUCKET_URI"}
2019-07-04T13:56:12.581Z        INFO    sidecar cloning command {"host": "mysql-cluster-test-mysql-1"}
2019-07-04T13:56:12.581Z        INFO    sidecar cloning from node       {"host": "mysql-cluster-test-mysql-0.mysql.test"}
2019-07-04T13:56:12.581Z        INFO    sidecar initialize a backup     {"host": "mysql-cluster-test-mysql-0.mysql.test", "endpoint": "/xbackup"}
2019-07-04T13:56:14.493Z        ERROR   sidecar clone command failed    {"error": "failed to clone from mysql-cluster-test-mysql-0.mysql.test, err: fail to get backup: fail to get backup: %!s(<nil>), code: 500 Internal Server Error"}
github.com/presslabs/mysql-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error
        /go/src/github.com/presslabs/mysql-operator/vendor/github.com/go-logr/zapr/zapr.go:128
main.main.func2
        /go/src/github.com/presslabs/mysql-operator/cmd/mysql-operator-sidecar/main.go:67
github.com/presslabs/mysql-operator/vendor/github.com/spf13/cobra.(*Command).execute
        /go/src/github.com/presslabs/mysql-operator/vendor/github.com/spf13/cobra/command.go:766
github.com/presslabs/mysql-operator/vendor/github.com/spf13/cobra.(*Command).ExecuteC
        /go/src/github.com/presslabs/mysql-operator/vendor/github.com/spf13/cobra/command.go:852
github.com/presslabs/mysql-operator/vendor/github.com/spf13/cobra.(*Command).Execute
        /go/src/github.com/presslabs/mysql-operator/vendor/github.com/spf13/cobra/command.go:800
main.main
        /go/src/github.com/presslabs/mysql-operator/cmd/mysql-operator-sidecar/main.go:111
runtime.main
        /usr/local/go/src/runtime/proc.go:201
AMecea commented 5 years ago

The master branch is fine.

Please show me the logs from pod 0 in the sidecar container.

What other issues did you encounter with MySQL 8?

imriss commented 5 years ago

The logs of the side care from pod 0:

kubectl logs -f mysql-cluster-test-mysql-0 sidecar -n test

Create rclone.conf file.
Create google-credentials.json file.
2019-07-04T16:23:29.087Z        INFO    sidecar environment is not set  {"key": "INIT_BUCKET_URI"}
2019-07-04T16:23:29.088Z        INFO    sidecar start http server for backups
xtrabackup: recognized server arguments: --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=2 --innodb_flush_method=O_DIRECT --innodb_log_files_in_group=2 --log_bin=/var/lib/mysql/mysql-bin --open_files_limit=65535 --innodb_buffer_pool_size=536870912 --innodb_log_file_size=134217728 --server-id=100 
xtrabackup: recognized client arguments: --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=2 --innodb_flush_method=O_DIRECT --innodb_log_files_in_group=2 --log_bin=/var/lib/mysql/mysql-bin --open_files_limit=65535 --innodb_buffer_pool_size=536870912 --innodb_log_file_size=134217728 --server-id=100 --backup=1 --slave-info=1 --stream=xbstream --tables-exclude=sys_operator.status --host=127.0.0.1 --user=sys_replication --password=* --target-dir=/tmp/xtrabackup_backupfiles/ 
190704 16:24:20  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;host=127.0.0.1' as 'sys_replication'  (using password: YES).
190704 16:24:20  version_check Connected to MySQL server
190704 16:24:20  version_check Executing a version check against the server...

# A software update is available:
190704 16:24:22  version_check Done.
190704 16:24:22 Connecting to MySQL server host: 127.0.0.1, user: sys_replication, password: set, port: not set, socket: not set
Error: Unsupported server version: '8.0.15-6'. Please report a bug at https://jira.percona.com/projects/PXB
2019-07-04T16:24:22.891Z        ERROR   sidecar failed waiting for xtrabackup to finish {"error": "exit status 1"}
github.com/presslabs/mysql-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error
        /go/src/github.com/presslabs/mysql-operator/vendor/github.com/go-logr/zapr/zapr.go:128
github.com/presslabs/mysql-operator/pkg/sidecar.(*server).backupHandler
        /go/src/github.com/presslabs/mysql-operator/pkg/sidecar/server.go:126
github.com/presslabs/mysql-operator/pkg/sidecar.(*server).backupHandler-fm
        /go/src/github.com/presslabs/mysql-operator/pkg/sidecar/server.go:53
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:1964
github.com/presslabs/mysql-operator/pkg/sidecar.maxClients.func1
        /go/src/github.com/presslabs/mysql-operator/pkg/sidecar/server.go:150
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:1964
net/http.(*ServeMux).ServeHTTP
        /usr/local/go/src/net/http/server.go:2361
net/http.serverHandler.ServeHTTP
        /usr/local/go/src/net/http/server.go:2741
net/http.(*conn).serve
        /usr/local/go/src/net/http/server.go:1847

The error repeats almost every minute. Thanks

imriss commented 5 years ago

The logs of pt-heartbeat from pod 0:

kubectl logs -f mysql-cluster-test-mysql-0 pt-heartbeat -n test

Create rclone.conf file.
Create google-credentials.json file.
Usage: /usr/local/bin/sidecar-entrypoint.sh {files-config|clone|config-and-serve}
Now runs your command.
pt-heartbeat --update --replace --check-read-only --create-table --database sys_operator --table heartbeat --defaults-file /etc/mysql/heartbeat.conf --fail-successive-errors=20
imriss commented 5 years ago

The logs of metrics-exporter from pod 0:

kubectl logs -f mysql-cluster-test-mysql-0 metrics-exporter -n test

time="2019-07-04T16:23:32Z" level=info msg="Starting mysqld_exporter (version=0.11.0, branch=HEAD, revision=5d7179615695a61ecc3b5bf90a2a7c76a9592cdd)" source="mysqld_exporter.go:206"
time="2019-07-04T16:23:32Z" level=info msg="Build context (go=go1.10.3, user=root@3d3ff666b0e4, date=20180629-15:00:35)" source="mysqld_exporter.go:207"
time="2019-07-04T16:23:32Z" level=info msg="Enabled scrapers:" source="mysqld_exporter.go:218"
time="2019-07-04T16:23:32Z" level=info msg=" --collect.global_status" source="mysqld_exporter.go:222"
time="2019-07-04T16:23:32Z" level=info msg=" --collect.global_variables" source="mysqld_exporter.go:222"
time="2019-07-04T16:23:32Z" level=info msg=" --collect.slave_status" source="mysqld_exporter.go:222"
time="2019-07-04T16:23:32Z" level=info msg=" --collect.info_schema.tables" source="mysqld_exporter.go:222"
time="2019-07-04T16:23:32Z" level=info msg=" --collect.heartbeat" source="mysqld_exporter.go:222"
time="2019-07-04T16:23:32Z" level=info msg="Listening on 0.0.0.0:9125" source="mysqld_exporter.go:232"
time="2019-07-04T16:23:44Z" level=error msg="Error scraping for collect.heartbeat: Error 1146: Table 'sys_operator.heartbeat' doesn't exist" source="exporter.go:138"
time="2019-07-04T16:23:59Z" level=error msg="Error scraping for collect.heartbeat: Error 1146: Table 'sys_operator.heartbeat' doesn't exist" source="exporter.go:138"
imriss commented 5 years ago

This is the reply I got from Percona via this JIRA:

Please try xtrabackup 8.0.6. MySQL 8 will not be supported in 2.4.x series.
https://www.percona.com/blog/2019/05/09/percona-xtrabackup-8-0-6-is-now-available/
imriss commented 5 years ago

After changing the PXB version in the sidecar, the first error has been resolved. However, there is this error in the sidecar logs: Access denied; you need (at least one of) the BACKUP_ADMIN privilege(s) for this operation

The logs:

190704 17:21:15 [01] Streaming performance_schema/file_summary_by__97.sdi to <STDOUT>
190704 17:21:15 [01]        ...done
190704 17:21:15 [01] Skipping sys_operator/status.ibd.
190704 17:21:15 Finished backing up non-InnoDB tables and files
190704 17:21:15 Executing FLUSH NO_WRITE_TO_BINLOG BINARY LOGS
190704 17:21:15 Selecting LSN and binary log position from p_s.log_status
Error: failed to fetch query result SELECT server_uuid, local, replication, storage_engines FROM performance_schema.log_status: Access denied; you need (at least one of) the BACKUP_ADMIN privilege(s) for this operation
2019-07-04T17:21:15.903Z        ERROR   sidecar failed waiting for xtrabackup to finish {"error": "exit status 1"}
github.com/presslabs/mysql-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error
        /go/src/github.com/presslabs/mysql-operator/vendor/github.com/go-logr/zapr/zapr.go:128
github.com/presslabs/mysql-operator/pkg/sidecar.(*server).backupHandler
        /go/src/github.com/presslabs/mysql-operator/pkg/sidecar/server.go:126
github.com/presslabs/mysql-operator/pkg/sidecar.(*server).backupHandler-fm
        /go/src/github.com/presslabs/mysql-operator/pkg/sidecar/server.go:53
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:1964
github.com/presslabs/mysql-operator/pkg/sidecar.maxClients.func1
        /go/src/github.com/presslabs/mysql-operator/pkg/sidecar/server.go:150
net/http.HandlerFunc.ServeHTTP
        /usr/local/go/src/net/http/server.go:1964
net/http.(*ServeMux).ServeHTTP
        /usr/local/go/src/net/http/server.go:2361
net/http.serverHandler.ServeHTTP
        /usr/local/go/src/net/http/server.go:2741
net/http.(*conn).serve
        /usr/local/go/src/net/http/server.go:1847
2019/07/04 17:21:15 http: multiple response.WriteHeader calls
imriss commented 5 years ago

After adding BACKUP_ADMIN to the list of rights for the ReplicationUser in pkg/sidecar/appconf.go file, the previous error was resolved.

Now, the error is with Readiness probe of pod 1:

  Warning  Unhealthy               2m9s (x134 over 6m35s)  kubelet, ip-10-0-156-26.us-east-1.compute.internal  Readiness probe failed: ERROR 1812 (HY000) at line 1: Tablespace is missing for table `sys_operator`.`status`.
/bin/sh: line 0: test: -eq: unary operator expected
imriss commented 5 years ago

Some probably-related error messages in MySQL container of pod 1:

2019-07-04T18:07:07.992788Z 1 [Warning] [MY-012351] [InnoDB] Tablespace 2, name 'sys_operator/status', file './sys_operator/status.ibd' is missing!
2019-07-04T18:07:08.563943Z 6 [ERROR] [MY-012592] [InnoDB] Operating system error number 2 in a file operation.
2019-07-04T18:07:08.563966Z 6 [ERROR] [MY-012593] [InnoDB] The error means the system cannot find the path specified.
2019-07-04T18:07:08.563974Z 6 [ERROR] [MY-012216] [InnoDB] Cannot open datafile for read-only: './sys_operator/status.ibd' OS error: 71
2019-07-04T18:07:08.564165Z 6 [ERROR] [MY-000061] [Server] 1812  Tablespace is missing for table `sys_operator`.`status`..

The full logs:

2019-07-04T18:07:07.248070Z 0 [Warning] [MY-011068] [Server] The syntax 'expire-logs-days' is deprecated and will be removed in a future release. Please use binlog_expire_logs_seconds instead.
2019-07-04T18:07:07.249919Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.15-6) starting as process 1
2019-07-04T18:07:07.992788Z 1 [Warning] [MY-012351] [InnoDB] Tablespace 2, name 'sys_operator/status', file './sys_operator/status.ibd' is missing!
2019-07-04T18:07:08.040294Z 0 [System] [MY-010229] [Server] Starting crash recovery...
2019-07-04T18:07:08.052278Z 0 [System] [MY-010232] [Server] Crash recovery finished.
2019-07-04T18:07:08.115053Z 0 [Warning] [MY-010075] [Server] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 897f7b0d-9e86-11e9-9245-0e6ff32f5797.
2019-07-04T18:07:08.444061Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2019-07-04T18:07:08.563943Z 6 [ERROR] [MY-012592] [InnoDB] Operating system error number 2 in a file operation.
2019-07-04T18:07:08.563966Z 6 [ERROR] [MY-012593] [InnoDB] The error means the system cannot find the path specified.
2019-07-04T18:07:08.563974Z 6 [ERROR] [MY-012216] [InnoDB] Cannot open datafile for read-only: './sys_operator/status.ibd' OS error: 71
2019-07-04T18:07:08.564165Z 6 [ERROR] [MY-000061] [Server] 1812  Tablespace is missing for table `sys_operator`.`status`..
2019-07-04T18:07:08.564399Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.15-6'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona Server (GPL), Release 6, Revision 63abd08.
2019-07-04T18:07:08.767924Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Socket: '/var/lib/mysql/mysqlx.sock' bind-address: '::' port: 33060
imriss commented 5 years ago

The file sys_operator/status.ibd exists in pod 0, but it is missing in pod 1.

AMecea commented 5 years ago

Nice work so far!

As far as I can see the MySQL does not run init-file. Check the logs of the init container. (kubectl logs <pod> -c mysql-init-only). You can, also, upgrade XtraBackup to the latest version by changing the Dockerfile.sidecar.

Thank you for trying to make the operator compatible with MySQL 8.0. Please add here or in a different issue all the problems that you will encounter and please open a PR, I will be more than happy to review it and merge it!

imriss commented 5 years ago

@AMecea , thanks

Most of the issues are logged here, except one that was in this issue.

I've opened a JIRA for the sys_operator/status.ibd issue.

Hopefully, I can submit the PR soon.

imriss commented 5 years ago

@AMecea for the PR, how can I create a branch? Thanks

AMecea commented 5 years ago

Hi @imriss, you don't have to create a new branch, just open a PR from your fork. Here are some details.

Also, you have to rebase from master branch.