percona / pmm

Percona Monitoring and Management: an open source database monitoring, observability and management tool
https://www.percona.com/software/database-tools/percona-monitoring-and-management
GNU Affero General Public License v3.0

PMM can't restore XtraDB from backup #2340

Open artemsafiyulin opened 1 year ago

artemsafiyulin commented 1 year ago

Description

I installed a test Percona XtraDB Cluster (version 5.7) following these instructions: https://docs.percona.com/percona-xtradb-cluster/5.7/overview.html

I then installed PMM Server and pmm-agent (on each MySQL node) and registered the service from each node, following these instructions: https://docs.percona.com/percona-monitoring-and-management/setting-up/client/mysql.html

Finally, I installed XtraBackup 2.4 on all MySQL nodes, following these instructions: https://docs.percona.com/percona-xtrabackup/2.4/installation/yum_repo.html

I now have the following installation:
node01 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node02 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node03 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node04 - percona/pmm-server:2 in docker

After this preparation, I started a backup of node01 from the web interface. The backup completed without problems. I then tried to restore it to node01 from the web interface and got an error. The web interface shows no information about the error, but in /var/log/messages I see the following:

Jul  3 12:55:50 node01 systemd[1]: Stopping Percona XtraDB Cluster...
Jul  3 12:55:50 node01 mysql-systemd[1279656]: SUCCESS! Stopping Percona XtraDB Cluster......
Jul  3 12:56:00 node01 pmm-agent[1172959]: #033[33mWARN#033[0m[2023-07-03T12:56:00.196+00:00] Job terminated with error: signal: killed
Jul  3 12:56:00 node01 pmm-agent[1172959]: waiting systemctl stop command failed
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner/jobs.stopMySQL
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:290
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner/jobs.(*MySQLRestoreJob).Run
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:121
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner.(*Runner).handleJob.func1
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/runner.go:185
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime/pprof.Do
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/pprof/runtime.go:40
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime.goexit
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/asm_amd64.s:1594
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner/jobs.(*MySQLRestoreJob).Run
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:122
Jul  3 12:56:00 node01 pmm-agent[1172959]: github.com/percona/pmm/agent/runner.(*Runner).handleJob.func1
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/runner.go:185
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime/pprof.Do
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/pprof/runtime.go:40
Jul  3 12:56:00 node01 pmm-agent[1172959]: runtime.goexit
Jul  3 12:56:00 node01 pmm-agent[1172959]: #011/usr/local/go/src/runtime/asm_amd64.s:1594  #033[33mcomponent#033[0m=runner #033[33mid#033[0m=/job_id/90a32d1b-4e89-4cc4-bd4d-36abe52c0da7 #033[33mtype#033[0m=mysql_restore
Jul  3 12:56:02 node01 pmm-agent[1172959]: [mysql] 2023/07/03 12:56:02 packets.go:122: closing bad idle connection: EOF

I am using AlmaLinux 8.7 on all servers. I have also set up the same solution for backing up MongoDB, and that works well.

Can you help me with this problem?

Expected Results

I expect the backup to be restored correctly.

Actual Results

The backup restore fails with errors.

Version

node01 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node02 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node03 - xtradb 5.7 / xtrabackup 2.4 / pmm-agent2 2.37.1
node04 - percona/pmm-server:2 in docker

Steps to reproduce

Install an XtraDB 5.7 cluster with 3 nodes, create a backup using the PMM web interface, then try to restore the backup.

Relevant logs

No response


artemgavrilov commented 1 year ago

Hi @artemsafiyulin, please check that your setup meets the prerequisites described here: https://docs.percona.com/percona-monitoring-and-management/get-started/backup/mysql_prerequisites.html

P.S. Please note that MySQL backups/restores are still in technical preview and not ready for use in production.

artemsafiyulin commented 1 year ago

@artemgavrilov Hi! Yes, my setup meets all the prerequisites from this link.

I had not noticed that this feature is not ready for production use yet. It seems I will have to create backups in a different way for now. Thank you for the information!

artemgavrilov commented 1 year ago

@artemsafiyulin We just released PMM v2.38.0, and I see that it includes a fix for a bug that looks similar to yours: https://jira.percona.com/browse/PMM-11645

If you still want to experiment with PMM Backups, you can try the new PMM version (the fix was made on the pmm-agent side). We would appreciate any feedback.

artemsafiyulin commented 1 year ago

@artemgavrilov Hi! I tried the new version of pmm-agent, but I get the same error:

Jul  6 11:35:10 node-01 systemd[1]: Stopping Percona XtraDB Cluster...
Jul  6 11:35:10 node-01 mysql-systemd[1353462]: SUCCESS! Stopping Percona XtraDB Cluster......
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #033[33mWARN#033[0m[2023-07-06T11:35:20.435+00:00] Job terminated with error: signal: killed
Jul  6 11:35:20 node-01 pmm-agent[1349262]: waiting systemctl stop command failed
Jul  6 11:35:20 node-01 pmm-agent[1349262]: github.com/percona/pmm/agent/runner/jobs.stopMySQL
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:298
Jul  6 11:35:20 node-01 pmm-agent[1349262]: github.com/percona/pmm/agent/runner/jobs.(*MySQLRestoreJob).Run
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:124
Jul  6 11:35:20 node-01 pmm-agent[1349262]: github.com/percona/pmm/agent/runner.(*Runner).handleJob.func1
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/runner.go:185
Jul  6 11:35:20 node-01 pmm-agent[1349262]: runtime/pprof.Do
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/usr/local/go/src/runtime/pprof/runtime.go:44
Jul  6 11:35:20 node-01 pmm-agent[1349262]: runtime.goexit
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/usr/local/go/src/runtime/asm_amd64.s:1598
Jul  6 11:35:20 node-01 pmm-agent[1349262]: github.com/percona/pmm/agent/runner/jobs.(*MySQLRestoreJob).Run
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/jobs/mysql_restore_job.go:125
Jul  6 11:35:20 node-01 pmm-agent[1349262]: github.com/percona/pmm/agent/runner.(*Runner).handleJob.func1
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/tmp/go/src/github.com/percona/pmm/agent/runner/runner.go:185
Jul  6 11:35:20 node-01 pmm-agent[1349262]: runtime/pprof.Do
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/usr/local/go/src/runtime/pprof/runtime.go:44
Jul  6 11:35:20 node-01 pmm-agent[1349262]: runtime.goexit
Jul  6 11:35:20 node-01 pmm-agent[1349262]: #011/usr/local/go/src/runtime/asm_amd64.s:1598  #033[33mcomponent#033[0m=runner #033[33mid#033[0m=/job_id/4c0eb874-c9b6-4610-8964-8d46fe541f9d #033[33mtype#033[0m=mysql_restore
Jul  6 11:35:23 node-01 pmm-agent[1349262]: [mysql] 2023/07/06 11:35:23 packets.go:122: closing bad idle connection: EOF

As far as I can see, in the ticket you linked the problem was in the getMysqlServiceName function, but in my case the failure is in jobs.stopMySQL.