Open chilumbugeorge opened 6 years ago
Hi, can you please paste the complete log output for the -execute
execution?
which MySQL version are you using?
Your first execution completes without error, by the way. This message: [error] close sync with err: sync is been closing...
is invoked by an underlying library gh-ost
uses, but does not indicate an application error. I understand seeing this message in the standard output is alarming.
Hi Shlomi. Thanks for your quick response.
I am using MySQL version 5.7. To be precise, here is the version:
percona-xtradb-cluster-server-5.7 5.7.17-29.20-3.trusty
So when i run the command without the --execute on the master, i get the following:
2018/04/16 14:28:08 binlogsyncer.go:79: [info] create BinlogSyncer with config {99999 mysql master_ip 3306 ghost false false <nil>}
2018/04/16 14:28:08 binlogsyncer.go:246: [info] begin to sync binlog from position (mysqld-bin.000006, 13113017)
2018/04/16 14:28:08 binlogsyncer.go:139: [info] register slave for master server master_ip:3306
2018/04/16 14:28:08 binlogsyncer.go:573: [info] rotate to (mysqld-bin.000006, 13113017)
# Migrating `test`.`tb1`; Ghost table is `test`.`_tb1_gho`
# Migrating nangang-vm-db-pxc-test-1:3306; inspecting nangang-vm-db-pxc-test-1:3306; executing on nangang-vm-db-pxc-test-1
# Migrating `test`.`tb1`; Ghost table is `test`.`_tb1_gho`
# Migrating nangang-vm-db-pxc-test-1:3306; inspecting nangang-vm-db-pxc-test-1:3306; executing on nangang-vm-db-pxc-test-1
# Migration started at Mon Apr 16 14:28:08 +0800 2018
# chunk-size: 2000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_connected=20; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.test.tb1.sock
# Migration started at Mon Apr 16 14:28:08 +0800 2018
# chunk-size: 2000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_connected=20; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.test.tb1.sock
Copy: 0/0 100.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysqld-bin.000006:13115079; State: migrating; ETA: due
CREATE TABLE `_tb1_gho` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`x` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
2018/04/16 14:28:08 binlogsyncer.go:107: [info] syncer is closing...
Copy: 0/6 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysqld-bin.000006:13115079; State: migrating; ETA: N/A
2018/04/16 14:28:09 binlogstreamer.go:47: [error] close sync with err: sync is been closing...
2018/04/16 14:28:09 binlogsyncer.go:122: [info] syncer is closed
# Done
And when i run with --execute, i get the following error:
# Migration started at Mon Apr 16 14:31:08 +0800 2018
# chunk-size: 2000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_connected=20; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.test.tb1.sock
Copy: 0/6 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysqld-bin.000006:13118634; State: migrating; ETA: N/A
Copy: 0/6 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysqld-bin.000006:13123049; State: migrating; ETA: N/A
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysqld-bin.000006:13123838; State: migrating; ETA: due
2018-04-16 14:31:10 ERROR Unable to acquire lock gh-ost.3739543.lock
2018-04-16 14:31:10 ERROR Unable to acquire lock gh-ost.3739543.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 1s(copy); streamer: mysqld-bin.000006:13128257; State: migrating; ETA: due
2018-04-16 14:31:11 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:11 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 1s(copy); streamer: mysqld-bin.000006:13132676; State: migrating; ETA: due
2018-04-16 14:31:12 ERROR Unable to acquire lock gh-ost.3739543.lock
2018-04-16 14:31:12 ERROR Unable to acquire lock gh-ost.3739543.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 1s(copy); streamer: mysqld-bin.000006:13137091; State: migrating; ETA: due
2018-04-16 14:31:13 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:13 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 5s(total), 1s(copy); streamer: mysqld-bin.000006:13141508; State: migrating; ETA: due
2018-04-16 14:31:14 ERROR Unable to acquire lock gh-ost.3739543.lock
2018-04-16 14:31:14 ERROR Unable to acquire lock gh-ost.3739543.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 6s(total), 1s(copy); streamer: mysqld-bin.000006:13145921; State: migrating; ETA: due
2018-04-16 14:31:15 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:15 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 7s(total), 1s(copy); streamer: mysqld-bin.000006:13150336; State: migrating; ETA: due
2018-04-16 14:31:16 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:16 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 8s(total), 1s(copy); streamer: mysqld-bin.000006:13154755; State: migrating; ETA: due
2018-04-16 14:31:17 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:17 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 9s(total), 1s(copy); streamer: mysqld-bin.000006:13159172; State: migrating; ETA: due
2018-04-16 14:31:18 ERROR Unable to acquire lock gh-ost.3739543.lock
2018-04-16 14:31:18 ERROR Unable to acquire lock gh-ost.3739543.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 10s(total), 1s(copy); streamer: mysqld-bin.000006:13163591; State: migrating; ETA: due
2018-04-16 14:31:19 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:19 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 11s(total), 1s(copy); streamer: mysqld-bin.000006:13168011; State: migrating; ETA: due
2018-04-16 14:31:20 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:20 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 12s(total), 1s(copy); streamer: mysqld-bin.000006:13172429; State: migrating; ETA: due
2018-04-16 14:31:21 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:21 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 13s(total), 1s(copy); streamer: mysqld-bin.000006:13176847; State: migrating; ETA: due
2018-04-16 14:31:22 ERROR Unable to acquire lock gh-ost.3739543.lock
2018-04-16 14:31:22 ERROR Unable to acquire lock gh-ost.3739543.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 14s(total), 1s(copy); streamer: mysqld-bin.000006:13181267; State: migrating; ETA: due
2018-04-16 14:31:23 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:31:23 ERROR Unable to acquire lock gh-ost.3739536.lock
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 15s(total), 1s(copy); streamer: mysqld-bin.000006:13185687; State: migrating; ETA: due
2018-04-16 14:31:24 ERROR Unable to acquire lock gh-ost.3739543.lock
2018-04-16 14:31:24 ERROR Unable to acquire lock gh-ost.3739543.lock
..
Copy: 8/8 100.0%; Applied: 0; Backlog: 0/1000; Time: 1m0s(total), 1s(copy); streamer: mysqld-bin.000006:13384483; State: migrating; ETA: due
2018-04-16 14:32:09 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:32:09 ERROR Unable to acquire lock gh-ost.3739536.lock
2018-04-16 14:32:09 FATAL Unable to acquire lock gh-ost.3739536.lock
george.chilumbu@nangang-vm-db-pxc-test-1:/opt$
XtraDB Cluster unfortunately does not support GET_LOCK()
: https://www.percona.com/doc/percona-xtradb-cluster/LATEST/limitation.html, which is why gh-ost
is failing.
There's a discussion in #224, and see in particular https://github.com/github/gh-ost/issues/224#issuecomment-254593577 .
I don't have further insight at this time. For XtraDB cluster you may have to revert to pt-online-schema-change
.
Hmm perhaps on a single-master setup GET_LOCK()
can be executed. Please let me know if you do try it out.
How unfortunate coz all our databases use PXC. So it means gh-ost cannot work for us.
So for PXC, it still does not work if i run the command on the save instead of one of the masters?
So for PXC, it still does not work if i run the command on the save instead of one of the masters?
I wouldn't know. I was hoping you could tell me.
On the slave (whose master is part of a PXC), it also does not run. So without the --execute and --allow-on-master params, i get the following:
load=Threads_connected=20
2018/04/16 18:48:52 binlogsyncer.go:79: [info] create BinlogSyncer with config {99999 mysql slave_ip 3306 ghost false false <nil>}
2018/04/16 18:48:52 binlogsyncer.go:246: [info] begin to sync binlog from position (mysqld-bin.000029, 882245706)
2018/04/16 18:48:52 binlogsyncer.go:139: [info] register slave for master server slave_ip:3306
2018-04-16 18:49:52 FATAL connection was bad
And with the --execute param added, i get the following:
2018/04/16 18:50:50 binlogsyncer.go:79: [info] create BinlogSyncer with config {99999 mysql slave_ip 3306 ghost false false <nil>}
2018/04/16 18:50:50 binlogsyncer.go:246: [info] begin to sync binlog from position (mysqld-bin.000029, 882245706)
2018/04/16 18:50:50 binlogsyncer.go:139: [info] register slave for master server slave_ip:3306
2018-04-16 18:51:50 FATAL connection was bad
The IP on the --host is that of the slave FYI. But even when i change to that of the master (for the heck of it), i get the following error:
2018-04-16 18:54:44 FATAL dial tcp master_ip:3306: getsockopt: no route to host
So it does not seem to work even if you run from the slave.
Hi. When i run gh-ost on the master without the --execute using the command below:
I get the following error:
And when i run with the --execute option, i get the following error:
Either way, the command does not seem to be running as expected. I have also tried running on the slave with the initial commands you provided and still no luck. The tool seems to be promising as an alternative to Percona's pt-online-schema-change, but sad cannot get it to run.