Tencent / phxsql

A high availability MySQL cluster that guarantees data consistency between a master and slaves.
Other
2.47k stars 558 forks source link

sysbench 测试过程中slave挂掉 #77

Open Freyayan opened 7 years ago

Freyayan commented 7 years ago

你好, 我在用sysbench进行otlp.lua进行性能测试的时候,发现一个备机几点挂掉,前面sysbench准备数据是ok的,但是连接测试时出现备机节点mysql挂掉。

[root@hongxin59 tools]# ps -ef |grep phxsql root 41128 1 2 Nov09 ? 00:45:02 /cache1/phxsql/sbin/phxbinlogsvr_phxrpc root 41141 1 9 Nov09 ? 02:21:49 /cache1/phxsql/sbin/phxsqlproxy_phxrpc /cache1/phxsql/etc/phxsqlproxy.conf daemon root 41142 1 8 Nov09 ? 02:20:27 /cache1/phxsql/sbin/phxsqlproxy_phxrpc /cache1/phxsql/etc/phxsqlproxy.conf daem

sysbench测试脚本: sysbench --test=oltp --test=/cache1/sysbench/sysbench/tests/db/oltp.lua --mysql-host=180.101.24.57 --mysql-user=root --mysql-port=54321 --mysql-db=sbtest --oltp-tables-count=10 --oltp-table-size=1000000 --num-threads=200 --max-requests=100000 --report-interval=1 --max-time=600 --oltp-read-only=off run>>./sysbench200oltp.log

日志: 2016-11-10 10:41:43 42526 [ERROR] Error reading packet from server: Lost connection to MySQL server during query (server_errno=20 13) 2016-11-10 10:41:43 42526 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000003' at pos ition 2662 2016-11-10 10:41:43 42526 [Warning] Storing MySQL user name or password information in the master info repository is not secure a nd is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START S LAVE Syntax' in the MySQL Manual for more information. 2016-11-10 11:32:48 42526 [ERROR] InnoDB: Tried to read 16384 bytes at offset 161677312. Was only able to read 0. 2016-11-10 11:32:48 7fa152dac700 InnoDB: Operating system error number 5 in a file operation. InnoDB: Error number 5 means 'Input/output error'. InnoDB: Some operating system error numbers are described at InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html 2016-11-10 11:32:48 42526 [ERROR] InnoDB: File (unknown): 'read' returned OS error 105. Cannot continue operation 161110 11:32:48 mysqld_safe Number of processes running now: 0 161110 11:32:48 mysqld_safe mysqld restarted 2016-11-10 11:32:48 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release. 2016-11-10 11:32:48 0 [Note] /cache1/phxsql/sbin/mysqld (mysqld 5.6.31-77.0-log) starting as process 31753 ... 2016-11-10 11:32:48 31753 [Note] InnoDB: Using atomics to ref count buffer pool pages 2016-11-10 11:32:48 31753 [Note] InnoDB: The InnoDB memory heap is disabled 2016-11-10 11:32:48 31753 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2016-11-10 11:32:48 31753 [Note] InnoDB: Memory barrier is not used 2016-11-10 11:32:48 31753 [Note] InnoDB: Compressed tables use zlib 1.2.3 2016-11-10 11:32:48 31753 [Note] InnoDB: Using CPU crc32 instructions 2016-11-10 11:32:48 31753 [Note] InnoDB: Initializing buffer pool, size = 200.0M 2016-11-10 11:32:48 31753 [Note] InnoDB: Completed initialization of buffer pool 2016-11-10 11:32:48 31753 [Note] InnoDB: Opened 3 undo tablespaces 2016-11-10 11:32:48 31753 [Note] InnoDB: Highest supported file format is Barracuda. 2016-11-10 11:32:48 31753 [Note] InnoDB: Log scan progressed past the checkpoint lsn 3664293153 2016-11-10 11:32:48 31753 [Note] InnoDB: Database was not shutdown normally! 2016-11-10 11:32:48 31753 [Note] InnoDB: Starting crash recovery. 2016-11-10 11:32:48 31753 [Note] InnoDB: Reading tablespace information from the .ibd files... 2016-11-10 11:32:48 31753 [Note] InnoDB: Restoring possible half-written data pages 2016-11-10 11:32:48 31753 [Note] InnoDB: from the doublewrite buffer... InnoDB: Doing recovery: scanned up to log sequence number 3666856472 InnoDB: Transaction 17522 was in the XA prepared state. InnoDB: 1 transaction(s) which must be rolled back or cleaned up InnoDB: in total 0 row operations to undo InnoDB: Trx id counter is 17920 2016-11-10 11:32:50 31753 [Note] InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: Last MySQL binlog file position 0 490071419, file name mysql-bin.000005 InnoDB: Starting in background the rollback of uncommitted transactions 2016-11-10 11:33:22 31753 [Note] InnoDB: 128 rollback segment(s) are active. 2016-11-10 11:33:22 7f083ca65700 InnoDB: Rollback of non-prepared transactions completed 2016-11-10 11:33:22 31753 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.31-77.0 started; log sequence number 366685 6472 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_cache_size=256m' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_read_status_frequency=1' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_commit_sync=0' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_read_block_size=128K' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_write_status_frequency=1' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_log_dir=//home/root/data/phxsql/pe rcona.workspace/tokudb' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_directio=1' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_fsync_log_period=1000' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_row_format=tokudb_fast' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_read_buf_size=128K' 2016-11-10 11:33:22 31753 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_data_dir=/home/root/data/phxsql/pe rcona.workspace/tokudb' 2016-11-10 11:33:22 31753 [Note] get last uuid f8ec778f-a625-11e6-8e48-90e2ba253bd0 done, last gtid [] 2016-11-10 11:33:22 31753 [Note] get last send gtid svrid [f8ec778f-a625-11e6-8e48-90e2ba253bd0] ret 0, last gtid [] 2016-11-10 11:33:22 31753 [Note] is_gtid_in_binlog_file read [/cache1/phxdata/percona.workspace/binlog/mysql-bin.000005] ret 0 2016-11-10 11:33:23 31753 [Note] [/cache1/phxdata/percona.workspace/binlog/mysql-bin.000005] read event null 2016-11-10 11:33:23 31753 [Note] truncate from [/cache1/phxdata/percona.workspace/binlog/mysql-bin.000005], ret 1 2016-11-10 11:33:23 31753 [Note] truncate_while_init ret 0, m_last_gtid_in_binlog 1fd15947-a623-11e6-8e36-001b21b92df0:7462 2016-11-10 11:33:23 31753 [Note] repl_phx_report_binlog_before_recovery:90 server_uuid [f8ec778f-a625-11e6-8e48-90e2ba253bd0] log _bin_index [/cache1/phxdata/percona.workspace/binlog/mysql-bin.index] ret 0 2016-11-10 11:33:23 31753 [Note] Recovering after a crash using /cache1/phxdata/percona.workspace/binlog/mysql-bin 2016-11-10 11:33:24 31753 [Note] Starting crash recovery... 2016-11-10 11:33:24 7f0877269720 InnoDB: Starting recovery for XA transactions... 2016-11-10 11:33:24 7f0877269720 InnoDB: Transaction 17522 in prepared state after recovery 2016-11-10 11:33:24 7f0877269720 InnoDB: Transaction contains changes to 5 rows 2016-11-10 11:33:24 7f0877269720 InnoDB: 1 transactions in prepared state after recovery 2016-11-10 11:33:24 31753 [Note] Found 1 prepared transaction(s) in InnoDB 2016-11-10 11:33:26 31753 [ERROR] InnoDB: Tried to read 16384 bytes at offset 161677312. Was only able to read 0. 2016-11-10 11:33:26 7f084adac700 InnoDB: Operating system error number 5 in a file operation. InnoDB: Error number 5 means 'Input/output error'. InnoDB: Some operating system error numbers are described at InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html 2016-11-10 11:33:26 31753 [ERROR] InnoDB: File (unknown): 'read' returned OS error 105. Cannot continue operation 161110 11:33:26 mysqld_safe mysqld from pid file /cache1/phxdata/percona.workspace/data/percona.pid ended

请问这是什么原因,有什么解决办法??

Freyayan commented 7 years ago

不过系统日志中也出现下面报错,不知是不是磁盘坏块了

Nov 10 10:47:56 smartd[4475]: Device: /dev/sdb [SAT], 1016 Currently unreadable (pending) sectors (changed +1) Nov 10 10:47:56 smartd[4475]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 193 Nov 10 10:47:58 smartd[4475]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 206 Nov 10 10:47:59 smartd[4475]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 193 Nov 10 11:17:56 smartd[4475]: Device: /dev/sdb [SAT], 1017 Currently unreadable (pending) sectors (changed +1) Nov 10 11:17:56 smartd[4475]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 193 to 181 Nov 10 11:17:58 smartd[4475]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 193 to 200 Nov 10 11:18:00 smartd[4475]: Device: /dev/sdi [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 206 Nov 10 11:32:48 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:32:48 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:32:48 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:32:48 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:32:48 kernel: sd 0:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Nov 10 11:32:48 kernel: sd 0:0:1:0: [sdb] Sense Key : Medium Error [current] Nov 10 11:32:48 kernel: Info fld=0x40f91bc Nov 10 11:32:48 kernel: sd 0:0:1:0: [sdb] Add. Sense: Unrecovered read error Nov 10 11:32:48 kernel: sd 0:0:1:0: [sdb] CDB: Read(10): 28 00 04 0f 91 a2 00 00 20 00 Nov 10 11:32:48 kernel: end_request: critical medium error, dev sdb, sector 68129202 Nov 10 11:33:26 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:33:26 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:33:26 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:33:26 kernel: mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) Nov 10 11:33:26 kernel: sd 0:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE Nov 10 11:33:26 kernel: sd 0:0:1:0: [sdb] Sense Key : Medium Error [current] Nov 10 11:33:26 kernel: Info fld=0x40f91bc Nov 10 11:33:26 kernel: sd 0:0:1:0: [sdb] Add. Sense: Unrecovered read error Nov 10 11:33:26 kernel: sd 0:0:1:0: [sdb] CDB: Read(10): 28 00 04 0f 91 a2 00 00 20 00 Nov 10 11:33:26 kernel: end_request: critical medium error, dev sdb, sector 68129202 Nov 10 11:47:56 smartd[4475]: Device: /dev/sdb [SAT], 1018 Currently unreadable (pending) sectors (changed +1) Nov 10 11:47:56 smartd[4475]: Device: /dev/sdb [SAT], SMART Prefailure Attribute: 5 Reallocated_Sector_Ct changed from 56 to 55 Nov 10 11:47:56 smartd[4475]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 181 to 200 Nov 10 11:47:57 smartd[4475]: Device: /dev/sdb [SAT], ATA error count increased from 26 to 28 Nov 10 11:47:58 smartd[4475]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 206 to 200

wodesuck commented 7 years ago

建议先试一下用单机的MySQL跑,看是否有同样问题。

Freyayan commented 7 years ago

单机测试: 遇到这样问题:

sysbench --test=oltp --test=/cache1/sysbench/sysbench/tests/db/oltp.lua --mysql-host=180.101.24.57 --mysql-user=root --mysql-port=11111 --mysql-db=sbtest --oltp-tables-count=10 --oltp-table-size=1000000 --num-threads=200 --max-requests=100000 --report-interval=1 --max-time=600 --oltp-read-only=off run>>./sysbench200oltp.log

[ 92s] threads: 200, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects: 0.00 ALERT: mysql_drv_query() for query 'COMMIT' failed: 2013 Lost connection to MySQL server during query

Freyayan commented 7 years ago

不好意思,这样的形式还是集群,我用127.0.0.1测试一下

Freyayan commented 7 years ago

你好,我在一台主机进行sysbench测试时ok的,没有报错。

但是一旦在集群测试:ALERT: mysql_drv_query() for query 'COMMIT' failed: 2013 Lost connection to MySQL server during query 抛出这样的报错,并自动停止测试。

测试期间主机节点日志: 016-11-09 10:18:55 40825 [Note] InnoDB: Using atomics to ref count buffer pool pages 2016-11-09 10:18:55 40825 [Note] InnoDB: The InnoDB memory heap is disabled 2016-11-09 10:18:55 40825 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2016-11-09 10:18:55 40825 [Note] InnoDB: Memory barrier is not used 2016-11-09 10:18:55 40825 [Note] InnoDB: Compressed tables use zlib 1.2.3 2016-11-09 10:18:55 40825 [Note] InnoDB: Using CPU crc32 instructions 2016-11-09 10:18:55 40825 [Note] InnoDB: Initializing buffer pool, size = 200.0M 2016-11-09 10:18:55 40825 [Note] InnoDB: Completed initialization of buffer pool 2016-11-09 10:18:55 40825 [Note] InnoDB: The first specified data file /cache1/phxdata/percona.workspace/innodb/ibdata did not exist: a new database to be created! 2016-11-09 10:18:55 40825 [Note] InnoDB: Setting file /cache1/phxdata/percona.workspace/innodb/ibdata size to 1024 MB 2016-11-09 10:18:55 40825 [Note] InnoDB: Database physically writes the file full: wait... InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000 2016-11-09 10:19:02 40825 [Note] InnoDB: Setting log file /cache1/phxdata/percona.workspace/innodb/ib_logfile101 size to 100 MB InnoDB: Progress in MB: 100 2016-11-09 10:19:03 40825 [Note] InnoDB: Setting log file /cache1/phxdata/percona.workspace/innodb/ib_logfile1 size to 100 MB InnoDB: Progress in MB: 100 2016-11-09 10:19:04 40825 [Note] InnoDB: Setting log file /cache1/phxdata/percona.workspace/innodb/ib_logfile2 size to 100 MB InnoDB: Progress in MB: 100 2016-11-09 10:19:05 40825 [Note] InnoDB: Data file /cache1/phxdata/percona.workspace/innodb/undo001 did not exist: new to be created 2016-11-09 10:19:05 40825 [Note] InnoDB: Setting file /cache1/phxdata/percona.workspace/innodb/undo001 size to 10 MB 2016-11-09 10:19:05 40825 [Note] InnoDB: Database physically writes the file full: wait... 2016-11-09 10:19:05 40825 [Note] InnoDB: Data file /cache1/phxdata/percona.workspace/innodb/undo002 did not exist: new to be created 2016-11-09 10:19:05 40825 [Note] InnoDB: Setting file /cache1/phxdata/percona.workspace/innodb/undo002 size to 10 MB 2016-11-09 10:19:05 40825 [Note] InnoDB: Database physically writes the file full: wait... ...skipping... 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 Could not create logging file: Permission denied COULD NOT CREATE A LOGGINGFILE 20161110-153008.49068!Could not create logging file: Permission denied COULD NOT CREATE A LOGGINGFILE 20161110-153008.49068!2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1 2016-11-10 15:30:08 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -1

wodesuck commented 7 years ago

@Freyayan 主机phxbinlogsvr是否正常运行

BTW,单机测试的时候是否使用相同配置,打开了binlog

Freyayan commented 7 years ago

开始测试的时候主机phxbinlogsvr是否正常运行,但是测试停止后,主机备机phxbinlogsvr挂掉了。

单机测试时,binlog是打开的。。

wodesuck commented 7 years ago

@Freyayan 看一下各机binlogsvr日志,是什么原因挂掉。

Freyayan commented 7 years ago

2016-11-10 11:24:12 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 3364 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5268 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5247 2016-11-10 11:24:13 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 688 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5446 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5268 2016-11-10 11:24:17 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 3655 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5447 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5446 2016-11-10 11:24:20 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 2336 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5644 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5447 2016-11-10 11:24:25 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 1013 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5645 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5644 2016-11-10 11:24:27 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 114 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5646 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5645 2016-11-10 11:24:29 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 2649 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5798 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5646 2016-11-10 11:24:33 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 3296 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5845 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5798 2016-11-10 11:24:36 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 3000 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5996 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5845 Could not create logging file: Permission denied COULD NOT CREATE A LOGGINGFILE 20161110-112445.49382!Could not create logging file: Permission denied COULD NOT CREATE A LOGGINGFILE 20161110-112445.49382!2016-11-10 11:24:45 49382 [Note] repl_phx_report_binlog_to_binlogsvr:165 send binlog buffer failed, prev_log_file [ mysql-bin.000006 ], prev_log_pos [ 877858047 ], log_file [ mysql-bin.000006 ], log_pos [ 877904711 ], ret -1 2016-11-10 11:24:45 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 9208 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:6045 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5996 2016-11-10 11:24:45 49382 [Note] repl_phx_report_binlog_to_binlogsvr:165 send binlog buffer failed, prev_log_file [ mysql-bin.000006 ], prev_log_pos [ 877858047 ], log_file [ mysql-bin.000006 ], log_pos [ 877904711 ], ret 2000 2016-11-10 11:24:45 49382 [Note] repl_phx_report_binlog_to_binlogsvr:169 send binlog use 9209 ms, max gtid 1fd15947-a623-11e6-8e36-001b21b92df0:6045 binlog gtid 1fd15947-a623-11e6-8e36-001b21b92df0:5996 2016-11-10 11:24:45 49382 [Note] repl_phx_report_binlog_to_binlogsvr:182 send binlog buffer failed, prev_log_file [mysql-bin.000006], prev_log_pos [877858047], log_file [mysql-bin.000006], log_pos [877904711] maxgtid [1fd15947-a623-11e6-8e36-001b21b92df0:5996] my maxgtid [1fd15947-a623-11e6-8e36-001b21b92df0:6045] ret 2000 161110 11:24:46 mysqld_safe Number of processes running now: 0 161110 11:24:46 mysqld_safe mysqld restarted 2016-11-10 11:24:46 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release. 2016-11-10 11:24:46 0 [Note] /cache1/phxsql/sbin/mysqld (mysqld 5.6.31-77.0-log) starting as process 20828 ... 2016-11-10 11:24:46 20828 [Note] InnoDB: Using atomics to ref count buffer pool pages 2016-11-10 11:24:46 20828 [Note] InnoDB: The InnoDB memory heap is disabled 2016-11-10 11:24:46 20828 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2016-11-10 11:24:46 20828 [Note] InnoDB: Memory barrier is not used 2016-11-10 11:24:46 20828 [Note] InnoDB: Compressed tables use zlib 1.2.3 2016-11-10 11:24:46 20828 [Note] InnoDB: Using CPU crc32 instructions 2016-11-10 11:24:46 20828 [Note] InnoDB: Initializing buffer pool, size = 200.0M 2016-11-10 11:24:46 20828 [Note] InnoDB: Completed initialization of buffer pool 2016-11-10 11:24:46 20828 [Note] InnoDB: Opened 3 undo tablespaces 2016-11-10 11:24:46 20828 [Note] InnoDB: Highest supported file format is Barracuda. 2016-11-10 11:24:46 20828 [Note] InnoDB: Log scan progressed past the checkpoint lsn 3668086066 2016-11-10 11:24:46 20828 [Note] InnoDB: Database was not shutdown normally! 2016-11-10 11:24:46 20828 [Note] InnoDB: Starting crash recovery. 2016-11-10 11:24:46 20828 [Note] InnoDB: Reading tablespace information from the .ibd files... 2016-11-10 11:24:46 20828 [Note] InnoDB: Restoring possible half-written data pages 2016-11-10 11:24:46 20828 [Note] InnoDB: from the doublewrite buffer... InnoDB: Doing recovery: scanned up to log sequence number 3668970401 InnoDB: Transaction 10396 was in the XA prepared state. InnoDB: Transaction 10114 was in the XA prepared state. InnoDB: Transaction 10115 was in the XA prepared state. InnoDB: Transaction 10395 was in the XA prepared state. InnoDB: Transaction 10397 was in the XA prepared state. InnoDB: Transaction 10116 was in the XA prepa

在测试过程中发现主机有一段这样的日志

Freyayan commented 7 years ago

接下来开始recovery, InnoDB: Transaction 10112 was in the XA prepared state. InnoDB: Transaction 10113 was in the XA prepared state. InnoDB: Transaction 10394 was in the XA prepared state. InnoDB: 200 transaction(s) which must be rolled back or cleaned up InnoDB: in total 0 row operations to undo InnoDB: Trx id counter is 10752 2016-11-10 11:24:47 20828 [Note] InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: Last MySQL binlog file position 0 877858047, file name mysql-bin.000006 InnoDB: Starting in background the rollback of uncommitted transactions 2016-11-10 11:24:54 20828 [Note] InnoDB: 128 rollback segment(s) are active. 2016-11-10 11:24:54 7f1e16a10700 InnoDB: Rollback of non-prepared transactions completed 2016-11-10 11:24:54 20828 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.31-77.0 started; log sequence number 3668970401 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_cache_size=256m' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_read_status_frequency=1' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_commit_sync=0' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_read_block_size=128K' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_write_status_frequency=1' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_log_dir=//home/root/data/phxsql/percona.workspace/tokudb' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_directio=1' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_fsync_log_period=1000' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_row_format=tokudb_fast' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_read_buf_size=128K' 2016-11-10 11:24:54 20828 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_data_dir=/home/root/data/phxsql/percona.workspace/tokudb' 2016-11-10 11:24:54 20828 [Note] get last uuid 1fd15947-a623-11e6-8e36-001b21b92df0 done, last gtid [1fd15947-a623-11e6-8e36-001b21b92df0:6045] 2016-11-10 11:24:54 20828 [Note] get last send gtid svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret 0, last gtid [1fd15947-a623-11e6-8e36-001b21b92df0:6045] 2016-11-10 11:24:54 20828 [Note] is_gtid_in_binlog_file read [/cache1/phxdata/percona.workspace/binlog/mysql-bin.000006] ret 0 2016-11-10 11:24:56 20828 [Note] [/cache1/phxdata/percona.workspace/binlog/mysql-bin.000006] read event null 2016-11-10 11:24:56 20828 [Note] truncate from [/cache1/phxdata/percona.workspace/binlog/mysql-bin.000006], ret 1 2016-11-10 11:24:56 20828 [Note] truncate_while_init ret 0, m_last_gtid_in_binlog 1fd15947-a623-11e6-8e36-001b21b92df0:6045 2016-11-10 11:24:56 20828 [Note] repl_phx_report_binlog_before_recovery:90 server_uuid [1fd15947-a623-11e6-8e36-001b21b92df0] log_bin_index [/cache1/phxdata/percona.workspace/binlog/mysql-bin.index] ret 0 2016-11-10 11:24:56 20828 [Note] Recovering after a crash using /cache1/phxdata/percona.workspace/binlog/mysql-bin 2016-11-10 11:24:57 20828 [Note] Starting crash recovery... 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Starting recovery for XA transactions... 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10451 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10450 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10449 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10448 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10447 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10446 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10445 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10444 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10443 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10442 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10441 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10440 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10439 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10438 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10437 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10436 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10435 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10434 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10433 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10432 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10431 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10430 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10429 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10428 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10427 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10426 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction 10425 in prepared state after recovery 2016-11-10 11:24:57 7f1e4ffda720 InnoDB: Transaction contains changes to 4 rows

然后反复执行: 2016-11-10 15:01:57 49068 [Warning] /cache1/phxsql/sbin/mysqld: unknown variable 'loose_tokudb_data_dir=/home/root/data/phxsql/percona.workspace/tokudb' Could not create logging file: Permission denied COULD NOT CREATE A LOGGINGFILE 20161110-150158.49068!Could not create logging file: Permission denied COULD NOT CREATE A LOGGINGFILE 20161110-150158.49068!2016-11-10 15:01:58 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -202 2016-11-10 15:01:59 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -202 2016-11-10 15:02:00 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -202 2016-11-10 15:02:01 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -202 2016-11-10 15:02:02 49068 [Note] get last send gtid failed, svrid [1fd15947-a623-11e6-8e36-001b21b92df0] ret -202 2016-11-10 15:02:02 49068 [Note] InnoDB: Rollback of trx with id 44764 completed 2016-11-10 15:02:02 7f59b0333700 InnoDB: Rolling back trx with id 44739, 1 rows to undo 2016-11-10 15:02:03 49068 [Note] InnoDB: Rollback of trx with id 44739 completed 2016-11-10 15:02:03 7f59b0333700 InnoDB: Rolling back trx with id 44736, 1 rows to undo

wodesuck commented 7 years ago

@Freyayan 看一下binlogsvr的日志