dial-once / docker-mariadb-galera

Clusterable MariaDB galera cluster with auto discovery - made for Docker Cloud
GNU General Public License v3.0

Second Node Dies After Startup (local run) #5

Open madhavajay opened 8 years ago

madhavajay commented 8 years ago

After starting the second node I get this error and it quits.

2016-10-18 13:14:36 139914817562368 [Note] WSREP: view((empty))
2016-10-18 13:14:36 139914817562368 [Note] WSREP: gcomm: closed
2016-10-18 13:14:36 139914817562368 [Note] WSREP: mysqld: Terminated.
/run.sh: line 118:   169 Aborted                 mysqld --datadir="$DATADIR" --wsrep_node_address=$HOSTNAME --wsrep_cluster_address=$CLUSTER_ADDR

madhavajay commented 8 years ago

docker run -it -e MYSQL_ALLOW_EMPTY_PASSWORD=true --name=mariadb-2 -e HOSTNAME=mariadb-2 --rm --link mariadb-1:mariadb-1 dialonce/mariadb-galera:latest
#
# Galera Cluster: container settings
#

[server]
bind-address=0.0.0.0
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
max_allowed_packet=32M

[galera]
wsrep_on=ON
wsrep_provider="/usr/lib/galera/libgalera_smm.so"
wsrep-sst-method=rsync

#
# Optional setting
#

# Tune this value for your system, roughly 2x cores; see https://mariadb.com/kb/en/mariadb/galera-cluster-system-variables/#wsrep_slave_threads
wsrep_slave_threads=1

# innodb_flush_log_at_trx_commit=0
Initializing database
2016-10-18 13:22:01 140148423899072 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 48 ...
2016-10-18 13:22:01 140148423899072 [ERROR] WSREP: rsync SST method requires wsrep_cluster_address to be configured on startup.
2016-10-18 13:22:01 140148423899072 [Warning] You need to use --log-bin to make --binlog-format work.
2016-10-18 13:22:01 7f76d905b7c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: The InnoDB memory heap is disabled
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Using Linux native AIO
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Using SSE crc32 instructions
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Initializing buffer pool, size = 256.0M
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Completed initialization of buffer pool
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: The first specified data file ./ibdata1 did not exist: a new database to be created!
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Setting file ./ibdata1 size to 12 MB
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Database physically writes the file full: wait...
2016-10-18 13:22:01 140148423899072 [Note] InnoDB: Setting log file ./ib_logfile101 size to 48 MB
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Setting log file ./ib_logfile1 size to 48 MB
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2016-10-18 13:22:02 140148423899072 [Warning] InnoDB: New log files created, LSN=45883
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Doublewrite buffer not found: creating new
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Doublewrite buffer created
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: 128 rollback segment(s) are active.
2016-10-18 13:22:02 140148423899072 [Warning] InnoDB: Creating foreign key constraint system tables.
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Foreign key constraint system tables created
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Creating tablespace and datafile system tables.
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Tablespace and datafile system tables created.
2016-10-18 13:22:02 140148423899072 [Note] InnoDB: Waiting for purge to start
2016-10-18 13:22:02 140148423899072 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 0
2016-10-18 13:22:03 140147702101760 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-10-18 13:22:06 140376544741312 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 76 ...
2016-10-18 13:22:06 140376544741312 [ERROR] WSREP: rsync SST method requires wsrep_cluster_address to be configured on startup.
2016-10-18 13:22:06 140376544741312 [Warning] You need to use --log-bin to make --binlog-format work.
2016-10-18 13:22:06 7fabf61587c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: The InnoDB memory heap is disabled
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Using Linux native AIO
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Using SSE crc32 instructions
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Initializing buffer pool, size = 256.0M
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Completed initialization of buffer pool
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Highest supported file format is Barracuda.
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: 128 rollback segment(s) are active.
2016-10-18 13:22:06 140376544741312 [Note] InnoDB: Waiting for purge to start
2016-10-18 13:22:06 140376544741312 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 1616799
2016-10-18 13:22:06 140375834461952 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-10-18 13:22:08 140501271156672 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 105 ...
2016-10-18 13:22:08 140501271156672 [ERROR] WSREP: rsync SST method requires wsrep_cluster_address to be configured on startup.
2016-10-18 13:22:08 140501271156672 [Warning] You need to use --log-bin to make --binlog-format work.
2016-10-18 13:22:08 7fc9005b97c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: The InnoDB memory heap is disabled
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Using Linux native AIO
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Using SSE crc32 instructions
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Initializing buffer pool, size = 256.0M
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Completed initialization of buffer pool
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Highest supported file format is Barracuda.
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: 128 rollback segment(s) are active.
2016-10-18 13:22:09 140501271156672 [Note] InnoDB: Waiting for purge to start
2016-10-18 13:22:09 140501271156672 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 1616809
2016-10-18 13:22:09 140500560480000 [Note] InnoDB: Dumping buffer pool(s) not yet started

PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER !
To do so, start the server, then issue the following commands:

'/usr/bin/mysqladmin' -u root password 'new-password'
'/usr/bin/mysqladmin' -u root -h  password 'new-password'

Alternatively you can run:
'/usr/bin/mysql_secure_installation'

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the MariaDB Knowledgebase at http://mariadb.com/kb or the
MySQL manual for more instructions.

Please report any problems at http://mariadb.org/jira

The latest information about MariaDB is available at http://mariadb.org/.
You can find additional information about the MySQL part at:
http://dev.mysql.com
Support MariaDB development by buying support/new features from MariaDB
Corporation Ab. You can contact us about this at sales@mariadb.com.
Alternatively consider joining our community based development effort:
http://mariadb.com/kb/en/contributing-to-the-mariadb-project/

Database initialized
MySQL init process in progress...
2016-10-18 13:22:11 140145844123584 [Note] mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 132 ...
2016-10-18 13:22:11 140145844123584 [ERROR] WSREP: rsync SST method requires wsrep_cluster_address to be configured on startup.
2016-10-18 13:22:11 140145844123584 [Warning] You need to use --log-bin to make --binlog-format work.
2016-10-18 13:22:11 7f763f4177c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: The InnoDB memory heap is disabled
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Using Linux native AIO
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Using SSE crc32 instructions
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Initializing buffer pool, size = 256.0M
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Completed initialization of buffer pool
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Highest supported file format is Barracuda.
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: 128 rollback segment(s) are active.
2016-10-18 13:22:11 140145844123584 [Note] InnoDB: Waiting for purge to start
2016-10-18 13:22:11 140145844123584 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 1616819
2016-10-18 13:22:11 140145130964736 [Note] InnoDB: Dumping buffer pool(s) not yet started
2016-10-18 13:22:11 140145844123584 [Note] Plugin 'FEEDBACK' is disabled.
2016-10-18 13:22:11 140145844123584 [Warning] 'user' entry 'root@55fbef69f5f3' ignored in --skip-name-resolve mode.
2016-10-18 13:22:11 140145844123584 [Warning] 'proxies_priv' entry '@% root@55fbef69f5f3' ignored in --skip-name-resolve mode.
2016-10-18 13:22:11 140145844123584 [Note] mysqld: ready for connections.
Version: '10.1.18-MariaDB-1~jessie'  socket: '/var/run/mysqld/mysqld.sock'  port: 0  mariadb.org binary distribution
Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
2016-10-18 13:22:13 140145843256064 [Warning] 'proxies_priv' entry '@% root@55fbef69f5f3' ignored in --skip-name-resolve mode.
2016-10-18 13:22:13 140145842952960 [Note] mysqld: Normal shutdown

2016-10-18 13:22:13 140145842952960 [Note] Event Scheduler: Purging the queue. 0 events
2016-10-18 13:22:13 140145114179328 [Note] InnoDB: FTS optimize thread exiting.
2016-10-18 13:22:13 140145842952960 [Note] InnoDB: Starting shutdown...
2016-10-18 13:22:15 140145842952960 [Note] InnoDB: Shutdown completed; log sequence number 1616829
2016-10-18 13:22:15 140145842952960 [Note] mysqld: Shutdown complete

MySQL init process done. Ready for start up.

2016-10-18 13:22:16 140339257296832 [Note] mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 170 ...
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Read nil XID from storage engines, skipping position init
2016-10-18 13:22:16 140339257296832 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2016-10-18 13:22:16 140339257296832 [Note] WSREP: wsrep_load(): Galera 25.3.18(r3632) by Codership Oy <info@codership.com> loaded successfully.
2016-10-18 13:22:16 140339257296832 [Note] WSREP: CRC-32C: using hardware acceleration.
2016-10-18 13:22:16 140339257296832 [Warning] WSREP: Could not open state file for reading: '/data//grastate.dat'
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Passing config to GCS: base_dir = /data/; base_host = mariadb-2; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo
2016-10-18 13:22:16 140339020400384 [Note] WSREP: Service thread queue flushed.
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2016-10-18 13:22:16 140339257296832 [Note] WSREP: wsrep_sst_grab()
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Start replication
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2016-10-18 13:22:16 140339257296832 [Note] WSREP: protonet asio version 0
2016-10-18 13:22:16 140339257296832 [Note] WSREP: Using CRC-32C for message checksums.
2016-10-18 13:22:16 140339257296832 [Note] WSREP: backend: asio
2016-10-18 13:22:16 140339257296832 [Note] WSREP: gcomm thread scheduling priority set to other:0
2016-10-18 13:22:16 140339257296832 [Warning] WSREP: access file(/data//gvwstate.dat) failed(No such file or directory)
2016-10-18 13:22:16 140339257296832 [Note] WSREP: restore pc from disk failed
2016-10-18 13:22:16 140339257296832 [Note] WSREP: GMCast version 0
2016-10-18 13:22:21 140339257296832 [Warning] WSREP: Failed to resolve tcp://mariadb-2:4567
2016-10-18 13:22:21 140339257296832 [Note] WSREP: (e35bb6f3, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2016-10-18 13:22:21 140339257296832 [Note] WSREP: (e35bb6f3, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2016-10-18 13:22:21 140339257296832 [Note] WSREP: EVS version 0
2016-10-18 13:22:21 140339257296832 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer 'mariadb-2:,mariadb-1:'
2016-10-18 13:22:21 140339257296832 [Note] WSREP: (e35bb6f3, 'tcp://0.0.0.0:4567') connection established to cf12e3e6 tcp://172.17.0.2:4567
2016-10-18 13:22:21 140339257296832 [Note] WSREP: (e35bb6f3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2016-10-18 13:22:21 140339257296832 [Note] WSREP: declaring cf12e3e6 at tcp://172.17.0.2:4567 stable
2016-10-18 13:22:21 140339257296832 [Note] WSREP: Node cf12e3e6 state prim
2016-10-18 13:22:21 140339257296832 [Note] WSREP: view(view_id(PRIM,cf12e3e6,2) memb {
    cf12e3e6,0
    e35bb6f3,0
} joined {
} left {
} partitioned {
})
2016-10-18 13:22:21 140339257296832 [Note] WSREP: save pc into disk
2016-10-18 13:22:22 140339257296832 [Note] WSREP: gcomm: connected
2016-10-18 13:22:22 140339257296832 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2016-10-18 13:22:22 140339257296832 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2016-10-18 13:22:22 140339257296832 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2016-10-18 13:22:22 140339257296832 [Note] WSREP: Waiting for SST to complete.
2016-10-18 13:22:22 140338958165760 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2016-10-18 13:22:22 140338958165760 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2016-10-18 13:22:22 140338958165760 [Note] WSREP: STATE EXCHANGE: sent state msg: e6a60ad3-9535-11e6-b5ae-2e86c99933d6
2016-10-18 13:22:22 140338958165760 [Note] WSREP: STATE EXCHANGE: got state msg: e6a60ad3-9535-11e6-b5ae-2e86c99933d6 from 0 (c781480c62b2)
2016-10-18 13:22:22 140338958165760 [Note] WSREP: STATE EXCHANGE: got state msg: e6a60ad3-9535-11e6-b5ae-2e86c99933d6 from 1 (55fbef69f5f3)
2016-10-18 13:22:22 140338958165760 [Note] WSREP: Quorum results:
    version    = 4,
    component  = PRIMARY,
    conf_id    = 1,
    members    = 1/2 (joined/total),
    act_id     = 0,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = d21358c8-9535-11e6-a011-b6dd76e494f1
2016-10-18 13:22:22 140338958165760 [Note] WSREP: Flow-control interval: [23, 23]
2016-10-18 13:22:22 140338958165760 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
2016-10-18 13:22:22 140339256982272 [Note] WSREP: State transfer required:
    Group state: d21358c8-9535-11e6-a011-b6dd76e494f1:0
    Local state: 00000000-0000-0000-0000-000000000000:-1
2016-10-18 13:22:22 140339256982272 [Note] WSREP: New cluster view: global state: d21358c8-9535-11e6-a011-b6dd76e494f1:0, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
2016-10-18 13:22:22 140339256982272 [Warning] WSREP: Gap in state sequence. Need state transfer.
2016-10-18 13:22:22 140338928809728 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address 'mariadb-2' --datadir '/data/'   --parent '170'  '' '
2016-10-18 13:22:22 140339256982272 [Note] WSREP: Prepared SST request: rsync|mariadb-2:4444/rsync_sst
2016-10-18 13:22:22 140339256982272 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-10-18 13:22:22 140339256982272 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-10-18 13:22:22 140339020400384 [Note] WSREP: Service thread queue flushed.
2016-10-18 13:22:22 140339256982272 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2016-10-18 13:22:22 140339020400384 [Note] WSREP: Service thread queue flushed.
2016-10-18 13:22:22 140339256982272 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (d21358c8-9535-11e6-a011-b6dd76e494f1): 1 (Operation not permitted)
     at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2016-10-18 13:22:22 140338958165760 [Note] WSREP: Member 1.0 (55fbef69f5f3) requested state transfer from '*any*'. Selected 0.0 (c781480c62b2)(SYNCED) as donor.
2016-10-18 13:22:22 140338958165760 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
2016-10-18 13:22:22 140339256982272 [Note] WSREP: Requesting state transfer: success, donor: 0
2016-10-18 13:22:22 140338958165760 [Warning] WSREP: 0.0 (c781480c62b2): State transfer to 1.0 (55fbef69f5f3) failed: -255 (Unknown error 255)
2016-10-18 13:22:22 140338958165760 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
2016-10-18 13:22:22 140338958165760 [Note] WSREP: gcomm: terminating thread
2016-10-18 13:22:22 140338958165760 [Note] WSREP: gcomm: joining thread
2016-10-18 13:22:22 140338958165760 [Note] WSREP: gcomm: closing backend
2016-10-18 13:22:23 140338958165760 [Note] WSREP: view(view_id(NON_PRIM,cf12e3e6,2) memb {
    e35bb6f3,0
} joined {
} left {
} partitioned {
    cf12e3e6,0
})
2016-10-18 13:22:23 140338958165760 [Note] WSREP: view((empty))
2016-10-18 13:22:23 140338958165760 [Note] WSREP: gcomm: closed
2016-10-18 13:22:23 140338958165760 [Note] WSREP: mysqld: Terminated.
madhavajay commented 8 years ago

Maybe it's this part?

[Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (d21358c8-9535-11e6-a011-b6dd76e494f1): 1 (Operation not permitted)
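
One way to check this guess is to list only the WSREP warnings and errors in the order they appear. The sketch below is a hypothetical helper, not part of the image or of Galera. It assumes the log format shown in this thread, with a `[Note]`, `[Warning]` or `[ERROR]` tag before each message. The sample lines are copied from the second node's log above.

```python
import re

# Matches the severity tag MariaDB puts in each log line, e.g. "[ERROR]".
LEVEL_RE = re.compile(r"\[(Note|Warning|ERROR)\]")


def wsrep_problems(log_text):
    """Return (level, message) pairs for WSREP warnings and errors, in log order."""
    problems = []
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if not match or match.group(1) == "Note":
            continue  # skip untagged lines and informational notes
        message = line[match.end():].strip()
        if message.startswith("WSREP:"):
            problems.append((match.group(1), message))
    return problems


# Three lines copied from the log in this thread.
SAMPLE = """\
2016-10-18 13:22:22 140339256982272 [Note] WSREP: Requesting state transfer: success, donor: 0
2016-10-18 13:22:22 140338958165760 [Warning] WSREP: 0.0 (c781480c62b2): State transfer to 1.0 (55fbef69f5f3) failed: -255 (Unknown error 255)
2016-10-18 13:22:22 140338958165760 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
"""

for level, message in wsrep_problems(SAMPLE):
    print(level, message)
```

In the full log the IST message is only a `[Warning]`, and the node still goes on to request a state transfer. The lines right before termination are the failed state transfer (-255) and the `[ERROR]` "Will never receive state", so those may be the more relevant ones.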
madhavajay commented 8 years ago

I have tried a lot of different things and read a lot, but I still can't figure this out. Everything looks correct in the scripts: the first node starts with only itself in the cluster address and creates a new cluster. The second node starts with both hosts in the cluster address and contacts the first node successfully. For several seconds they appear to be in sync, as both logs show the same output and they exchange data, but then the second node just terminates. I'm using the current stable Docker for Mac, and the only change I made was mapping the first node's port 3306 to host port 13306, which shouldn't cause any issues.

This seems like a really awesome setup, so it's a pity it's not working for me. Any help would be greatly appreciated.

PuKoren commented 8 years ago

Hello @madhavajay !

Sorry for taking so long to answer. I have the exact same issue on my local computer. The README instructions were written for a specific target usage and do not work yet on a local computer. I think it is because of the Galera ports and the way Docker exposes ports between linked containers.

However, I can guarantee you that it works well on Docker Cloud, Kubernetes and the like, where containers can reach other containers on all exposed ports (we use it in production with the provided Docker Cloud yml, and the Galera cluster has one week of uptime so far with no failures noticed).

I will have to take a look at it. Thanks for creating the issue.

madhavajay commented 8 years ago

Yeah, it's weird. I suspected ports, so I installed netstat, and it seems that the primary node is only listening on 3306 and 4567. Does it use these to initiate the connection and then open 4444, or is 4444 simply missing? I don't know enough about Galera to comment, but I agree it seems to be related to some kind of wsrep connection issue.

Could it be worth trying an alternative SST method? Also, I noticed the documentation uses underscores, but the conf file uses hyphens: wsrep-sst-method=rsync

madhavajay commented 8 years ago

Actually, it could even be port 4568 for IST.

Error:

2016-10-18 13:22:22 140339256982272 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (d21358c8-9535-11e6-a011-b6dd76e494f1): 1 (Operation not permitted)
     at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.

http://galeracluster.com/documentation-webpages/firewallsettings.html

3306 For MySQL client connections and State Snapshot Transfer that use the mysqldump method.
4567 For Galera Cluster replication traffic, multicast replication uses both UDP transport and TCP on this port.
4568 For Incremental State Transfer.
4444 For all other State Snapshot Transfer.

Also is UDP needed for port 4567, or is it just optional?
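
The TCP side of that port list can be probed from inside one container against the other. This is a minimal sketch, assuming Python is available in the container, which this image may not provide. `check_tcp_ports` and `GALERA_PORTS` are hypothetical names. It tests TCP only, so it says nothing about UDP on 4567.

```python
import socket

# TCP ports from the Galera firewall documentation quoted above.
GALERA_PORTS = {
    3306: "MySQL client connections / mysqldump SST",
    4567: "Galera replication traffic",
    4568: "Incremental State Transfer (IST)",
    4444: "all other State Snapshot Transfer (SST) methods",
}


def check_tcp_ports(host, ports, timeout=2.0):
    """Try a TCP connect to each port and return {port: True or False}."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and completes the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and failed name resolution.
            results[port] = False
    return results
```

Calling `check_tcp_ports("mariadb-2", GALERA_PORTS)` from the first container would exercise both name resolution and reachability toward the joiner. The joiner's log shows it preparing its SST request as `rsync|mariadb-2:4444/rsync_sst`, which suggests 4444 is opened on the joining node during the transfer rather than on the primary. Treat that as a reading of the log, not a confirmed fact. A closed 4444 on an idle node is therefore not conclusive on its own.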

PuKoren commented 8 years ago

TBH I don't know Galera well enough to give you an answer on this one. I'll try to tweak the setup a bit and see which configuration allows it to work locally, to understand what actually happens.

I think it may be because the first node tries to resolve the second node by its hostname (mariadb-2), but since the first container is not linked to it, it can't resolve the name.
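
The joiner's log is consistent with this idea: it advertises its SST address by name (`Prepared SST request: rsync|mariadb-2:4444/rsync_sst`), so the donor would need to resolve `mariadb-2`. The `--link mariadb-1:mariadb-1` flag was only passed to the second container, and legacy links work in one direction. The sketch below pulls the host and port out of that log line and checks whether the current machine can resolve a name. Both function names are hypothetical, and the regular expression assumes the exact log wording seen in this thread.

```python
import re
import socket

# Matches e.g. "Prepared SST request: rsync|mariadb-2:4444/rsync_sst"
SST_REQUEST_RE = re.compile(r"Prepared SST request: (\w+)\|([^:/\s]+):(\d+)")


def parse_sst_request(log_line):
    """Return (method, host, port) from a 'Prepared SST request' line, or None."""
    match = SST_REQUEST_RE.search(log_line)
    if match is None:
        return None
    method, host, port = match.groups()
    return method, host, int(port)


def can_resolve(hostname):
    """Return True if this machine can resolve hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

Running `can_resolve("mariadb-2")` inside the first container would show whether the donor can find the joiner by name. If it returns False, that would support the hostname explanation, though it would not rule out a port problem as well.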

Thanks for the heads-up on the conf file. I will fix the syntax of wsrep-sst-method.

madhavajay commented 8 years ago

When I debug the commands, the first node starts without mariadb-2 in its gcomm:// list. I thought that was expected, because if I put it in the conf file, then even with the --wsrep-new-cluster param it still hangs trying to connect to mariadb-2, which hasn't started yet.

It doesn't explain why the second node seems to connect and do some form of communication with the first node. It seems like they talk but then can't sync...