codership / galera

Synchronous multi-master replication library
GNU General Public License v2.0

galera ipv6 support ? #519

Open TingweiPei opened 6 years ago

TingweiPei commented 6 years ago

Environment: CentOS 7.3 x64, MySQL 5.7.21, Galera 3.24.

I'm trying to build a multi-master Galera cluster. It works when I use IPv4 but fails when I use IPv6: the first node starts successfully, but the other node can't join the cluster. The configuration and errors are as follows.

/etc/my.cnf

wsrep_cluster_address="gcomm://[fe80::9888:8eb6:52ef:17c9]:4567,[fe80::9403:a2fd:c561:b769]:4567,[fe80::6be0:f264:255a:b3f]:4567"
wsrep_provider_options= "gmcast.listen_addr=tcp://[::]:4567;ist.recv_addr=[fe80::9403:a2fd:c561:b769]:4568; "
wsrep_sst_method=rsync
server_id=2
#wsrep_node_address="192.168.0.124"
wsrep_node_address="fe80::9403:a2fd:c561:b769"
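
For anyone assembling the peer list by script, here is a minimal sketch of the bracketed `[addr]:port` form used in wsrep_cluster_address above. The helper name `gcomm_url` is made up for illustration and is not part of Galera; it only shows that IPv6 literals are wrapped in brackets while IPv4 literals and hostnames are not.

```python
import ipaddress
from typing import Iterable


def gcomm_url(hosts: Iterable[str], port: int = 4567) -> str:
    """Build a gcomm:// peer list, bracketing IPv6 literals (hypothetical helper)."""
    parts = []
    for host in hosts:
        try:
            is_v6 = isinstance(ipaddress.ip_address(host), ipaddress.IPv6Address)
        except ValueError:
            is_v6 = False  # not an IP literal, so treat it as a hostname
        # IPv6 literals need brackets so the trailing :port is unambiguous
        parts.append(f"[{host}]:{port}" if is_v6 else f"{host}:{port}")
    return "gcomm://" + ",".join(parts)


if __name__ == "__main__":
    print(gcomm_url(["fe80::9888:8eb6:52ef:17c9", "192.168.0.124", "node1"]))
```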

Error log:

2018-09-11T06:13:52.782312Z 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = fe80::9403:a2fd:c561:b769; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://[::]:4567; gmcast.segment = 0; gmcast.version = 0; is
2018-09-11T06:13:52.797190Z 0 [Note] WSREP: GCache history reset: 18ea39e8-b180-11e8-8a1c-dfe2ce821470:0 -> 18ea39e8-b180-11e8-8a1c-dfe2ce821470:2
2018-09-11T06:13:52.820781Z 0 [Note] WSREP: Assign initial position for certification: 2, protocol version: -1
2018-09-11T06:13:52.820815Z 0 [Note] WSREP: wsrep_sst_grab()
2018-09-11T06:13:52.820821Z 0 [Note] WSREP: Start replication
2018-09-11T06:13:52.820839Z 0 [Note] WSREP: Setting initial position to 18ea39e8-b180-11e8-8a1c-dfe2ce821470:2
2018-09-11T06:13:52.820934Z 0 [Note] WSREP: protonet asio version 0
2018-09-11T06:13:52.821019Z 0 [Note] WSREP: Using CRC-32C for message checksums.
2018-09-11T06:13:52.821042Z 0 [Note] WSREP: backend: asio
2018-09-11T06:13:52.821093Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
2018-09-11T06:13:52.821165Z 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2018-09-11T06:13:52.821170Z 0 [Note] WSREP: restore pc from disk failed
2018-09-11T06:13:52.821696Z 0 [Note] WSREP: GMCast version 0
2018-09-11T06:13:52.821889Z 0 [Note] WSREP: (db3ddd95, 'tcp://[::]:4567') listening at tcp://[::]:4567
2018-09-11T06:13:52.821897Z 0 [Note] WSREP: (db3ddd95, 'tcp://[::]:4567') multicast: , ttl: 1
2018-09-11T06:13:52.822160Z 0 [Note] WSREP: EVS version 0
2018-09-11T06:13:52.822224Z 0 [Note] WSREP: gcomm: connecting to group 'galera_cluster1', peer '[fe80::9888:8eb6:52ef:17c9]:4567,[fe80::9403:a2fd:c561:b769]:4567,[fe80::6be0:f264:255a:b3f]:4567'
2018-09-11T06:13:55.828415Z 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
2018-09-11T06:13:55.828524Z 0 [Note] WSREP: view(view_id(NON_PRIM,db3ddd95,1) memb {
        db3ddd95,0
} joined {
} left {
} partitioned {
})
2018-09-11T06:13:56.329263Z 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.5071S), skipping check
2018-09-11T06:14:25.902590Z 0 [Note] WSREP: view((empty))
2018-09-11T06:14:25.902753Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
2018-09-11T06:14:25.902770Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend connection: -110 (Connection timed out)
2018-09-11T06:14:25.903016Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'galera_cluster1' at 'gcomm://[fe80::9888:8eb6:52ef:17c9]:4567,[fe80::9403:a2fd:c561:b769]:4567,[fe80::6be0:f264:255a:b3f]:4567': -110 (Connection timed out)
2018-09-11T06:14:25.903040Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
2018-09-11T06:14:25.903048Z 0 [ERROR] WSREP: wsrep::connect(gcomm://[fe80::9888:8eb6:52ef:17c9]:4567,[fe80::9403:a2fd:c561:b769]:4567,[fe80::6be0:f264:255a:b3f]:4567) failed: 7
2018-09-11T06:14:25.903052Z 0 [ERROR] Aborting

claudionanni commented 5 years ago

Same here. It looks like it only works with the unspecified IPv6 address [::], which of course can't be used to point to remote nodes. It appears to be a problem with resolving the IPv6 scope_id, which link-local (fe80::) addresses need.
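
To make the scope_id point concrete: fe80::/10 addresses are link-local, so they are only unique per network link and the OS needs a zone (scope) identifier to know which interface to use. The sketch below uses only Python's standard `ipaddress` module to flag link-local peers in a gcomm:// list. The function name `link_local_peers` is hypothetical, and this says nothing about how Galera itself parses addresses.

```python
import ipaddress
from typing import List


def link_local_peers(cluster_address: str) -> List[str]:
    """Return the bracketed IPv6 link-local hosts in a gcomm:// peer list."""
    prefix = "gcomm://"
    if cluster_address.startswith(prefix):
        cluster_address = cluster_address[len(prefix):]
    flagged = []
    for entry in cluster_address.split(","):
        entry = entry.strip()
        if not entry.startswith("["):
            continue  # hostname or IPv4 literal; not checked here
        literal = entry[1:entry.index("]")]
        # Drop a zone suffix such as %eth0 before parsing the address
        addr = ipaddress.IPv6Address(literal.split("%")[0])
        if addr.is_link_local:
            flagged.append(literal)
    return flagged


if __name__ == "__main__":
    peers = "gcomm://[fe80::9888:8eb6:52ef:17c9]:4567,[fe80::9403:a2fd:c561:b769]:4567"
    print(link_local_peers(peers))
```

If this flags your peers, switching to routable addresses (global or unique-local such as fd00::/8) sidesteps the zone question entirely.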

dciabrin commented 5 years ago

I can confirm that galera 25.3.26 fixes the IPv6 connection issue for me.

I have a 3-node cluster <node1,node2,node3>, where each node name is an FQDN that resolves to an IPv6 address.

I can bootstrap the 3-node cluster with the following config, e.g. for node2:

[mysqld]
wsrep_on=ON
skip-name-resolve=1
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
innodb_doublewrite=1
max_connections=2048
query_cache_size=0
query_cache_type=0
bind_address=node2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="ratester"
wsrep_cluster_address="gcomm://node1,node2,node3"
wsrep_provider_options = gmcast.listen_addr=tcp://[fd00:be38:af4c:a9a6:0:ee39:f850:c5f2]:4567;
wsrep_slave_threads=1
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=
wsrep_sst_method=rsync

The bind_address resolves to the same IPv6 address as the one configured in gmcast.listen_addr.

IST and SST automatically use the FQDN specified in bind_address, so all traffic goes through the same interface and IPv6 address.
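
A quick way to sanity-check that condition before starting a node is sketched below. Both function names are hypothetical and the parsing assumes the `tcp://[addr]:port` form shown in the config above. It pulls the address out of the provider options string and checks whether a hostname resolves to it via the system resolver.

```python
import ipaddress
import socket


def listen_addr_host(provider_options: str) -> str:
    """Extract the bracketed host of gmcast.listen_addr from a provider options string."""
    for option in provider_options.split(";"):
        key, _, value = option.strip().partition("=")
        if key.strip() == "gmcast.listen_addr":
            # value is assumed to look like tcp://[addr]:port
            host_port = value.strip().split("://", 1)[-1]
            return host_port[host_port.index("[") + 1:host_port.index("]")]
    raise ValueError("gmcast.listen_addr not found")


def resolves_to(hostname: str, expected: str) -> bool:
    """Return True if hostname has an IPv6 record equal to the expected address."""
    want = ipaddress.IPv6Address(expected)
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET6)
    # info[4][0] is the address string; strip any %zone suffix before comparing
    return any(ipaddress.IPv6Address(info[4][0].split("%")[0]) == want for info in infos)


if __name__ == "__main__":
    opts = "gmcast.listen_addr=tcp://[fd00:be38:af4c:a9a6:0:ee39:f850:c5f2]:4567;"
    print(listen_addr_host(opts))
```

On a real node you would call `resolves_to("node2", listen_addr_host(opts))`, with `node2` being whatever bind_address is set to. A `False` result suggests Galera and MySQL may end up on different addresses.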

alexander-krug commented 1 year ago

@TingweiPei @claudionanni I created working galera.cnf config files for a 3-server MariaDB 10.3 setup running on CentOS 8. Here is the link to my repo with the config files and details on how to set everything up: IPv6-only Galeria Cluster config files. I hope this helps!