openark / orchestrator

MySQL replication topology management and HA
Apache License 2.0

Discovery doesn't discover in extremely simple setup #1492

Open unixtastic opened 1 year ago

unixtastic commented 1 year ago

Orchestrator discovery only works if I run it manually on every node, starting at the master and working down the tree. I tried Orchestrator 3.2.6 on Ubuntu 18.04 and 22.04. All test MySQL servers run Percona 5.7.43-47 and are replicating correctly.

orchestrator.conf.json:

{
  "Debug": false,
  "MySQLTopologyUser": "orchestrator",
  "MySQLTopologyPassword": "orch_topology_password",
  "DiscoverByShowSlaveHosts": true,
  "InstancePollSeconds": 5,
  "BackendDB": "sqlite",
  "SQLite3DataFile": "/usr/local/orchestrator/orchestrator.sqlite3",
  "MySQLTopologySSLSkipVerify": true,
  "MySQLTopologyUseMutualTLS": false
}

mysql.cnf:

[mysqld]
server_id = 83
character-set-server = utf8
log_bin = /var/lib/mysql/binary/mysqld-bin.log
report_host = ip-10-200-1-83.eu-west-1.compute.internal
log_slave_updates = 1
master_info_repository = 'TABLE'

The MySQL server_id and UUID are unique on each server.

If I turn debug on I see orchestrator is pulling information from SHOW SLAVE HOSTS on the master:

root@ip-10-200-1-119:/usr/local/orchestrator# orchestrator -c discover -i ip-10-200-1-109:3306
2023-09-09 14:07:40 DEBUG Hostname unresolved yet: ip-10-200-1-109
2023-09-09 14:07:40 DEBUG Cache hostname resolve ip-10-200-1-109 as ip-10-200-1-109
2023-09-09 14:07:40 DEBUG Connected to orchestrator backend: sqlite on /usr/local/orchestrator/orchestrator.sqlite3
2023-09-09 14:07:40 DEBUG Initializing orchestrator
2023-09-09 14:07:40 INFO Connecting to backend :3306: maxConnections: 128, maxIdleConns: 32
2023-09-09 14:07:40 DEBUG Hostname unresolved yet: ip-10-200-1-83.eu-west-1.compute.internal
2023-09-09 14:07:40 DEBUG Cache hostname resolve ip-10-200-1-83.eu-west-1.compute.internal as ip-10-200-1-83.eu-west-1.compute.internal
2023-09-09 14:07:40 DEBUG Hostname unresolved yet: ip-10-200-1-183.eu-west-1.compute.internal
2023-09-09 14:07:40 DEBUG Cache hostname resolve ip-10-200-1-183.eu-west-1.compute.internal as ip-10-200-1-183.eu-west-1.compute.internal
ip-10-200-1-109:3306

Yet the topology lists only the master and none of its slaves:

ip-10-200-1-109:3306 [unknown,unchecked,5.7.43-47-log,rw,ROW,>>]

The orchestrator user account and the replication account are correctly set up on all servers.

unixtastic commented 1 year ago

The test above was with AWS instances. I see exactly the same issue with DigitalOcean VMs that have all hostnames in /etc/hosts.

# orchestrator -c discover -i db1
db1:3306
# orchestrator -c topology -i db1
db1:3306 [0s,ok,5.7.43-47-log,rw,ROW,>>]
# orchestrator -c discover -i db2
db2:3306
# orchestrator -c topology -i db1
db1:3306   [unknown,unchecked,5.7.43-47-log,rw,ROW,>>]
+ db2:3306 [0s,ok,5.7.43-47-log,rw,ROW,>>]
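
The manual workaround shown in the session above (discovering each host in turn, master first) can be scripted. This is a minimal sketch, not something taken from orchestrator itself: `discover_commands` and `discover_all` are hypothetical helper names, and the only thing borrowed from this thread is the `orchestrator -c discover -i host:port` invocation. It assumes the `orchestrator` binary is on the PATH and that the caller supplies the hosts in master-first order.

```python
import subprocess


def discover_commands(hosts, port=3306):
    """Build one `orchestrator -c discover` command per host, keeping the given order."""
    return [
        ["orchestrator", "-c", "discover", "-i", f"{host}:{port}"]
        for host in hosts
    ]


def discover_all(hosts, port=3306):
    """Run the discover command for each host in order (hypothetical helper).

    Raises subprocess.CalledProcessError if any invocation exits non-zero.
    """
    for cmd in discover_commands(hosts, port):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # db1 is the master and db2 its replica, as in the session above.
    # Only print the commands here; call discover_all() to actually run them.
    for cmd in discover_commands(["db1", "db2"]):
        print(" ".join(cmd))
```

This only automates the workaround; it does not explain why automatic discovery stops at the master.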

unixtastic commented 1 year ago

Using a MySQL 8 orchestrator backend instead of SQLite behaves the same way.

jxs-2022 commented 1 year ago

Set two things:

  1. In orchestrator.conf.json: "HostnameResolveMethod": "none", "MySQLHostnameResolveMethod": "@@report_host"
  2. In my.cnf: report_host = 'x.x.x.x'
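
The first of the two settings can be applied to an existing config with a few lines of Python. This is a minimal sketch under stated assumptions: `apply_report_host_settings` and `patch_config_file` are hypothetical helper names, the config is assumed to be plain JSON like the one posted at the top of this issue, and the two key/value pairs are exactly the ones listed above. The `report_host` line in my.cnf still has to be set on each MySQL server separately.

```python
import json

# Keys and values taken from the suggestion above.
REPORT_HOST_SETTINGS = {
    "HostnameResolveMethod": "none",
    "MySQLHostnameResolveMethod": "@@report_host",
}


def apply_report_host_settings(config):
    """Return a copy of an orchestrator config dict with the two settings applied."""
    updated = dict(config)
    updated.update(REPORT_HOST_SETTINGS)
    return updated


def patch_config_file(path):
    """Rewrite the JSON config at `path` in place (hypothetical helper)."""
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(apply_report_host_settings(config), f, indent=2)
        f.write("\n")


if __name__ == "__main__":
    # A cut-down version of the config from the original report.
    example = {"DiscoverByShowSlaveHosts": True, "InstancePollSeconds": 5}
    print(json.dumps(apply_report_host_settings(example), indent=2))
```

Orchestrator would need to be restarted to pick up the changed file.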

unixtastic commented 1 year ago

The changes suggested by jxs-2022 fixed it. I suggest updating the documentation.

If anyone can suggest a more active fork of this project I'd be grateful. I don't think anyone is approving PRs here.