Open dreadushka opened 3 years ago
Are those bare-metal hosts or VMs? Which operating system?
VMs. RHEL 7
Hello @dreadushka - Apologies for the late reply!
We found this issue and fixed it in this PR, and the fix was released in Consul v1.7.3. The larger discussion around this issue can be found here. If this is occurring on > v1.7.3, please include the version numbers of your Consul clients and servers, as well as your configuration files and the commands used to run Consul. The -disable-host-node-id flag can be used to partially mitigate this as well. Please let me know your results with that flag here. In the meantime, I'll hold this open until 2/22 as "waiting-reply".
Thank you for submitting your issue!
The cluster started on 1.4.2. I upgraded it twice: from 1.4.2 to 1.6.2, then from 1.6.2 to 1.8.4. Servers and agents are all on version 1.8.4 now.
Server config:
{
  "advertise_addr": "10.15.29.191",
  "bind_addr": "10.15.29.191",
  "domain": "consul",
  "bootstrap_expect": 3,
  "server": true,
  "datacenter": "K",
  "data_dir": "/var/consul",
  "encrypt": "",
  "dns_config": {
    "allow_stale": true,
    "max_stale": "15s"
  },
  "retry_join": [
    "10.15.29.191",
    "10.15.29.185",
    "10.15.29.186",
    "10.15.29.164",
    "10.15.29.165"
  ],
  "retry_interval": "10s",
  "retry_max": 100,
  "skip_leave_on_interrupt": true,
  "leave_on_terminate": false,
  "ports": {
    "dns": 53,
    "http": 8500
  },
  "recursors": [
    "10.15.40.124",
    "172.24.40.3",
    "172.24.40.4",
    "10.15.40.7",
    "10.15.40.8"
  ],
  "rejoin_after_leave": true,
  "addresses": {
    "http": "0.0.0.0",
    "dns": "0.0.0.0"
  }
}
Server runs using systemd:
[Service]
EnvironmentFile=-/etc/sysconfig/consul
Environment=GOMAXPROCS=2
ExecStart=/usr/local/bin/consul agent -config-dir=/etc/consul.d/server -rejoin -ui -data-dir=/var/consul
Client config:
{
  "bind_addr": "10.206.159.155",
  "domain": "consul",
  "datacenter": "k",
  "data_dir": "/var/consul",
  "encrypt": "",
  "retry_join": [
    "10.15.29.191",
    "10.15.29.185",
    "10.15.29.186",
    "10.15.29.164",
    "10.15.29.165"
  ],
  "retry_interval": "10s",
  "retry_max": 100,
  "skip_leave_on_interrupt": true,
  "leave_on_terminate": false,
  "ports": {
    "dns": 53,
    "http": 8500
  },
  "recursors": [
    "10.15.40.124",
    "172.24.40.3",
    "172.24.40.4",
    "10.15.40.7",
    "10.15.40.8"
  ],
  "rejoin_after_leave": true,
  "addresses": {
    "http": "0.0.0.0",
    "dns": "0.0.0.0"
  }
}
Client's part of systemd config:
[Service]
EnvironmentFile=-/etc/sysconfig/consul
Environment=GOMAXPROCS=2
ExecStart=/usr/local/bin/consul agent -config-dir=/etc/consul.d/agent -data-dir=/var/consul
I'm seeing the same thing with Consul 1.7.13 clients.
The Consul agent generated the same node ID for the following hosts: MYUGF-DBS302P (created about 1 year ago) and MYUGF-DBS311P (created two days ago).
Should I use the -disable-host-node-id parameter on all my agents to prevent situations like this?
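To see how many hosts are affected before changing flags everywhere, the node IDs could be grouped by value. Below is a minimal sketch, assuming the (hostname, node ID) pairs have already been collected by some other means, for example by reading the node-id file each agent keeps in its data directory. The function name and the sample IDs are made up for illustration and are not part of Consul.

```python
from collections import defaultdict


def find_duplicate_node_ids(pairs):
    """Return {node_id: [hostnames]} for every node ID used by more than one host.

    `pairs` is an iterable of (hostname, node_id) tuples; how they are gathered
    is outside this sketch.
    """
    hosts_by_id = defaultdict(list)
    for hostname, node_id in pairs:
        hosts_by_id[node_id].append(hostname)
    # Keep only the IDs that are shared by two or more hosts.
    return {
        node_id: sorted(hosts)
        for node_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }


# Hypothetical sample data: the IDs are placeholders, not real Consul node IDs.
sample = [
    ("MYUGF-DBS302P", "aaaa-1111"),
    ("MYUGF-DBS311P", "aaaa-1111"),
    ("other-host", "bbbb-2222"),
]

for node_id, hosts in find_duplicate_node_ids(sample).items():
    print(f"node ID {node_id} is shared by: {', '.join(hosts)}")
```

Any ID reported here points at hosts that would collide when they register under different node names.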