Closed gbarton closed 10 months ago
Hi @gbarton and thanks for raising this issue.
I have not been able to reproduce this locally. Looking at the logs you provided, the agent is starting with server mode enabled (Server: true), so it seems the configuration file being loaded differs from the config you entered into the issue. Other items, such as the log level and datacenter name, are not listed in the host2 config but show as non-default in the logs. Could you please double-check the config file being loaded by the agent, and share the full file if possible?
Thank you for your reply! Your hint about the config not being quite right was the key I needed. I was passing in HCL as a JSON file, and it seemed to partially work for some very strange reason. As soon as I changed the file extension to .hcl, the config worked as expected.
Closing this as a non-issue, thank you!
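For anyone who lands here with the same symptom, a quick pre-flight check along these lines might catch the mismatch before the agent starts. This is only a sketch in Python, not part of Nomad: the function name extension_matches_content and the file names in the usage note are made up for illustration, and it only checks that a file named .json really parses as JSON.

```python
import json
from pathlib import Path


def extension_matches_content(path):
    """Return False when a file named *.json does not parse as JSON.

    Hypothetical helper for illustration only. Files with any other
    extension (for example .hcl) are not inspected and return True.
    """
    p = Path(path)
    if p.suffix.lower() != ".json":
        return True  # this sketch only checks .json files
    try:
        json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # Content such as HCL blocks saved under a .json name ends up here.
        return False
    return True
```

Calling extension_matches_content("client.json") on a file that actually contains HCL blocks returns False, which is the situation described above; the same content saved as client.hcl is simply skipped by the check.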
Nomad version
Nomad v1.6.3
BuildDate 2023-10-30T12:58:10Z
Revision e0497bff14378d68cad76a801cc0eba93ce05039
Operating system and Environment details
cat /etc/*-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
Issue
I have been trying to follow the manual steps here to set up a cluster: https://developer.hashicorp.com/nomad/tutorials/manage-clusters/clustering
I have host .1 set up as a server and client; that works, comes up, does its thing. I have host .2 set up as a client. It is bootstrapped with retry_join pointing at .1 and server enabled = false, yet it always overrides the retry_join and makes itself a server.
The thing that's special about the env is that it's using netmaker for a meshed wireguard network at 2 different sites. I have tested and have full connectivity, and that's what the forced host_network block, advertise, and bind_addr settings are for.
The consul settings came from looking at other issues about nomad reverting back to consul on a failed connect. They had no effect.
The eventual goal is to link many mobile sites up with nomad for local access to content/capabilities via a wireguard mesh network. 3 server/clients will be hosted at a main site, and clients will run in several others.
Any help is greatly appreciated!
Reproduction steps
Start a nomad with the following configuration on host 1:
Start the following on host 2:
Expected Result
Host 2 client joins host 1 server.
Actual Result
Host 2 immediately nopes and overrides retry_join and joins itself.
Job file (if appropriate)
Nomad Server logs (if appropriate)
Nothing happens in them.
Nomad Client logs (if appropriate)
Host 2: