Open Gurpartap opened 5 months ago
Similar to host_network { cidr = "…", interface = "…" }, replacing the above code with something like the following could work.
client {
  enabled           = true
  network_interface = "eth0"
  network_cidr      = "192.168.120.225/17"
}
func (f *NetworkFingerprint) Fingerprint(req *FingerprintRequest, resp *FingerprintResponse) error {
	//…
	uniqueIP := ""
	for _, nwResource := range nwResources {
		logger.Debug("detected interface IP", "IP", nwResource.IP)
		if uniqueIP == "" && nwResource.CIDR == cfg.NetworkCIDR {
			uniqueIP = nwResource.IP
		}
	}
	if uniqueIP == "" && len(nwResources) > 0 {
		// setting the first IP as unique IP for the node
		uniqueIP = nwResources[0].IP
	}
	resp.AddAttribute("unique.network.ip-address", uniqueIP)
	//…
}
This allows for more explicit unique.network.ip-address setting, rather than unexpected behaviour across a restart where a system may have shuffled the order of addrs within an interface. Related to: https://github.com/hashicorp/nomad/issues/10179
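The selection logic proposed above can be tried out in isolation. The following is a minimal sketch, not Nomad code: `addrResource` and `pickUniqueIP` are hypothetical names, and it assumes the configured CIDR is matched by containment (via `net.ParseCIDR` and `Contains`) rather than by the string equality used in the proposal, so that a value like "192.168.120.225/17" still matches however the resource's own CIDR is formatted.

```go
package main

import (
	"fmt"
	"net"
)

// addrResource is a hypothetical stand-in for a fingerprinted network
// resource; only the IP is needed for this sketch.
type addrResource struct {
	IP string
}

// pickUniqueIP returns the first resource IP that falls inside wantCIDR.
// If wantCIDR is empty or invalid, or nothing matches, it falls back to the
// first resource, mirroring the existing "first IP" behaviour.
func pickUniqueIP(resources []addrResource, wantCIDR string) string {
	if len(resources) == 0 {
		return ""
	}
	if wantCIDR != "" {
		if _, network, err := net.ParseCIDR(wantCIDR); err == nil {
			for _, r := range resources {
				if ip := net.ParseIP(r.IP); ip != nil && network.Contains(ip) {
					return r.IP
				}
			}
		}
	}
	return resources[0].IP
}

func main() {
	// Example addresses only: a public address listed first, private second.
	resources := []addrResource{{IP: "203.0.113.10"}, {IP: "192.168.120.225"}}
	fmt.Println(pickUniqueIP(resources, "192.168.120.225/17")) // private address
	fmt.Println(pickUniqueIP(resources, ""))                   // falls back to first
}
```

Because the result depends on the configured network rather than on list position, a reordering of addresses across a restart would not change it in this sketch.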
Hi @Gurpartap! Thanks for the thorough digging into this. I think you're on the right track here in terms of the fix too, but it seems likely that we'll also want to allow explicit IP address setting (with a go-sockaddr template). I'll mark this for roadmapping, but if you're interested in opening a PR we'd be happy to review it as well!
Nomad version
Operating system and Environment details
Debian 11 on Linode (with Linode's Network Helper enabled)
Issue
TL;DR: Nomad client's network_interface should allow selecting a different addr resource within the same interface.
Nomad is hard-coded to select the first addr in the client's network_interface (which is used to fingerprint unique.network.ip-address). This behaviour renders Nomad unusable for certain networking configurations, e.g. Linode with its auto-networking helper enabled (the default), which adds both the public and private IP to the same eth0 interface.
In such a situation, rather than hard-coding selection of the first nwResources[0].IP entry, which is the public IP, Nomad should let us select the private IP's resource from this interface for the purpose of fingerprinting unique.network.ip-address.
In other words, to fingerprint unique.network.ip-address=$my_private_ip in the above case, the client configuration should provide a way to choose which of the multiple available address resources from an interface should be used.
Related code:
Apparently the behaviour has been recognized as "Deprecated, setting the first IP as unique IP for the node", but has yet to be worked on:
https://github.com/hashicorp/nomad/blob/83720740f5a7f4053af2ba45dc687964de2a93cb/client/fingerprint/network.go#L111-L120
Reproduction steps
Linode's automatic Network Helper tool sets up something like this:
Current behaviour
Proposed behaviour
Workaround
The only workaround I've been able to come up with is setting up a dummy interface on the system and then setting:
It works but comes with its own oddities. See https://github.com/hashicorp/nomad/issues/3675#issuecomment-504660287.
Other considerations
https://github.com/hashicorp/nomad/issues/3675#issuecomment-504660287 The issue containing the workaround. Its fix missed this use case: it introduced sockaddr templating to the network_interface config.
https://github.com/hashicorp/go-sockaddr sockaddr templating does not help with this, since network_interface only accepts interface names and not IP addresses or CIDRs.
https://github.com/hashicorp/nomad/issues/19554 Apart from, say, eth0's resource label like eth0:1, network interface names themselves can also contain colons in them. Perhaps the fix could first look for an eth0:1 interface, and then for an eth0:1 labelled resource within the eth0 interface, in that order.
https://github.com/hashicorp/nomad/issues/11069 This seems related but with insufficient troubleshooting by the original poster?
https://discuss.hashicorp.com/t/how-to-change-unique-network-ip-address-for-a-node/22696
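The lookup order suggested above for names such as eth0:1 (interface name first, then address label within the parent interface) could look roughly like this. It is a sketch over an in-memory model only: the `iface` type and `resolve` function are hypothetical, and reading real address labels from the kernel is outside what the Go standard library exposes, so that part is assumed to happen elsewhere.

```go
package main

import (
	"fmt"
	"strings"
)

// iface is a hypothetical model of an interface, its primary address, and
// any labelled addresses attached to it (label -> IP).
type iface struct {
	Name   string
	IP     string
	Labels map[string]string
}

// resolve first looks for an interface literally named name. If none exists
// and name contains a colon, it looks for an address labelled name within
// the interface named by the part before the first colon.
func resolve(ifaces []iface, name string) (string, bool) {
	for _, i := range ifaces {
		if i.Name == name {
			return i.IP, true
		}
	}
	parent, _, hasColon := strings.Cut(name, ":")
	if !hasColon {
		return "", false
	}
	for _, i := range ifaces {
		if i.Name == parent {
			ip, ok := i.Labels[name]
			return ip, ok
		}
	}
	return "", false
}

func main() {
	// Example data only.
	ifaces := []iface{{
		Name:   "eth0",
		IP:     "203.0.113.10",
		Labels: map[string]string{"eth0:1": "192.168.120.225"},
	}}
	fmt.Println(resolve(ifaces, "eth0:1"))
	fmt.Println(resolve(ifaces, "eth0"))
}
```

Checking the literal interface name first keeps existing configurations working when an interface name itself contains a colon.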