Open supertylerc opened 4 years ago
This seems to occur only if I have job.group.task.resources.network defined in more than one task, and only if I set client.network_interface = "nebula1". So, to amend my summary of conditions:
- resources.network and no client.network_interface: success
- resources.network and client.network_interface = "lo": success
- resources.networks and client.network_interface = "nebula1": failure
- resources.network and client.network_interface = "nebula1": failure
- resources.network defined, and client.network_interface = "nebula1": success

I've also tried this with a few combinations of specifying the network at the group level, but the result is ultimately the same.
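For readers unfamiliar with the layout being described, here is a minimal sketch of a job in which more than one task defines resources.network. It is not the job file from this report: the job, group, task, and port names, the docker driver, and the image are all hypothetical, and it assumes the task-level network block that Nomad accepted at the time.

```hcl
# Hypothetical job: two tasks in one group, each with its own
# resources.network block and a dynamically assigned port.
job "example" {
  datacenters = ["dc1"]

  group "web" {
    task "first" {
      driver = "docker" # assumed driver, for illustration only

      config {
        image = "nginx:alpine" # assumed image
      }

      resources {
        network {
          port "http" {}
        }
      }
    }

    task "second" {
      driver = "docker"

      config {
        image = "nginx:alpine"
      }

      resources {
        network {
          port "http" {}
        }
      }
    }
  }
}
```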
Out of curiosity, does this problem go away for you if you manually set client.network_speed to what you think your link speed should be? I ran into something similar a while back when playing with Nomad and Nebula, and (in my case at least) it kept saying the network resources were exhausted after about one job was running on it, unless I manually set network_speed. I haven't looked into how Nomad detects link speed, but I wonder whether it just can't detect it properly for Nebula, or whether what it does detect is very constrained.
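The suggestion above amounts to a small change in the client agent configuration. A sketch, assuming the interface name from this thread; the speed of 1000 is a placeholder (network_speed is given in megabits per second), not a value anyone here has confirmed as correct for Nebula:

```hcl
# Nomad client agent configuration (sketch)
client {
  enabled           = true
  network_interface = "nebula1"

  # Overrides whatever link speed Nomad detects or defaults to.
  # 1000 is an assumed placeholder; use your real link speed.
  network_speed = 1000
}
```

The client agent needs a restart to pick up the change.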
Nomad version
Operating system and Environment details
Issue
When using Nomad with Slack's Nebula, specifying a network_interface of nebula1 always causes a job with more than one task (or multiple jobs with one task each of similar configuration but different ports) to fail to allocate, caused by an evaluation stuck in blocked for some reason I haven't been able to determine. Setting network_interface to lo works, so I do not think there is anything wrong with my configuration. It seems most likely that there is some kind of validation that happens for network_interface that prevents me from using a Nebula interface. It's also possible that something about the way an IP for an interface is retrieved is not working with Nebula, though I'm not sure why (I haven't dug deeply into Nomad's code, and I am unfortunately not well-versed in Go).

To sum up:
- client.network_interface: success
- client.network_interface = "lo": success
- client.network_interface = "nebula1": success
- client.network_interface = "nebula1": failure
- client.network_interface = "nebula1": failure

Reproduction steps
0: Have a Nebula overlay deployed:
For a lighthouse/server:
For a node:
1: Have a Nomad server with the following configuration:
2: Have a Nomad node with the following configuration:
Job file (if appropriate)
Nomad Client logs (if appropriate)