nickethier opened 4 years ago
It's really sad to see that this is considered an enhancement, as it was brought up several times as a use case in #646 by @skozin / @rkno82
Why can't 0.0.0.0/0 be considered a valid CIDR and then bypass interface fingerprinting for that mask?
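To make the ask concrete, here is a sketch of what this would look like in the Nomad client configuration, using the existing `host_network` block and `cidr` attribute (the name `public` is arbitrary; today Nomad tries to match a fingerprinted interface address against the CIDR rather than treating 0.0.0.0/0 as "match anything"):

```hcl
client {
  host_network "public" {
    # Requested behavior: treat 0.0.0.0/0 as "any address" and skip
    # interface fingerprinting for this named host network.
    cidr = "0.0.0.0/0"
  }
}
```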
@nickethier Any updates? Our use case is still blocked.
I would like to echo that I am puzzled.
Nomad is incredibly easy to run in DigitalOcean and a sobbing frustration in Google Cloud Platform. The reason for that is that in DigitalOcean you can get a VM that has a network interface exposed to the public network, so if you want to expose your service to the internet it's pretty easy. VMs in Google Compute Engine do not work that way. You get only a private interface with an RFC 1918 address:
```
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
    link/ether 42:01:0a:8a:00:38 brd ff:ff:ff:ff:ff:ff
    inet 10.138.0.56/32 scope global dynamic ens4
       valid_lft 52864sec preferred_lft 52864sec
    inet6 fe80::4001:aff:fe8a:38/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:1a:e4:e0:40 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:1aff:fee4:e040/64 scope link
       valid_lft forever preferred_lft forever
7: vethb75d369@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
    link/ether fe:d7:96:26:6c:2f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::fcd7:96ff:fe26:6c2f/64 scope link
       valid_lft forever preferred_lft forever
13: veth62b88d1@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
    link/ether d2:6a:75:7f:5d:0c brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::d06a:75ff:fe7f:5d0c/64 scope link
       valid_lft forever preferred_lft forever
```
When public networking is allowed to the instance, I am able to hit a service via the NAT address that Google assigns to it. However, what I am not able to do is use a TCP/UDP load balancer to point at the service, because the service is bound to the ens4 address:
```
$ sudo docker ps
CONTAINER ID   IMAGE                      COMMAND            CREATED          STATUS          PORTS                                                        NAMES
984e27a4a615   niclaslindstedt/nquakesv   "/entrypoint.sh"   47 seconds ago   Up 46 seconds   0.0.0.0:27501->27500/tcp, 0.0.0.0:27501->27500/udp           silly_satoshi
3885a4179fce   niclaslindstedt/nquakesv   "/entrypoint.sh"   12 hours ago     Up 12 hours     10.138.0.56:27500->27500/tcp, 10.138.0.56:27500->27500/udp   mvdsv-ddf45646-efb3-469e-58eb-b9919951969c
```
My attempts to enable host networking on the service and in the client stanza have all failed to produce the desired result: `0.0.0.0:27500->27500/tcp, 0.0.0.0:27500->27500/udp`. In fact, trying to enable host networking has yielded the even worse result `127.0.0.1:27500->27500/tcp, 127.0.0.1:27500->27500/udp`.
The first container I had to start manually to get the networking configuration I wanted, so that I could configure a `google_compute_forwarding_rule` and `google_compute_target_pool` in the usual way. At the moment the only way I can expose `mvdsv-ddf45646-efb3-469e-58eb-b9919951969c` is by pointing a DNS record at it.
@AdrienneCohea Which Nomad version are you on? While I do still have some issues with UDP specifically that I haven't tracked down deeply enough to conclude that there's any problem with Nomad, host networking seems to work OK since v1.0.2.
The caveat is pretty significant though: if you have clients with host networks defined, you NEED to specify `host_network` for EVERY port/service allocated on that client, lest you get nondeterministic behavior. (We had jobs that were working just fine without any host networking, which broke once we introduced host networks on the client.)
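To illustrate the point, once the client defines any host networks, each port in a job's network block should pin its network explicitly. A minimal sketch, assuming a host network named `public` is defined on the client (the port name and number are made up for illustration):

```hcl
network {
  port "http" {
    static       = 8080
    # Must be set on every port once the client defines host networks;
    # otherwise which address Nomad picks is not deterministic.
    host_network = "public"
  }
}
```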
Maybe you can share a reproducible job file and the relevant parts of your Nomad client config?
I do recall that using CIDRs was still broken in v1.0.0; not sure if that's fixed already. The above refers to defining host networks by interface.
Thanks so much for responding! :tulip: It turns out that the true answer is that I may need to continue learning Google networking. I still wish I could get a WAN address that wasn't an RFC 1918 one, but I was able to get the following to work on a bare GCP VM instance with Nomad 1.0.3 just fine:
```hcl
job "games" {
  datacenters = ["dc1"]

  group "quake" {
    network {
      mode = "bridge"

      port "quake" {
        static = 27500
        to     = 27500
      }
    }

    task "mvdsv" {
      driver = "docker"

      config {
        image = "niclaslindstedt/nquakesv"
        ports = ["quake"]
      }

      env {
        RCON_PASSWORD = "somethingrandom"
      }

      service {
        tags         = ["quake", "public-facing"]
        port         = "quake"
        address_mode = "host"
      }

      resources {
        cpu    = 100
        memory = 64
      }
    }
  }
}
```
Bridge networking will also accomplish what I need, and I can also use an external UDP load balancer with it. I will mess around more and see if I can get rid of the `static` port but leave the `to`. But this is enough messing around for tonight.
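If that works, the port stanza above would presumably reduce to something like the following (an untested sketch: dropping `static` lets Nomad assign a dynamic host port, while `to` keeps the container-side port fixed):

```hcl
network {
  mode = "bridge"

  port "quake" {
    # No static host port: Nomad picks a dynamic one and maps it
    # to 27500 inside the bridge network namespace.
    to = 27500
  }
}
```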
`cat /etc/nomad/*.hcl`:

```hcl
client {
  enabled = true
}

consul {
  address   = "127.0.0.1:8501"
  ssl       = true
  ca_file   = "/etc/consul/ca.pem"
  cert_file = "/etc/consul/tls.crt"
  key_file  = "/etc/consul/tls.key"
  token     = "REDACTED"
}

acl {
  enabled = true
}

data_dir = "/var/lib/nomad"

tls {
  http      = true
  rpc       = true
  ca_file   = "/etc/nomad/ca.pem"
  cert_file = "/etc/nomad/tls.crt"
  key_file  = "/etc/nomad/tls.key"
}
```
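For reference, the interface-based host network definition discussed earlier would be added to a client stanza like this one along the following lines (a sketch: `public` is an arbitrary name, and `ens4` is the GCE interface from the `ip` output above):

```hcl
client {
  enabled = true

  host_network "public" {
    # Fingerprint the addresses bound to ens4 for this named network.
    interface = "ens4"
  }
}
```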
A bit OT, but which LB are you using for UDP? Envoy seems like it should be possible and I should give it a try, but otherwise the only one I have had anything close to working is nginx. (HAProxy flat out doesn't support it, and Traefik's UDP support is completely broken.)
A forwarding rule and a target pool. I was trying to use Consul itself, but ingress gateways don't support UDP, and it's not supported in Istio either. I was joking on Twitter that UDP ingress seems to be a millennium prize problem in cloud computing.
I haven't thought about Envoy though (was hoping to have Consul manage that for me). Maybe if I learn to configure it manually (I mean via a Nomad job definition), I could accomplish it... :thinking:
Yeah, Consul Connect doesn't seem to support UDP with Nomad at all. I was similarly a bit taken aback at just how much custom plumbing is needed. If you figure out something that works, it would be awesome if you shared it with the community!
@petrukngantuk I think you're talking about something different here?
In some clusters, IP addresses are dynamically assigned and reassigned to different hosts. The initial version of multi-host-interface networks only supports discovery of addresses that are bound to an interface when Nomad is started, because the network fingerprinter is only run once, on startup.
Unfortunately, the problem is more nuanced than just periodically fingerprinting network interfaces. In some cases the floating address is never assigned to an interface and is instead managed by other networking tools and products. For example, GCP load balancers can use direct routing, where the incoming packet's destination address is left unchanged: it remains the address of the GCP-managed load balancer and thus would never be fingerprinted by Nomad.
Nomad's `host_network` configuration should support a way of configuring a virtual address that can be reported to the servers and used in scheduling.
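One hypothetical shape for such a configuration (the `address` attribute below does not exist in Nomad; it is purely illustrative, and the IP is a documentation-range placeholder):

```hcl
client {
  host_network "lb" {
    # Hypothetical attribute: a virtual address that is never bound to a
    # local interface (e.g. a GCP load balancer using direct routing),
    # reported to the servers and usable in scheduling.
    address = "203.0.113.10"
  }
}
```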