hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

`host_network` should support floating/virtual IP addresses #8577

Open nickethier opened 4 years ago

nickethier commented 4 years ago

In some clusters IP addresses are dynamically assigned and reassigned to different hosts. The initial version of multi-interface host networking only supports discovery of addresses that are bound to an interface when Nomad is started, because the network fingerprinter is only run once, at startup.

Unfortunately the problem is more nuanced than just periodically fingerprinting network interfaces. In some cases the floating address is never assigned to an interface at all and is managed by other networking tools and products. For example, GCP load balancers can use direct routing, where the incoming packet's destination address is left unchanged: it is the address of the GCP-managed load balancer and would therefore never be fingerprinted by Nomad.

Nomad's host_network configuration should support a way of configuring a virtual address that can be reported to the servers and used in scheduling.
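For context, a host_network today is only usable if its cidr or interface matches an address the fingerprinter found locally. A minimal sketch of the existing block alongside the kind of option this issue is asking for; the address field below is hypothetical and does not exist in Nomad:

client {
  # Existing behavior: "public" resolves only if an address in this
  # CIDR is bound to a local interface when Nomad starts.
  host_network "public" {
    cidr = "203.0.113.0/24"
  }

  # Hypothetical extension sketched for this proposal: declare a
  # floating/virtual address that is never bound to an interface
  # (e.g. a GCP direct-routing load balancer address) so it can
  # still be reported to the servers and used in scheduling.
  host_network "floating" {
    address = "198.51.100.10" # not a real Nomad option today
  }
}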

ghost commented 3 years ago

It's really sad to see that this is considered an enhancement, as it was brought up several times as a use case in #646 by @skozin / @rkno82.

Why can't 0.0.0.0/0 be considered a valid CIDR that bypasses interface fingerprinting for that mask?
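What that suggestion would look like in client configuration; this is the commenter's proposed semantics, not behavior Nomad currently implements:

client {
  host_network "any" {
    # Proposed: treat the all-zeros mask as "bind to the wildcard
    # address and skip interface fingerprinting", so floating
    # addresses that are never bound locally would still work.
    cidr = "0.0.0.0/0"
  }
}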

ghost commented 3 years ago

@nickethier Any updates? Our use case is still blocked.

AdrienneCohea commented 3 years ago

I would like to echo that I am puzzled.

Nomad is incredibly easy to run in DigitalOcean and a sobbing frustration in Google Cloud Platform. The reason is that in DigitalOcean you can get a VM with a network interface exposed to the public network, so exposing your service to the internet is pretty easy. VMs in Google Compute Engine do not work that way: you get only a private interface with an RFC 1918 address:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
    link/ether 42:01:0a:8a:00:38 brd ff:ff:ff:ff:ff:ff
    inet 10.138.0.56/32 scope global dynamic ens4
       valid_lft 52864sec preferred_lft 52864sec
    inet6 fe80::4001:aff:fe8a:38/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:1a:e4:e0:40 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:1aff:fee4:e040/64 scope link 
       valid_lft forever preferred_lft forever
7: vethb75d369@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether fe:d7:96:26:6c:2f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::fcd7:96ff:fe26:6c2f/64 scope link 
       valid_lft forever preferred_lft forever
13: veth62b88d1@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether d2:6a:75:7f:5d:0c brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::d06a:75ff:fe7f:5d0c/64 scope link 
       valid_lft forever preferred_lft forever

When public networking is allowed to the instance, I am able to hit a service via the NAT address that Google assigns to it. However, what I am not able to do is point a TCP/UDP load balancer at the service, because the service is bound to the ens4 address:

sudo docker ps
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS                                                        NAMES
984e27a4a615        niclaslindstedt/nquakesv   "/entrypoint.sh"    47 seconds ago      Up 46 seconds       0.0.0.0:27501->27500/tcp, 0.0.0.0:27501->27500/udp           silly_satoshi
3885a4179fce        niclaslindstedt/nquakesv   "/entrypoint.sh"    12 hours ago        Up 12 hours         10.138.0.56:27500->27500/tcp, 10.138.0.56:27500->27500/udp   mvdsv-ddf45646-efb3-469e-58eb-b9919951969c

My attempts to enable host networking on the service and on the client stanza have all failed to produce the desired result: 0.0.0.0:27500->27500/tcp, 0.0.0.0:27500->27500/udp. In fact, trying to enable host networking has yielded the even worse result 127.0.0.1:27500->27500/tcp, 127.0.0.1:27500->27500/udp.

I had to start the first container manually to get the networking configuration I wanted, so that I could configure a google_compute_forwarding_rule and a google_compute_target_pool in the usual way. At the moment the only way I can expose mvdsv-ddf45646-efb3-469e-58eb-b9919951969c is by pointing a DNS record at it.
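For readers who haven't used that pair, a minimal Terraform sketch of the setup being described; the resource names, zone, and instance name are illustrative placeholders:

resource "google_compute_target_pool" "quake" {
  name      = "quake-pool"
  instances = ["us-west1-a/nomad-client-0"] # hypothetical "<zone>/<instance>"
}

resource "google_compute_forwarding_rule" "quake" {
  name        = "quake-udp"
  ip_protocol = "UDP"
  port_range  = "27500"
  target      = google_compute_target_pool.quake.self_link
}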

3nprob commented 3 years ago

@AdrienneCohea Which Nomad version are you on? While I still have some issues with UDP specifically that I haven't tracked down deeply enough to conclude there's any problem with Nomad, host networking seems to work OK since v1.0.2.

The caveat is pretty significant though: if you have clients with host networks defined, you NEED to specify host_network for EVERY port/service allocated on that client, lest you get nondeterministic behavior. (We had jobs that were working just fine without any host networking, which broke once we introduced host networks on the client.)

Maybe you can share a reproducible job file and the relevant parts of your Nomad client config?

I do recall that using CIDRs was still broken in v1.0.0; not sure if that's fixed already. The above refers to defining host networks by interface.
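To make that caveat concrete, a minimal sketch assuming a client that defines a "public" host network on its ens4 interface; the names are illustrative:

client {
  host_network "public" {
    interface = "ens4"
  }
}

Then every port in every job placed on that client should pin a host network explicitly:

network {
  port "http" {
    static = 8080
    # Once any host_network is defined on the client, set this on
    # every allocated port, or the address it binds to may not be
    # the one you expect.
    host_network = "public"
  }
}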

AdrienneCohea commented 3 years ago

Thanks so much for responding! :tulip: It turns out that the true answer is that I need to keep learning Google networking. I still wish I could get a WAN address that wasn't the RFC 1918 one, but I was able to get the following to work on a bare GCP VM instance with Nomad 1.0.3 just fine:

job "games" {
  datacenters = ["dc1"]

  group "quake" {
    network {
      # Bridge mode gives the group its own network namespace; the
      # static port below is published on the host side.
      mode = "bridge"

      port "quake" {
        static = 27500
        to     = 27500
      }
    }

    task "mvdsv" {
      driver = "docker"

      config {
        image = "niclaslindstedt/nquakesv"

        ports = ["quake"]
      }

      env {
        RCON_PASSWORD = "somethingrandom"
      }

      service {
        tags = ["quake", "public-facing"]
        port = "quake"
        # Advertise the host-side address and port in Consul, not
        # the bridge-namespace address.
        address_mode = "host"
      }

      resources {
        cpu    = 100
        memory = 64
      }
    }
  }
}

Bridge networking will also accomplish what I need, and I can also use an external UDP load balancer with it. I will mess around more and see if I can get rid of the static port while keeping the to (sketched below). But this is enough messing around for tonight.
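A minimal sketch of that variant, leaving the rest of the job above unchanged: dropping static makes Nomad pick a dynamic host port, while to keeps the container listening on 27500:

      port "quake" {
        to = 27500
      }

The forwarding rule would then need to target whatever dynamic port Nomad assigns, which is presumably why the static port is convenient here.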

cat /etc/nomad/*.hcl
client {
  enabled = true
}

consul {
  address   = "127.0.0.1:8501"
  ssl       = true
  ca_file   = "/etc/consul/ca.pem"
  cert_file = "/etc/consul/tls.crt"
  key_file  = "/etc/consul/tls.key"
  token     = "REDACTED"
}

acl {
  enabled = true
}
data_dir = "/var/lib/nomad"
tls {
  http = true
  rpc  = true

  ca_file   = "/etc/nomad/ca.pem"
  cert_file = "/etc/nomad/tls.crt"
  key_file  = "/etc/nomad/tls.key"
}

3nprob commented 3 years ago

A bit OT, but which LB are you using for UDP? Envoy seems like it should be possible and I should give it a try, but otherwise the only one I have had anything close to working is nginx. (HAProxy flat out doesn't support it, and Traefik's UDP support is completely broken.)

AdrienneCohea commented 3 years ago

A forwarding rule and a target pool. I was trying to use Consul itself, but ingress gateways don't support UDP, and it's not supported in Istio either. I was joking on Twitter that UDP ingress seems to be a millennium prize problem in cloud computing.

I haven't thought about Envoy, though (I was hoping to have Consul manage that for me). Maybe if I learn to configure it manually (I mean via a Nomad job definition), I could accomplish it... :thinking:

3nprob commented 3 years ago

Yeah, Consul Connect doesn't seem to support UDP with Nomad at all; I was similarly a bit taken aback at just how much custom plumbing is needed. If you figure out something that works, it would be awesome if you shared it with the community!

Legogris commented 3 years ago

@petrukngantuk I think you're talking about something different here?