hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Docker service registration for IPv6 addresses fails #6412

Closed tsujp closed 2 weeks ago

tsujp commented 5 years ago

Nomad version

Nomad v0.9.5 (1cbb2b9a81b5715be2f201a4650293c9ae517b87)

Docker version

Client: Docker Engine - Community
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        6a30dfc
 Built:             Thu Aug 29 05:28:55 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:27:34 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Docker info

Client:
 Debug Mode: false

Server:
 Containers: 9
  Running: 1
  Paused: 0
  Stopped: 8
 Images: 5
 Server Version: 19.03.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-1062.1.1.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.764GiB
 Name: centos-4gb-nbg1-1
 ID: 4GGR:OTVS:KB62:5XVL:LZ6P:Y6BK:HO77:7T5K:AWLF:O2HO:IELL:32XC
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Operating system and Environment details

None applicable, nothing but a barebones install of Nomad, Docker, and a basic hello world container.

Issue

Nomad does not show the service IP of the Docker container, even though Docker reports the container's IP when directly inspecting the very properties the service stanza (with address_mode = "driver") should be reading the IP from.

[screenshot: the problem]

Reproduction steps

  1. Run nomad agent -dev after fresh install, no other nomad configuration.
  2. Install Docker, only set the following in /etc/docker/daemon.json
    {
      "ipv6": true,
      "fixed-cidr-v6": "2a01:4f8:c2c:bafc::5/118"
    }
  3. Run the job.
  4. Observe that no address is displayed under Addresses on the CLI or in the Nomad UI.
  5. Verify the Docker container actually has an address via
    $ sudo docker inspect --format '{{ .NetworkSettings.GlobalIPv6Address }}' server-137439d3-ab43-b9a5-fc3f-9dec45113529
    2a01:4f8:c2c:bafc::4

Job file (if appropriate)

job "ip6-echo" {
  datacenters = ["dc1"]

  group "example" {
    task "server" {
      driver = "docker"

      config {
        image = "localhost:5000/express-test"
        advertise_ipv6_address = true
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
        network {
          mbits = 10
        }
      }

      service {
        name = "ip6-echo-service"
        port = 5555
        address_mode = "driver"
      }
    }
  }
}

Nomad Job Allocation

ID                  = 137439d3
Eval ID             = 01fb588f
Name                = ip6-echo.example[0]
Node ID             = 51a7d8d3
Node Name           = centos-4gb-nbg1-1
Job ID              = ip6-echo
Job Version         = 824633981392
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 2m7s ago
Modified            = 1m52s ago

Task "server" is "running"
Task Resources
CPU        Memory           Disk     Addresses
0/500 MHz  6.3 MiB/256 MiB  300 MiB  

Task Events:
Started At     = 2019-10-02T02:51:38Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type        Description
2019-10-02T04:51:38+02:00  Started     Task started by client
2019-10-02T04:51:33+02:00  Driver      Downloading image
2019-10-02T04:51:33+02:00  Task Setup  Building Task Directory
2019-10-02T04:51:33+02:00  Received    Task received by client

Host Interfaces from ip a s

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 96:00:00:31:a9:b6 brd ff:ff:ff:ff:ff:ff
    inet 116.203.217.144/32 brd 116.203.217.144 scope global dynamic eth0
       valid_lft 84999sec preferred_lft 84999sec
    inet6 2a01:4f8:c2c:bafc::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::9400:ff:fe31:a9b6/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:16:a1:34:14 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 2a01:4f8:c2c:bafc::1/118 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link 
       valid_lft forever preferred_lft forever
    inet6 fe80::42:16ff:fea1:3414/64 scope link 
       valid_lft forever preferred_lft forever
26: vethaf10a9c@if25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 4e:b0:9c:80:84:d7 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::4cb0:9cff:fe80:84d7/64 scope link 
       valid_lft forever preferred_lft forever
28: vetha81619d@if27: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 72:ed:be:ca:64:12 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::70ed:beff:feca:6412/64 scope link 
       valid_lft forever preferred_lft forever

Docker network inspect bridge

[
    {
        "Name": "bridge",
        "Id": "3c6c902a61f038d48a867f22eb60f20a50de8bf988395d63dbc7257051b5a67f",
        "Created": "2019-10-02T04:36:33.038398281+02:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": true,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                },
                {
                    "Subnet": "2a01:4f8:c2c:bafc::/118"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "5ba1b8dbc3ce45040102b00ae862884beef1eb88555e403b470e7e7c6b20078a": {
                "Name": "server-137439d3-ab43-b9a5-fc3f-9dec45113529",
                "EndpointID": "eca6a39891019ceaef7715ce03b42e7fe3389c62fb098cb2d82497fa001e31c8",
                "MacAddress": "02:42:ac:11:00:04",
                "IPv4Address": "172.17.0.4/16",
                "IPv6Address": "2a01:4f8:c2c:bafc::4/118"
            },
            "673b0c4e17d9288b450e062ceba399245fb1729d6a81e13c1e0f825d1c2eacf5": {
                "Name": "registry",
                "EndpointID": "e7eb069231160f6e89627b37ac7b86b2d0eb93fd20c00283956027a4dbeff098",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": "2a01:4f8:c2c:bafc::3/118"
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
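The verification in step 5 can also be scripted against this inspect output. A small sketch (sample JSON trimmed from the output above, container IDs shortened) that maps each container to its bare IPv6 address:

```python
import json

# Trimmed sample of `docker network inspect bridge` from above.
inspect_output = """
[
  {
    "Name": "bridge",
    "EnableIPv6": true,
    "Containers": {
      "5ba1b8dbc3ce": {
        "Name": "server-137439d3-ab43-b9a5-fc3f-9dec45113529",
        "IPv4Address": "172.17.0.4/16",
        "IPv6Address": "2a01:4f8:c2c:bafc::4/118"
      },
      "673b0c4e17d9": {
        "Name": "registry",
        "IPv4Address": "172.17.0.3/16",
        "IPv6Address": "2a01:4f8:c2c:bafc::3/118"
      }
    }
  }
]
"""

def container_ipv6_addresses(inspect_json: str) -> dict:
    """Map container name -> IPv6 address with the CIDR suffix stripped."""
    network = json.loads(inspect_json)[0]
    return {
        c["Name"]: c["IPv6Address"].split("/")[0]
        for c in network["Containers"].values()
        if c.get("IPv6Address")
    }

for name, ip in sorted(container_ipv6_addresses(inspect_output).items()):
    print(f"{name}\t{ip}")
```

In practice you would feed it the live output, e.g. `docker network inspect bridge | python3 extract_v6.py` (script name illustrative).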

Notes

I have verified that this works (the Docker container is reachable): the command curl -g -6 'http://[2a01:4f8:c2c:bafc::4]:5555/' can be run from anywhere and returns Hello World!

$ curl -g -6 'http://[2a01:4f8:c2c:bafc::4]:5555/'
Hello World!

The image is a barebones Node.js HTTP server, nothing special:

var http = require('http');

var server = http.createServer(function (request, response) {
   response.writeHead(200, {"Content-Type":"text/plain"});
   response.end ("Hello World!");
   console.log("Got a connection");
});

server.listen(5555, '::');

console.log("Server listening on port 5555");

The Dockerfile for the image is

FROM node:12.10.0-stretch

WORKDIR /app

# Copy the sources (including package.json) in before installing,
# so npm install has a manifest to resolve.
COPY --chown=node:node . .

RUN npm install

EXPOSE 5555

CMD ["node", "index.js"]
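Note that the RUN npm install step presupposes a package.json sitting next to index.js. The server above uses only Node's built-in http module, so a minimal hypothetical manifest (name and version are placeholders) would be:

```json
{
  "name": "express-test",
  "version": "1.0.0",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  }
}
```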

docker ps -a gives

CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                    NAMES
5ba1b8dbc3ce        localhost:5000/express-test   "docker-entrypoint.s…"   4 minutes ago       Up 4 minutes        5555/tcp                 server-137439d3-ab43-b9a5-fc3f-9dec45113529
673b0c4e17d9        registry:2                    "/entrypoint.sh /etc…"   11 minutes ago      Up 11 minutes       0.0.0.0:5000->5000/tcp   registry
schmichael commented 5 years ago

Thanks for the details!

If you have jq installed can you paste the output of:

nomad node status -self -json | jq .Resources.Networks

I'm afraid Nomad is scheduling onto the IPv4 address first, although I'm unsure why that address isn't being displayed in the CLI output. Unfortunately you can only specify which network_interface Nomad should schedule on, but not choose which IP address on that interface is preferred.

Luckily Docker is doing the right thing and your service is available on the intended IPv6 address.

The IPv6 address should be advertised in Consul as well. Could you paste the output of:

curl -s localhost:8500/v1/catalog/service/redis-cache | jq -r .[].ServiceAddress

Fixes

I believe there are 2 things we need to do:

  1. Fix the address display issue. Not sure what's happening there!
  2. Allow selecting specific addresses or address classes for scheduling.
schmichael commented 5 years ago

This is related to #5862 and #3285

tsujp commented 4 years ago

Sorry, I haven't had time to test this yet, so I've held off on replying, but I felt I should let you know I still plan to give your steps a run @schmichael. I just have another backlog of work to get through, and I've had to apply some workarounds since I cannot use Nomad until this works. An alternative might be Podman (with IPv6, of course), but I am unfamiliar with task drivers for Nomad and how to extend them.

42wim commented 4 years ago

@tsujp I've been using the same setup for a while now. The address should be published in Consul. It's only a Nomad UI issue that the IPv6 address isn't shown; besides that, everything should work.

tsujp commented 4 years ago

Quoting the two requests from @schmichael above:

> nomad node status -self -json | jq .Resources.Networks

> curl -s localhost:8500/v1/catalog/service/redis-cache | jq -r .[].ServiceAddress

(1)

nomad node status -self -json | jq .Resources.Networks
[
  {
    "CIDR": "127.0.0.1/32",
    "Device": "lo",
    "DynamicPorts": null,
    "IP": "127.0.0.1",
    "MBits": 1000,
    "Mode": "",
    "ReservedPorts": null
  },
  {
    "CIDR": "::1/128",
    "Device": "lo",
    "DynamicPorts": null,
    "IP": "::1",
    "MBits": 1000,
    "Mode": "",
    "ReservedPorts": null
  }
]

(2)

No output.

EDIT: for (2) I didn't have Consul running at all. I quickly skimmed the docs to get something up (for a ninja edit) and the curl command still returns nothing; at its base it's

$ curl -s localhost:8500
<a href="/ui/">Moved Permanently</a>.

When I view the Consul UI and go to http://localhost:8500/ui/dc1/services/ip6-echo-service I can see the IPv6 address of the service, yay!

I assume, as @42wim states, this means everything is working under the hood (how else would it determine the correct address?), but that it's just a Nomad display error in both the web UI and the CLI @schmichael

dionjwa commented 2 years ago

I have this same issue. Setup: running the dev version of nomad on my local mac laptop.

On some wifi networks, the unique IP address is IPv6:

[screenshot: network settings showing an IPv6 address]

When that happens, my tasks fail:

[screenshot: failed tasks]

nomad node status -self -json | jq .Resources.Networks gives:

[
  {
    "CIDR": "fd67:2a00:1152:1:6a:8c9e:b809:7d5f/128",
    "DNS": null,
    "Device": "en0",
    "DynamicPorts": null,
    "Hostname": "",
    "IP": "fd67:2a00:1152:1:6a:8c9e:b809:7d5f",
    "MBits": 1000,
    "Mode": "host",
    "ReservedPorts": null
  },
  {
    "CIDR": "2406:5a00:1011:7b00:186b:fea3:d955:36b8/128",
    "DNS": null,
    "Device": "en0",
    "DynamicPorts": null,
    "Hostname": "",
    "IP": "2406:5a00:1011:7b00:186b:fea3:d955:36b8",
    "MBits": 1000,
    "Mode": "host",
    "ReservedPorts": null
  },
  {
    "CIDR": "2406:5a00:1011:7b00:64a5:2481:bc7e:22b3/128",
    "DNS": null,
    "Device": "en0",
    "DynamicPorts": null,
    "Hostname": "",
    "IP": "2406:5a00:1011:7b00:64a5:2481:bc7e:22b3",
    "MBits": 1000,
    "Mode": "host",
    "ReservedPorts": null
  },
  {
    "CIDR": "192.168.4.55/32",
    "DNS": null,
    "Device": "en0",
    "DynamicPorts": null,
    "Hostname": "",
    "IP": "192.168.4.55",
    "MBits": 1000,
    "Mode": "host",
    "ReservedPorts": null
  }
]

I would prefer to force the Nomad client to use the IPv4 address, but it is choosing IPv6. Tasks are fine when an IPv4 address is bound.

It looks like part of the issue is that Docker Desktop for Mac doesn't support IPv6, so Nomad should be aware of this and bind accordingly: https://github.com/docker/for-mac/issues/1432
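What's being asked for here is essentially a family-aware sort of the node's networks. A hedged Python sketch of that selection logic (illustrative only, not Nomad's actual implementation; addresses trimmed from the output above):

```python
import ipaddress

# Trimmed from the `nomad node status` output above.
networks = [
    {"Device": "en0", "IP": "fd67:2a00:1152:1:6a:8c9e:b809:7d5f"},
    {"Device": "en0", "IP": "2406:5a00:1011:7b00:186b:fea3:d955:36b8"},
    {"Device": "en0", "IP": "192.168.4.55"},
]

def prefer_family(nets, family="ipv4"):
    """Stable sort so addresses of the preferred family come first."""
    want = 4 if family == "ipv4" else 6
    return sorted(nets, key=lambda n: ipaddress.ip_address(n["IP"]).version != want)

print(prefer_family(networks, "ipv4")[0]["IP"])  # with this data: 192.168.4.55
```

Sorting (rather than filtering) keeps the other family available as a fallback when the node has no address of the preferred kind.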

gulducat commented 2 weeks ago

Hey all, this has been open a while, and various things have changed over the years. I'll try to summarize them here before closing out the issue.

1. IPv6 not showing in CLI / UI:

I can only reproduce this one way (which, today, is also true of IPv4 addresses), with this config:

job "job" {
  group "grp" {
    task "tsk" {
      driver = "docker"
      config = {...}
      service {
        name = "web"
        port = 8000
      }
    }
  }
}

Notes: the service{} must be defined within task{}, and the port must be a number (which is only allowed in a task.service{}). There is no associated group.network.port{} block tying the allocation to an IP; Nomad is merely forwarding service info along to the service provider. As with Consul, if service{ provider = "nomad" }, then the IPv4/IPv6 address provided by Docker can be seen as a Nomad service, but otherwise it's effectively discarded.

The proper format for this nowadays would be:

job "job" {
  group "grp" {
    network {
      port "http" {
        static = 8000
      }
    }
    service {
      name = "web"
      port = "http"
      provider = "nomad" # or default "consul"
    }
    task "tsk" {
      driver = "docker"
      config = {
        ...
        ports = ["http"]
      }
    }
  }
}

This makes the static port available to Nomad for scheduling (to prevent port collisions), and it's passed down to the service and task by its label "http" rather than by number. Both network and service are group-scoped, because the group is the unit that gets placed as an allocation, with all of its tasks together in one network. The main downside is that this doesn't pay any mind to Docker's assigned IP(s); instead, Nomad selects a host IP:port combo, which Docker then binds.

Which brings us to IPv6:

2. Ability to prefer an address family when scheduling

Instead of asking Docker to provide an IPv6, you can now (as of #23388 -> Nomad 1.8.2) set preferred_address_family = "ipv6" in the client{} config. That will sort the host's addresses by family and try to choose the configured preference for scheduling and advertising services. This is especially nice because it works for other task drivers, too.
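As a sketch, that agent configuration change is a single attribute in the client{} block (the file path here is illustrative):

```hcl
# /etc/nomad.d/client.hcl (illustrative path)
client {
  enabled = true

  # Nomad 1.8.2+ (#23388): prefer IPv6 when choosing host addresses
  # for scheduling and service advertisement. Use "ipv4" for the
  # opposite preference.
  preferred_address_family = "ipv6"
}
```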

E.g. My laptop has these addresses:

$ nomad node status -json -self | jq -r '.NodeResources.Networks[].CIDR'
192.168.1.201/32
2600:4040:1229:f00:a860:15dd:xx:xx/128
2600:4040:1229:f00:f068:ed9e:xx:xx/128

By default, my services would get that 192... IPv4 address, just because of how the addresses were sorted:

$ nomad service info web
Job ID  Address             Tags  Node ID   Alloc ID
dock    192.168.1.201:8000  []    c9ec5d97  96ecee71

$ nomad alloc status 96ecee71 | grep -B2 http
Allocation Addresses:
Label  Dynamic  Address
*http  yes      192.168.1.201:8000

("web" service and "http" port label are from my example jobspec in part 1 above)

With client { preferred_address_family = "ipv6" } in my agent config, I get:

$ nomad service info web
Job ID  Address                                        Tags  Node ID   Alloc ID
dock    [2600:4040:1229:f00:a860:15dd:xx:xx]:8000  []    c9ec5d97  d13ce1a9

$ nomad alloc status d13ce1a9 | grep -B2 http
Allocation Addresses:
Label  Dynamic  Address
*http  yes      2600:4040:1229:f00:a860:15dd:xx:xx:8000

Note: If you use network { mode = "bridge" } (the default is "host"), then you will also need to configure Nomad for that (Nomad 1.9+) - see #14101.

Finally, for folks with the opposite issue, to avoid IPv6, you may try setting preferred_address_family = "ipv4".


So with all that, I'm closing this out, but please let us know if we missed something important!