hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

duplicate `task.config.extra_hosts` to Connect sidecar tasks #11056

Open · DejfCold opened 3 years ago

DejfCold commented 3 years ago

Nomad version

Output from nomad version:

Nomad v1.1.3 (8c0c8140997329136971e66e4c2337dfcf932692)

Operating system and Environment details

Rocky Linux 8.4 (Green Obsidian) Docker version 20.10.8, build 3967b7d

# nomad agent-info
client
  heartbeat_ttl = 12.695519612s
  known_servers = 192.168.0.206:4647
  last_heartbeat = 2.751359289s
  node_id = 3877b77f-6600-7fa3-eaea-6f1dc36d9128
  num_allocations = 27
nomad
  bootstrap = true
  known_regions = 1
  leader = true
  leader_addr = 192.168.0.206:4647
  server = true
raft
  applied_index = 21567
  commit_index = 21567
  fsm_pending = 0
  last_contact = 0
  last_log_index = 21567
  last_log_term = 26
  last_snapshot_index = 16628
  last_snapshot_term = 23
  latest_configuration = [{Suffrage:Voter ID:192.168.0.206:4647 Address:192.168.0.206:4647}]
  latest_configuration_index = 0
  num_peers = 0
  protocol_version = 2
  protocol_version_max = 3
  protocol_version_min = 0
  snapshot_version_max = 1
  snapshot_version_min = 0
  state = Leader
  term = 26
runtime
  arch = amd64
  cpu_count = 12
  goroutines = 2577
  kernel.name = linux
  max_procs = 12
  version = go1.16.5
serf
  coordinate_resets = 0
  encrypted = false
  event_queue = 0
  event_time = 1
  failed = 0
  health_score = 0
  intent_queue = 0
  left = 0
  member_time = 1
  members = 1
  query_queue = 0
  query_time = 1
vault
  token_expire_time = 2021-09-15T19:00:39+02:00
  token_ttl = 764h46m6s
  tracked_for_revoked = 0

Issue

task.config.extra_hosts is not propagated into the Docker container's /etc/hosts

Reproduction steps
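A minimal sketch, assuming the job file from the "Job file" section below is saved as freeipa.nomad (hypothetical file name):

nomad job run freeipa.nomad
nomad job status freeipa                                  # find the allocation ID
nomad alloc exec -task freeipa <alloc_id> cat /etc/hosts  # inspect the generated hosts file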

Expected Result

[root@63ec9326ae5a /]# cat /etc/hosts
# this file was generated by Nomad
127.0.0.1 localhost
::1 localhost
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

# this entry is the IP address and hostname of the allocation
# shared with tasks in the task group's network
172.26.65.239 63ec9326ae5a

# these entries are extra hosts added by the task config
127.0.0.1 freeipa.ingress.dc1.consul
[root@63ec9326ae5a /]# 

Actual Result

[root@63ec9326ae5a /]# cat /etc/hosts
# this file was generated by Nomad
127.0.0.1 localhost
::1 localhost
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

# this entry is the IP address and hostname of the allocation
# shared with tasks in the task group's network
172.26.65.239 63ec9326ae5a
[root@63ec9326ae5a /]# 

Job file (if appropriate)

job "freeipa" {
    datacenters = ["dc1"]
    group "freeipa" {
        network {
            mode = "bridge"
        }
        service {
            name = "freeipa"
            port = "443"
            connect {
                sidecar_service {}
            }
        }
        task "freeipa" {
            resources {
                memory = 2000
            }
            driver = "docker"
            config {
                image = "freeipa/freeipa-server:centos-8"
                args = [ "ipa-server-install", "-U", "-r", "DC1.CONSUL", "--no-ntp" ]
                sysctl = {
                    "net.ipv6.conf.all.disable_ipv6" = "0"
                }
                extra_hosts = ["freeipa.ingress.dc1.consul:127.0.0.1"]
            }
            env {
                HOSTNAME = "freeipa.ingress.dc1.consul"
                PASSWORD = "testtest"
            }
        }
    }
}

Nomad Server logs (if appropriate)

Nomad Client logs (if appropriate)

See also https://github.com/hashicorp/nomad/issues/7746#issuecomment-898945862

lgfa29 commented 3 years ago

Thank you for the report @DejfCold

I was able to reproduce the issue as you described, and it seems to only happen when using Consul Connect (commenting out the connect block from your service made /etc/hosts look right), so I've updated the title to highlight this.
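For reference, the change that made the issue disappear looks like this (a sketch based on the job file above, shown only to illustrate the repro, not as a fix):

service {
  name = "freeipa"
  port = "443"
  # with the connect block commented out, the extra_hosts entries appear again
  # connect {
  #   sidecar_service {}
  # }
}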

ollivainola commented 3 years ago

I also encountered this problem in our cluster (using Nomad 1.1.3 & Consul 1.10.1). As a workaround for now, I created a dnsmasq service with the extra_hosts parameters:

job "dnsmasq" {
  datacenters = ["dc1"]
  type        = "service"

  group "dns" {
    count = 1

    network {
      mode = "bridge"
      port "dns" {
          static = 53
          to = 53
      }
    }

    service {
      name = "dnsmasq"
      port = "53"
    }

    task "dnsmasq" {
      driver = "docker"

      config {
        entrypoint = ["dnsmasq", "-d"]
        image = "strm/dnsmasq:latest"
        volumes = [
            "local/dnsmasq.conf:/config/dnsmasq.conf"
        ]
        extra_hosts = [
          "my.extra.domain:127.0.0.1",
          "my2.extra.domain:127.0.0.1"
        ]
      }
      template {
        data = <<EOF
#log all dns queries
log-queries

#dont use hosts nameservers
no-resolv
EOF
        destination = "local/dnsmasq.conf"
      }
    }
  }
}

and in the job that needs the extra_hosts parameter, I removed extra_hosts and included the DNS configuration in the network stanza:

    network {
      mode = "bridge"
      dns {
        servers = ["<HOST_IP>:53"]
      }
    }
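A quick way to check that dnsmasq is answering with the extra hosts (a sketch; <HOST_IP> is the client node's address, as above):

$ dig @<HOST_IP> my.extra.domain +short
127.0.0.1

By default dnsmasq serves entries from the container's own /etc/hosts, which is where Docker's extra_hosts land; note the dnsmasq job above does not use Consul Connect itself, so it isn't affected by this bug.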
lgfa29 commented 3 years ago

@ollivainola are you using Consul Connect?

ollivainola commented 3 years ago

@lgfa29 yes. The task where I used extra_hosts uses Consul Connect. I was previously using Nomad 1.1.2 and Consul 1.10.0, where extra_hosts worked just fine with Consul Connect. For me, extra_hosts stopped working after I upgraded the cluster to the newer versions.

lgfa29 commented 3 years ago

Thanks for the extra info @ollivainola.

I've confirmed that https://github.com/hashicorp/nomad/pull/10823 broke extra_hosts with Consul Connect because the /etc/hosts file is now being shared with all tasks in the alloc. Since the Connect sidecar doesn't have any extra_hosts, it will generate an /etc/hosts without them.

lgfa29 commented 3 years ago

~@DejfCold I thought I had a quick fix for this, but this problem is actually more tricky than it looks.~

~As a workaround, you could leverage the fact that /etc/hosts is now shared between tasks in the same alloc and manually add entries in a prestart task. So something like this:~

job "countdash" {
  # ...
  group "dashboard" {
    # ...
    task "extra-hosts" {
      driver = "docker"

      config {
        image   = "busybox:1.33"
        command = "/bin/sh"
        args    = ["local/extra_hosts.sh"]
      }

      template {
        data        = <<EOF
cat <<EOT >> /etc/hosts
127.0.0.1 freeipa.ingress.dc1.consul
EOT
EOF
        destination = "local/extra_hosts.sh"
      }

      lifecycle {
        hook = "prestart"
      }
    }
  }
}

~It's not great, but hopefully it will work for now.~

~It's also worth pointing out that this workaround suffers from the same issue as my naive fix: a possible race condition between tasks trying to update /etc/hosts in parallel. Maybe a sleep in the script could help, or a loop that blocks until the Connect sidecar proxy is running.~

EDIT:

Scratch all of that 😬

A better workaround would be to set the extra_hosts on the Connect sidecar task instead of on your main task; adapting your example:

job "freeipa" {
  datacenters = ["dc1"]
  group "freeipa" {
    network {
      mode = "bridge"
    }
    service {
      name = "freeipa"
      port = "443"
      connect {
        sidecar_service {}
+       sidecar_task {
+         config {
+           extra_hosts = ["freeipa.ingress.dc1.consul:127.0.0.1"]
+         }
+       }
      }
    }
    task "freeipa" {
      resources {
        memory = 2000
      }
      driver = "docker"
      config {
        image = "freeipa/freeipa-server:centos-8"
        args  = ["ipa-server-install", "-U", "-r", "DC1.CONSUL", "--no-ntp"]
        sysctl = {
          "net.ipv6.conf.all.disable_ipv6" = "0"
        }
-       extra_hosts = ["freeipa.ingress.dc1.consul:127.0.0.1"]
      }
      env {
        HOSTNAME = "freeipa.ingress.dc1.consul"
        PASSWORD = "testtest"
      }
    }
  }
}
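To confirm the entry made it into the shared hosts file, you can exec into the main task (a sketch; substitute a real allocation ID):

$ nomad alloc exec -task freeipa <alloc_id> cat /etc/hosts
...
# these entries are extra hosts added by the task config
127.0.0.1 freeipa.ingress.dc1.consul

Since #10823 made /etc/hosts shared across all tasks in the allocation, the entry set on the sidecar task is visible from the main task as well.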
DejfCold commented 3 years ago

Thanks for the workaround! I closed the issue, but now that I think about it, I'm not sure if I should have left that to you?

lgfa29 commented 3 years ago

Hum... good question. Even though there's a reasonable workaround, I think we still need to provide a proper fix for this, so I will keep it open for now 👍

nahsi commented 3 years ago

When setting host.docker.internal:host-gateway in the sidecar_task config, I get this error: failed to build mount for /etc/hosts: invalid IP address "host.docker.internal:host-gateway"
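The error suggests Nomad validates each extra_hosts entry as a literal IP address when it builds the shared /etc/hosts, so Docker's special host-gateway value isn't understood here. One possible workaround (a sketch, untested) is to pin a literal address instead:

sidecar_task {
  config {
    # 172.17.0.1 is only an assumption (the usual docker0 gateway);
    # substitute whatever host address you actually need
    extra_hosts = ["host.docker.internal:172.17.0.1"]
  }
}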

mehdiMj-ir commented 1 year ago

Thank you very much. I had a problem adding extra hosts to /etc/hosts in a Docker container, and with your comment it's all set.

I need these extra hosts for my TLS setup, to address a specific private domain.

tgross commented 5 months ago

While this is working roughly as intended, I'm going to re-title this and label it as an enhancement. There's probably some discussion to be had about whether the extra_hosts should always be duplicated or not, but I'll leave that up to whoever picks this up for implementation to figure out. 😀