hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Create network bridge and iptables chains at startup #6618

Open tgross opened 4 years ago

tgross commented 4 years ago

In https://github.com/hashicorp/nomad/issues/6567#issuecomment-549519686 we encountered a case where concurrency issues in the CNI plugins caused allocation failures for Connect-enabled jobs. A similar issue was fixed upstream in https://github.com/containernetworking/plugins/pull/366.

While we should (and will) help patch upstream, it might improve the user experience and reduce Nomad bug reports if we created the network bridge and iptables chains we need for Connect-enabled jobs at client startup, rather than waiting for the first job allocation. This would include creating the bridge device itself and the shared iptables chains that the CNI plugins otherwise create on first use.

I'm not sure we have a great place to do this work on startup, but maybe @nickethier @shoenig or @schmichael have an idea?
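For illustration only, here is a minimal sketch of what doing that setup ahead of time could look like, written as a shell script run before the client starts. This is not an existing Nomad feature; it assumes the default bridge name "nomad" and uses the CNI-HOSTPORT-SETMARK chain that the portmap plugin otherwise creates on first use:

#!/usr/bin/env bash
# Hypothetical pre-start sketch: create the bridge and one of the shared CNI
# chains idempotently, so the first concurrent allocations do not race to create them.
set -euo pipefail

# Create the bridge device only if it does not already exist, then bring it up.
ip link show nomad >/dev/null 2>&1 || ip link add name nomad type bridge
ip link set nomad up

# "iptables -N" exits non-zero with "File exists" if the chain is already there,
# so check for it first.
iptables -t nat -nL CNI-HOSTPORT-SETMARK >/dev/null 2>&1 \
  || iptables -t nat -N CNI-HOSTPORT-SETMARK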

apollo13 commented 4 years ago

I think I am running into this (I hope the logs and the job files prove helpful -- sorry if this is a completely different issue):

Nomad version

Nomad v0.11.1 (b43457070037800fcc8442c8ff095ff4005dab33), CNI plugins (firewall etc.) v0.8.5

Operating system and Environment details

Debian 10

Issue

After rapidly submitting two jobs that use group networking (and, in one case, connect), I get a failed setup of the job with the connect stanza (though which job fails might just come down to ordering):

May 05 16:03:36 nomad01 nomad[2826]:     2020/05/05 16:03:36.555463 [INFO] (runner) creating new runner (dry: false, once: false)
May 05 16:03:36 nomad01 nomad[2826]:     2020/05/05 16:03:36.556254 [INFO] (runner) creating watcher
May 05 16:03:36 nomad01 nomad[2826]:     2020/05/05 16:03:36.556555 [INFO] (runner) starting
May 05 16:03:36 nomad01 nomad[2826]:     2020/05/05 16:03:36.559227 [INFO] (runner) rendered "(dynamic)" => "/opt/paas/data/nomad/alloc/99d58ccb-831e-c275-c9ff-ca344fb9077d/traefik/local/dynamic_conf.yml"
May 05 16:03:36 nomad01 nomad[2826]:     2020/05/05 16:03:36.675962 [INFO] (runner) rendered "(dynamic)" => "/opt/paas/data/nomad/alloc/99d58ccb-831e-c275-c9ff-ca344fb9077d/traefik/local/traefik.yml"
May 05 16:03:36 nomad01 nomad[2826]:     2020-05-05T16:03:36.762Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=66744447f60ecd06e2ec2a63ba393fbdd8a9c64ca97f84cde2c5b085b3e2efd5
May 05 16:03:37 nomad01 nomad[2826]:     2020-05-05T16:03:37.029Z [INFO]  client.driver_mgr.docker: started container: driver=docker container_id=66744447f60ecd06e2ec2a63ba393fbdd8a9c64ca97f84cde2c5b085b3e2efd5
May 05 16:04:31 nomad01 nomad[2826]:     2020-05-05T16:04:31.820Z [WARN]  client.alloc_runner.runner_hook: failed to configure bridge network: alloc_id=cb570e30-7241-a6b3-ae42-676dba36ab40 err="unable to create chain CNI-HOSTPORT-SETMARK: running [/usr/sbin/iptables -t nat -N CNI-HOSTPORT-SETMARK --wait]: exit status 4: iptables v1.8.2 (nf_tables):  CHAIN_USER_ADD failed (File exists): chain CNI-HOSTPORT-SETMARK
May 05 16:04:31 nomad01 nomad[2826]: " attempt=1
May 05 16:04:31 nomad01 nomad[2826]:     2020-05-05T16:04:31.948Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=2c225f95-080f-52a9-8407-f503ac4df3a3 task=connect-proxy-netbox-redis path=/opt/paas/data/nomad/alloc/2c225f95-080f-52a9-8407-f503ac4df3a3/alloc/logs/.connect-proxy-netbox-redis.stdout.fifo @module=logmon timestamp=2020-05-05T16:04:31.947Z
May 05 16:04:31 nomad01 nomad[2826]:     2020-05-05T16:04:31.948Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=2c225f95-080f-52a9-8407-f503ac4df3a3 task=connect-proxy-netbox-redis path=/opt/paas/data/nomad/alloc/2c225f95-080f-52a9-8407-f503ac4df3a3/alloc/logs/.connect-proxy-netbox-redis.stderr.fifo @module=logmon timestamp=2020-05-05T16:04:31.948Z
May 05 16:04:31 nomad01 nomad[2826]:     2020-05-05T16:04:31.965Z [INFO]  client.alloc_runner.task_runner.task_hook.consul_si_token: derived SI token: alloc_id=2c225f95-080f-52a9-8407-f503ac4df3a3 task=connect-proxy-netbox-redis task=connect-proxy-netbox-redis si_task=netbox-redis
May 05 16:04:32 nomad01 nomad[2826]:     2020-05-05T16:04:32.138Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=ce8ca7d938ff37b84d3690aaeb255e6fdc877f267944508a2b5222b75cacbbc0
May 05 16:04:32 nomad01 nomad[2826]:     2020-05-05T16:04:32.292Z [INFO]  client.driver_mgr.docker: started container: driver=docker container_id=ce8ca7d938ff37b84d3690aaeb255e6fdc877f267944508a2b5222b75cacbbc0
May 05 16:04:32 nomad01 nomad[2826]:     2020-05-05T16:04:32.327Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=2c225f95-080f-52a9-8407-f503ac4df3a3 task=redis @module=logmon path=/opt/paas/data/nomad/alloc/2c225f95-080f-52a9-8407-f503ac4df3a3/alloc/logs/.redis.stdout.fifo timestamp=2020-05-05T16:04:32.326Z
May 05 16:04:32 nomad01 nomad[2826]:     2020-05-05T16:04:32.327Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=2c225f95-080f-52a9-8407-f503ac4df3a3 task=redis @module=logmon path=/opt/paas/data/nomad/alloc/2c225f95-080f-52a9-8407-f503ac4df3a3/alloc/logs/.redis.stderr.fifo timestamp=2020-05-05T16:04:32.327Z
May 05 16:04:32 nomad01 nomad[2826]:     2020-05-05T16:04:32.398Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=563f35419d477ec9ded3b2775c639c6c9ed5850284cbad61a55c419590c21797
May 05 16:04:32 nomad01 nomad[2826]:     2020-05-05T16:04:32.563Z [INFO]  client.driver_mgr.docker: started container: driver=docker container_id=563f35419d477ec9ded3b2775c639c6c9ed5850284cbad61a55c419590c21797
May 05 16:04:33 nomad01 nomad[2826]:     2020-05-05T16:04:33.125Z [WARN]  client.alloc_runner.runner_hook: failed to configure bridge network: alloc_id=cb570e30-7241-a6b3-ae42-676dba36ab40 err="container veth name provided (eth0) already exists" attempt=2
May 05 16:04:34 nomad01 nomad[2826]:     2020-05-05T16:04:34.890Z [WARN]  client.alloc_runner.runner_hook: failed to configure bridge network: alloc_id=cb570e30-7241-a6b3-ae42-676dba36ab40 err="container veth name provided (eth0) already exists" attempt=3
May 05 16:04:34 nomad01 nomad[2826]:     2020-05-05T16:04:34.890Z [ERROR] client.alloc_runner: prerun failed: alloc_id=cb570e30-7241-a6b3-ae42-676dba36ab40 error="pre-run hook "network" failed: failed to configure networking for alloc: failed to configure bridge network: container veth name provided (eth0) already exists"
May 05 16:04:34 nomad01 nomad[2826]:     2020-05-05T16:04:34.910Z [INFO]  client.gc: marking allocation for GC: alloc_id=cb570e30-7241-a6b3-ae42-676dba36ab40
May 05 16:04:37 nomad01 nomad[2826]:     2020-05-05T16:04:37.202Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=9f58b0b8-161c-6693-c854-e1b5e729f05c task=crm path=/opt/paas/data/nomad/alloc/9f58b0b8-161c-6693-c854-e1b5e729f05c/alloc/logs/.crm.stdout.fifo @module=logmon timestamp=2020-05-05T16:04:37.202Z
May 05 16:04:37 nomad01 nomad[2826]:     2020-05-05T16:04:37.202Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=9f58b0b8-161c-6693-c854-e1b5e729f05c task=crm @module=logmon path=/opt/paas/data/nomad/alloc/9f58b0b8-161c-6693-c854-e1b5e729f05c/alloc/logs/.crm.stderr.fifo timestamp=2020-05-05T16:04:37.202Z
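
For completeness, a few diagnostic commands one could run on the affected client to confirm the state these errors describe. This assumes iproute2 and iptables are installed on the host and that each bridge-mode allocation gets a named network namespace visible to ip netns (an assumption about this setup, not something taken from the logs):

# Hypothetical diagnostics for the two failures above.
# 1. Is the chain the portmap plugin tried to create already present?
iptables -t nat -S CNI-HOSTPORT-SETMARK

# 2. Which network namespaces are left on the host, and does any of them
#    already contain an eth0 interface (the "already exists" error)?
ip netns list
for ns in $(ip netns list | awk '{print $1}'); do
  echo "== $ns"
  ip netns exec "$ns" ip link show eth0 2>/dev/null || true
done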

Reproduction steps

Not sure, I just submitted two job files (see below).
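
As a hypothetical sketch of that submission (the filenames crm.nomad and netbox.nomad are made up here; their contents are the two files below):

# Register both jobs back to back so their first allocations land on the
# same client at roughly the same time.
nomad job run crm.nomad &
nomad job run netbox.nomad &
wait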

Job file (if appropriate)

File 1:

job "g.crm" {
  datacenters = ["dc1"]

  group "web" {
    network {
      mode = "bridge"
      port "crm" { to = 8000 }
    }

    service {
      name = "crm"
      port = "crm"
    }

    task "crm" {
      driver = "docker"

      config {
        image = "some_docker_image listening on 8000"
      }

      template {
        data = <<EOH
{{ key "bap/crm/env" }}
EOH

        destination = "secrets/file.env"
        env         = true
      }
    }
  }
}

File 2:

job "infra.netbox" {
  datacenters = ["dc1"]

  group "web" {

    network {
      mode = "bridge"
      port "netbox" { to = 8001 }
      port "static" { to = 80 }
    }

    service {
      name = "netbox-redis-connect"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "netbox-redis"
              local_bind_port  = 6379
            }
          }
        }
      }
    }

    service {
      name = "netbox"
      port = "netbox"
      tags = [
          "traefik.enable=true",
          "traefik.http.routers.netbox.rule=Host(`netbox.somedomain.com`)"
      ]
    }

    service {
      name = "netbox-staticfiles"
      port = "static"
      tags = [
          "traefik.enable=true",
          "traefik.http.routers.netbox-staticfiles.rule=Host(`netbox.somedomain.com`) && PathPrefix(`/static/`)"
      ]
    }

    task "netbox" {
      driver = "docker"

      config {
        image = "netboxcommunity/netbox:v2.7.12"

        entrypoint = ["/local/entrypoint.sh"]
        command = "gunicorn"
        args = ["-c", "/etc/netbox/config/gunicorn_config.py", "netbox.wsgi"]
      }

      template {
        data = <<EOH
SKIP_STARTUP_SCRIPTS=true
{{ key "infra/netbox/env" }}
EOH

        destination = "secrets/file.env"
        env         = true
      }

      template {
        data = <<EOH
#!/bin/bash
set -e
mkdir -p /alloc/data/netbox-static
rm -rf /opt/netbox/netbox/static
ln -s /alloc/data/netbox-static /opt/netbox/netbox/static
exec /opt/netbox/docker-entrypoint.sh "$@"
EOH
        destination = "local/entrypoint.sh"
        perms = "755"
      }

    }

    task "static" {
      driver = "docker"

      config {
        image = "nginx:1.17-alpine"
        command = "/usr/sbin/nginx"
        args = ["-c", "/local/nginx.conf"]
      }

      resources {
        memory = 100
      }

      template {
        data = <<EOH
user  nginx;
worker_processes  auto;
daemon off;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    server {
      listen       80;
      server_name  localhost;

      location /static/ {
          alias   /alloc/data/netbox-static/;
          autoindex off;
      }

      error_page   500 502 503 504  /50x.html;
      location = /50x.html {
          root   /usr/share/nginx/html;
      }
  }

}
EOH
        destination = "local/nginx.conf"
      }
    }
  }

  group "redis" {
    network {
      mode = "bridge"
    }

    service {
      name = "netbox-redis"
      port = "6379"

      connect {
        sidecar_service {}
      }
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:5-alpine"
      }

    }
  }

  group "netbox-worker" {
    network {
      mode = "bridge"
    }

    service {
      name = "netbox-worker"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "netbox-redis"
              local_bind_port  = 6379
            }
          }
        }
      }
    }

    task "netbox-worker" {
      driver = "docker"

      config {
        image = "netboxcommunity/netbox:v2.7.12"

        entrypoint = ["python3", "/opt/netbox/netbox/manage.py"]
        command = "rqworker"
      }

      template {
        data = <<EOH
SKIP_STARTUP_SCRIPTS=true
{{ key "infra/netbox/env" }}
EOH

        destination = "secrets/file.env"
        env         = true
      }

    }
  }
}

apollo13 commented 3 years ago

@nickethier Did you make any progress in this area? I rebooted a node today (without draining it first) and it resulted in different weird errors (including segfaults in nftables/iptables). Sadly I do not have logs of the previous reboots -- I will see if I can gather more next time.

tgross commented 2 years ago

Related: https://github.com/hashicorp/nomad/issues/12103