hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Cilium with nomad #12120

Open monwolf opened 2 years ago

monwolf commented 2 years ago

Hi, I'm trying to integrate Cilium with Nomad through its CNI interface, but because of the lack of documentation it has become a hard process. So far, I have found two issues in getting it running.

First of all, let me add a little context about my setup. We are running Nomad and Consul on top of Oracle Linux 8, and we are using docker-ce with namespaces and other options such as ACLs as part of the hardening.

I would like to deploy Cilium as a Nomad task, and I've been able to set up most of the required pieces to do this. I'm using a poststart task (cilium-setup) to copy the CNI plugin binary and its config onto the host.

job "cilium-system" {
  datacenters = ["ingress"]
  type        = "system"

  group "cilium-system" {

    network {
      port "http" {
        to = 4240
      }

      mode = "host"
    }

    task "cilium-setup" {
      lifecycle {
        hook = "poststart"
        sidecar = false
      }

      driver = "docker"
      config {
        security_opt = [
          "no-new-privileges"
        ]

        privileged = true
        userns_mode = "host"
        pids_limit = 200
        image = "cilium:1.9.12-RELEASE"
        entrypoint = ["sh"]

        args = [
           "/cni-install.sh"
        ]

        volumes = [
            "/etc/cni/net.d:/host/etc/cni/net.d",
            "/opt/cni/bin:/host/opt/cni/bin"
          ]
      }
      resources {
        memory = 20
      }
    }

    task "cilium" {
      vault {
        policies = ["policy-ro-sec"]
      }

      driver = "docker"
      env {
        CONSUL_HTTP_SSL = "true"
        CONSUL_CACERT = "/opt/consul/ssl/consul-ca.pem"
        CONSUL_CLIENT_CERT = "/opt/consul/ssl/server-${attr.unique.hostname}.pem"
        CONSUL_CLIENT_KEY = "/opt/consul/ssl/server-${attr.unique.hostname}-key.pem"

      }
      template {
        data = <<EORC
{{ with secret "/secret/sec/sec-cilium-system" }}
CONSUL_HTTP_TOKEN={{ .Data.CONSUL_HTTP_TOKEN }}
{{ end }}
EORC
        destination   = "/secrets/CONSUL_HTTP_TOKEN"
        env = true
      }

      service {
        name = "cilium-agent"
        check {
          name     = "cilium-agent alive"
          type     = "http"
          port     = "http"
          path     = "/hello"
          interval = "30s"
          timeout  = "5s" 
          check_restart {
            limit = "5"
            grace = "120s"
          }
          header {
              brief = ["true"]
          }
        }
      }

      config {
        security_opt = [
          "no-new-privileges"
        ]
        privileged = true
        userns_mode = "host"
        pids_limit = 200
        image = "cilium:1.9.12-RELEASE"
        ports = ["http"]

        entrypoint = ["cilium-agent"]
        args = [
           "--enable-ipv6=false",
           "--kvstore", "consul",
           "--kvstore-opt", "consul.address=${attr.unique.hostname}:8500", 
           "-t", "vxlan",
        ]

        volumes = [
          "/var/run/docker.sock:/var/run/docker.sock",
          "/var/run/cilium:/var/run/cilium",
          "/sys/fs/bpf:/sys/fs/bpf",
          "/var/run/docker/netns:/var/run/docker/netns:rshared",
          "/var/run/netns:/var/run/netns:rshared",
          "/opt/consul/ssl:/opt/consul/ssl:ro",
          "/etc/cni/net.d:/etc/cni/net.d",
          "/opt/cni/bin:/opt/cni/bin"
        ]

      }

      resources {
        memory = 100
      }
    }
    task "cilium-docker-plugin" {

      driver = "docker"

      config {
        security_opt = [
          "no-new-privileges"
        ]
        privileged = true
        userns_mode = "host"
        pids_limit = 200
        image = "cilium-docker-plugin:1.9.12-RELEASE"
        volumes = [
          "/var/run/docker.sock:/var/run/docker.sock",
          "/var/run/cilium:/var/run/cilium",
          "/run/docker/plugins:/run/docker/plugins"
        ]
      }

      resources {
        memory = 20
      }
    }
  }
}

The first problem I see is that I'm not able to use this plugin until I restart Nomad. Is there any way to force a reload of this config from the API?

The second one is that I'm not able to use the CNI configuration provided by Cilium to make it work in Nomad.

The cilium cni config file looks like:

{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "type": "cilium-cni",
  "enable-debug": false
}

But when I restart Nomad I see in the log:

fingerprinting failed: failed to load CNI config list file /opt/cni/conf/05-cilium.conflist: error parsing configuration list: no plugins in list

I tried to add the plugins key:

{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "type": "cilium-cni",
  "enable-debug": false,
  "plugins": []
}

But I'm still getting the error:

fingerprinting failed: failed to load CNI config list file /opt/cni/conf/05-cilium.conflist: error parsing configuration list: no plugins in list

Could you help me to address this?

lgfa29 commented 2 years ago

Hi @monwolf,

I'm not too familiar with CNI or Cilium, but looking at the config parsing code, we have logic to parse the file differently depending on the file extension: https://github.com/hashicorp/nomad/blob/v1.2.6/client/fingerprint/cni.go#L37-L57

From your error message it seems like you need to set your file extension as .conf instead of .conflist? https://go.dev/play/p/mMo47nHcM6I

Give it a try like that and let us know how it goes 🙂

monwolf commented 2 years ago

Hi @lgfa29 ,

I ran the test again, and now it fails during the allocation phase. On startup it loads the Cilium config, but when I run the job it fails:

2022-02-25T08:22:07.295+0100 [DEBUG] client.fingerprint_mgr: built-in fingerprints: fingerprinters=["arch", "bridge", "cgroup", "cni", "consul", "cpu", "host", "memory", "network", "nomad", "signal", "storage", "vault", "env_gce", "env_azure", "env_aws"]
2022-02-25T08:22:07.314+0100 [DEBUG] client.fingerprint_mgr: detected CNI network: name=cilium
2022-02-25T08:22:15.410+0100 [DEBUG] client.fingerprint_mgr: detected fingerprints: node_attrs=["arch", "bridge", "cgroup", "cni", "consul", "cpu", "host", "network", "nomad", "signal", "storage", "vault"]
2022-02-25T08:22:20.016+0100 [ERROR] client.alloc_runner: prerun failed: alloc_id=86ebc0f2-d1e9-10c1-465e-5cf01a5eb647 error="pre-run hook \"network\" failed: failed to configure networking for alloc: cni config load failed: error parsing configuration list: no 'plugins' key: failed to load cni config"

This is the file:

[root@client conf]# ls /opt/cni/conf/
05-cilium.conf
[root@client conf]# cat /opt/cni/conf/05-cilium.conf 
{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "type": "cilium-cni",
  "enable-debug": false
}

And this is the job definition:

job "helloworld" {
  region      = "global"
  datacenters = ["ingress"]
  type        = "service"
  priority    = 50

  update {
    stagger      = "10s"
    max_parallel = 1
  }

  group "helloworld" {
    count = 1
    network {
      mode="cni/cilium"
      port "http" {
        to = 80
      }
    }

    update {
      max_parallel     = 1
      min_healthy_time = "30s"
      healthy_deadline = "10m"
      progress_deadline = "11m"
      auto_revert      = true
    }

    restart {
      attempts = 10
      interval = "5m"
      delay    = "25s"
      mode     = "delay"
    }

    task "helloworld" {
      driver = "docker"

      config {

        security_opt = [
          "no-new-privileges"
        ]
        pids_limit = 20

        image =  "helloworld:1.0.0-RELEASE"
        ports = ["http"]

        force_pull = false
      }

      service {
        name = "helloworld"
        port = "http"
      }

      resources {
        memory = 20
      }

      logs {
        max_files     = 1
        max_file_size = 15
      }

      kill_timeout = "20s"
    }
  }
}

PS: I'm running Nomad v1.2.5

lgfa29 commented 2 years ago

Ah sorry @monwolf, I was reading the Cilium docs and I think I misunderstood the file format.

I think the file should look like this and be called cilium.conflist like you had before:

{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
      "type": "cilium-cni",
      "enable-debug": false
    }
  ]
}

With this file, fingerprinting should be successful:

    2022-02-25T16:39:59.256Z [DEBUG] client.fingerprint_mgr: built-in fingerprints: fingerprinters=["arch", "bridge", "cgroup", "cni", "consul", "cpu", "host", "memory", "network", "nomad", "signal", "storage", "vault", "env_aws", "env_gce", "env_azure", "env_digitalocean"]
    2022-02-25T16:39:59.256Z [INFO]  client.fingerprint_mgr.cgroup: cgroups are available
    2022-02-25T16:39:59.256Z [DEBUG] client.fingerprint_mgr: detected CNI network: name=cilium
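To make the difference between the two file shapes concrete, here is a minimal Python sketch. It is not Nomad's or libcni's actual code; `classify_cni_config` and `to_conflist` are hypothetical helper names, and the error strings only imitate the ones seen in the logs above. It treats a file as a list config when it has a `plugins` key, rejects an empty list, and shows one way to wrap a single-plugin config into the list form.

```python
import json


def classify_cni_config(text: str) -> str:
    """Classify CNI JSON as 'conflist' or 'conf'; raise ValueError if it is neither."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("top-level JSON value must be an object")
    if "plugins" in doc:
        plugins = doc["plugins"]
        # An empty list is what produced "no plugins in list" earlier in the thread.
        if not isinstance(plugins, list) or not plugins:
            raise ValueError("no plugins in list")
        for plugin in plugins:
            if not isinstance(plugin, dict) or not plugin.get("type"):
                raise ValueError("every plugin entry needs a 'type'")
        return "conflist"
    if doc.get("type"):
        return "conf"
    raise ValueError("neither 'plugins' nor 'type' is present")


def to_conflist(text: str) -> dict:
    """Wrap a single-plugin config in list form, keeping name and cniVersion on top."""
    kind = classify_cni_config(text)
    doc = json.loads(text)
    if kind == "conflist":
        return doc
    # Everything except the network-level keys moves into the single plugin entry.
    plugin = {k: v for k, v in doc.items() if k not in ("name", "cniVersion")}
    return {
        "cniVersion": doc.get("cniVersion"),
        "name": doc.get("name"),
        "plugins": [plugin],
    }
```

Feeding the original Cilium file to `to_conflist` yields the same structure as the `cilium.conflist` shown above, while the variant with `"plugins": []` is rejected.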

Deploying Cilium itself may be a little more work. From the issue you linked it seems like Cilium would need a Kubernetes Operator to run as well?

Though I really hope you are able to get it to work, Cilium is a cool project 🙂

monwolf commented 2 years ago

@lgfa29 Your suggestion worked, thanks :)

The next step is to solve a derived problem:

2022-02-25T19:41:40.906+0100 [WARN]  client.alloc_runner.runner_hook: failed to configure network: alloc_id=5104ebca-15a8-f5a6-6eec-938993d265a4 err="Unable to create endpoint: Cilium API client timeout exceeded" attempt=2

But I'm happy to see this error 👯

My other question was: if I deploy Cilium as a Nomad task, do you know if I can refresh the fingerprint of the Nomad agent to enable this config for other tasks?

As for whether I need a Kubernetes operator, at this moment I don't know. I based my first steps on these docs:

https://docs.cilium.io/en/v1.9/gettingstarted/docker/

I need to solve the problems one by one as they appear, because taken all together it is quite a broad topic.

lgfa29 commented 2 years ago

My other question was: if I deploy Cilium as a Nomad task, do you know if I can refresh the fingerprint of the Nomad agent to enable this config for other tasks?

I'm not sure I understand the question 🤔

You will need that config JSON file on every client. I think the Cilium Docker image downloads its plugin automatically, but you may also need to have the CNI plugins in /opt/cni/bin.

I need to solve the problems one by one as they appear, because taken all together it is quite a broad topic. 👍

I hope you are able to get it working.
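As a rough way to check the point about plugin binaries on each client, the following Python sketch lists the plugin `type` names from a conflist that have no executable in the plugin directory. The helper name `missing_plugin_binaries` is hypothetical, and the default path is simply the /opt/cni/bin directory mentioned above; adjust both paths to wherever your client is configured to look.

```python
import json
import os


def missing_plugin_binaries(conflist_path: str, bin_dir: str = "/opt/cni/bin") -> list:
    """Return plugin 'type' names from a conflist that have no executable in bin_dir."""
    with open(conflist_path) as fh:
        doc = json.load(fh)
    missing = []
    for plugin in doc.get("plugins", []):
        name = plugin.get("type", "")
        candidate = os.path.join(bin_dir, name)
        # A usable plugin is a regular file with an execute bit set.
        if not (name and os.path.isfile(candidate) and os.access(candidate, os.X_OK)):
            missing.append(name)
    return missing
```

An empty result only means the binaries are present and executable; it says nothing about whether the Nomad agent has already fingerprinted the network.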

monwolf commented 2 years ago

I've been out of the office for a few days, but now I'm here again :D

I copied the pattern from the Helm chart provided by Cilium in order to install the latest version of the CNI plugin when the container starts:

    task "cilium-setup" {
      lifecycle {
        hook = "poststart"
        sidecar = false
      }

      driver = "docker"
      config {
        security_opt = [
          "no-new-privileges"
        ]

        privileged = true
        userns_mode = "host"
        pids_limit = 200
        image = "cilium:1.9.12-RELEASE"
        entrypoint = ["sh"]

        args = [
           "/cni-install.sh"
        ]

        volumes = [
            "/etc/cni/net.d:/host/etc/cni/net.d",
            "/opt/cni/bin:/host/opt/cni/bin"
          ]
      }
      resources {
        memory = 20
      }
    }

This will add a config and the CNI binary to the host filesystem, so my question is whether there is any way to force Nomad to reload the config once the script has finished creating these files.

j4ckzh0u commented 2 years ago

@monwolf does it work?

Thx.

monwolf commented 2 years ago

Hey @j4ckzh0u

This task copies the config and the CNI binary from the Cilium container onto the host. But Nomad still needs to be restarted in order to refresh the fingerprinting.

Regards,

j4ckzh0u commented 2 years ago

@monwolf okay, thanks!

lgfa29 commented 2 years ago

is whether there is any way to force Nomad to reload the config once the script has finished creating these files?

@monwolf I don't think you will be able to reload the Nomad config from a task. You can try sending a SIGHUP signal to the Nomad agent process, but I'm not sure if it will reload the CNI config.

It's probably worth a quick try though. Run the job and manually signal the Nomad agent and see if it works.
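For the manual experiment, a minimal Python sketch of the signalling step might look like this. The helper name `signal_reload` is hypothetical, the PID has to come from however you normally look up the agent process, and, as noted above, it is not confirmed that the agent re-reads its CNI config on SIGHUP.

```python
import os
import signal


def signal_reload(pid: int) -> None:
    """Send SIGHUP to the process with the given PID (POSIX only).

    Raises ProcessLookupError if no such process exists and
    PermissionError if the caller may not signal it.
    """
    os.kill(pid, signal.SIGHUP)
```

After sending the signal, the agent's debug log is the place to check whether a new "detected CNI network" line appears.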

pruiz commented 2 years ago

Hi,

I have a working setup with cilium-agent and cilium-cni running on top of Nomad, and I can deploy services/tasks on top of cni/cilium. However, I still need to work out how to assign labels to endpoints during creation, as endpoints currently get stuck with the reserved:init label and no assigned identity. Not sure if cilium-operator may help with this. I'll update when I have made some more progress.

Regards

pruiz commented 2 years ago

OK, my problem was quite simple: as there is no k8s to query for labels, cilium-agent sets 'reserved:init' as the only/default label for newly created endpoints, so all traffic for such endpoints is denied by default.

However, if I manually add any desired labels to the endpoint, and then (manually again) remove reserved:init, everything starts working smoothly. Now I just need to think of the best way to provision labels automatically, probably from a wrapper script around the cilium-cni command.

issue-account commented 1 year ago

@pruiz Were you able to solve the problem?

pruiz commented 1 year ago

@Hanmask21 yes, there are two options:

1) If you use Cilium's docker-plugin, all labels assigned to the task are handled automatically by Cilium, so the endpoint transitions out of reserved:init automatically.
2) For the CNI plugin, right now I've made a wrapper script around the cilium-cni driver, which queries Nomad for labels and assigns them to the container on startup. What I have now is not really polished; I still need to work on it and publish it (along with a sample Nomad job file) in some git repo, but I am busy for a couple of weeks with other stuff.

In the meantime, a pseudo-wrapper script for CNI would look like:

#!/bin/bash

CMDNAME=$(basename "$0")

# Declare the env vars required by the nomad client.
export NOMAD_ADDR=https://localhost:4646
export NOMAD_CACERT=/etc/nomad.d/tls/nomad-ca.pem
export NOMAD_CLIENT_CERT=/etc/nomad.d/tls/nomad-ixn-client.pem
export NOMAD_CLIENT_KEY=/etc/nomad.d/tls/nomad-ixn-client-key.pem

# Look up the allocation's namespace and name from Nomad.
NOMAD_NAMESPACE=$(nomad alloc status -json "$CNI_CONTAINERID" | jq -r .Namespace)
NOMAD_POD_NAME=$(nomad alloc status -json "$CNI_CONTAINERID" | jq -r .Name)

# Build the Kubernetes-style args for cilium-cni, keeping any args already passed in.
CNI_ARGS_EXTRA="${CNI_ARGS:-}"
CNI_ARGS="K8S_POD_NAMESPACE=$NOMAD_NAMESPACE;K8S_POD_NAME=$NOMAD_POD_NAME;K8S_POD_INFRA_CONTAINER_ID=nomad_init_$CNI_CONTAINERID"
[ -n "$CNI_ARGS_EXTRA" ] && CNI_ARGS="$CNI_ARGS;$CNI_ARGS_EXTRA"
export CNI_ARGS

exec /opt/cni/bin/cilium-cni "$@"

I am deploying this script to /opt/cni/bin/cilium4nomad, and using that same name as the plugin's "type" in the /etc/cni/net.d/cilium.conflist file.

Regards

PS: In the future this logic may be added to cilium-cni, or the cilium-operator could take care of requesting labels for containers by querying the Nomad servers directly, but right now this just works.
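The string handling in the wrapper can be restated in Python to make the resulting CNI_ARGS value easier to see. This is only an illustrative sketch: `alloc_identity` and `build_cni_args` are hypothetical names, and the `Namespace` and `Name` fields are the ones the script above reads with jq from `nomad alloc status -json`.

```python
import json


def alloc_identity(alloc_status_json: str) -> tuple:
    """Extract (Namespace, Name) from `nomad alloc status -json` output."""
    alloc = json.loads(alloc_status_json)
    return alloc["Namespace"], alloc["Name"]


def build_cni_args(namespace: str, name: str, container_id: str, extra: str = "") -> str:
    """Compose the CNI_ARGS value that the wrapper hands to cilium-cni."""
    parts = [
        f"K8S_POD_NAMESPACE={namespace}",
        f"K8S_POD_NAME={name}",
        f"K8S_POD_INFRA_CONTAINER_ID=nomad_init_{container_id}",
    ]
    if extra:
        # Keep whatever the runtime already passed in CNI_ARGS.
        parts.append(extra)
    return ";".join(parts)
```

The pairs are joined with semicolons, and any pre-existing CNI_ARGS content goes last, matching the order used in the shell wrapper.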

siennathesane commented 1 year ago

Thanks @pruiz and @monwolf for working through this!

robloxrob commented 1 year ago

Awesome work!

issue-account commented 1 year ago

@pruiz "For the CNI plugin, right now I've made a wrapper script around the cilium-cni driver, which queries Nomad for labels and assigns them to the container on startup. What I have now is not really polished; I still need to work on it and publish it (along with a sample Nomad job file) in some git repo, but I am busy for a couple of weeks with other stuff."

Did you manage to finish your idea?

brotherdust commented 1 year ago

Would like to know as well. I'm happy to help document the solution if you need help.

betterthanbreakfast commented 1 year ago

Would like to know as well.

I'm also interested.

brotherdust commented 1 year ago

@pruiz , I would seriously pay you money right now to know how you did this. I've been bashing my head against the wall for a week now.

pruiz commented 1 year ago

Hi,

Regarding this, I was waiting for a conclusion at https://github.com/hashicorp/nomad/issues/13824, because in order to really integrate Cilium with Nomad, including being able to mix and match Cilium and Consul Connect, a mechanism to extend Nomad's default bridge configuration was needed. But in the end it looks like there won't be such a mechanism, so sadly I see little sense in continuing to invest in this.

Regards Pablo

deepankarsharma commented 1 year ago

We are trying to get Cilium working with Nomad as well at my workplace and would greatly appreciate enabling features that make this possible.

TimMensch commented 1 year ago

I find it disheartening that the page that this is based on:

https://docs.cilium.io/en/v1.9/gettingstarted/docker/

...no longer exists in the current docs, nor does the Docker example it references exist any longer.

All of the Cilium docs I'm finding seem to indicate it requires Kubernetes.

It does seem that Nomad Cilium support is aspirational rather than a reality. 😞

mjohnson9 commented 1 year ago

There was a relatively recent post on the Cilium blog about someone using Cilium with Nomad[^1]. Unfortunately, it seems that they have not yet released their work[^2].

Dan and the team at Cosmonic wrote a custom operator, integrated with Cilium and capable of working with Nomad - they are looking at open sourcing it in the future.

[^1]: Cosmonic User Story -- Cilium Blog, Jan 11 2023
[^2]: Cosmonic, Repository List

Cronocide commented 1 year ago

@protochron Can you offer any insight that might help us along this path? A working nomad-cilium integration would be a game changer for the Nomad community.

suikast42 commented 1 year ago

Here is a talk about Cilium integration: https://www.youtube.com/watch?v=DSkf9Y06-lE&ab_channel=HashiCorp

xmulligan commented 1 year ago

And a blog about it too https://cosmonic.com/blog/engineering/netreap-bringing-cilium-to-the-world-outside-kubernetes

protochron commented 1 year ago

Here's a more in-depth guide on how we run Cilium with Nomad: https://cosmonic.com/blog/engineering/netreap-a-practical-guide-to-running-cilium-in-nomad. We also just open sourced a tool that you can run as a system job in Nomad to sync Cilium policies, endpoint metadata and make sure that your IP allocations are cleaned up: https://github.com/cosmonic/netreap

robloxrob commented 1 year ago

Thank you so much for sharing this! We all appreciate your work on this and the contribution to the community.

suikast42 commented 1 year ago

Awesome. Thanks for the contribution.

brotherdust commented 1 year ago

Oh man! It's like Christmas early! Thanks!

issue-account commented 1 year ago

The container does not resolve addresses via DNS:

curl: (6) Failed to resolve host: example.service.consul

But if you check the service in Consul, it is registered there.

The Cilium parameters are as follows:

command: cilium-agent --kvstore consul --kvstore-opt consul.address=127.0.0.1:8500 -t geneve --enable-ipv6=false --prometheus-serve-addr="127.0.0.1:9962" --enable-l7-proxy=false

The policy config:

[
  {
    "labels": [
      {
        "key": "io.cosmonic.cilium_health"
      }
    ],
    "endpointSelector": {
      "matchLabels": {
        "reserved:health": ""
      }
    },
    "ingress": [
      {
        "fromEntities": ["remote-node", "host"]
      }
    ],
    "egress": [
      {
        "fromEntities": ["remote-node", "host"]
      }
    ]
  },
  {
    "endpointSelector": {},
    "labels": [
      {
        "key": "io.cosmonic.default_rule"
      }
    ],
    "ingress": [
      {
        "fromCIDRSet": [
          {
            "cidr": "0.0.0.0/0"
          }
        ]
      },
      {
        "fromEntities": ["host", "remote-node"]
      }
    ],
    "egress": [
      {
        "toEntity": ["host"],
        "toPorts": [
          {
            "ports": [
              {
                "port": "53",
                "protocol": "ANY"
              },
              {
                "port": "8600",
                "protocol": "ANY"
              }
            ]
          }
        ]
      },
      {
        "toCIDRSet": [
          {
            "cidr": "0.0.0.0/0"
          }
        ]
      }
    ]
  }
]
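
One quick sanity check for a policy like the one above is to confirm that each ingress or egress entry only uses keys valid for its direction. The Python sketch below does that under a stated assumption: the two key sets reflect Cilium's rule schema as understood here and may be incomplete, so a reported key is "unexpected by this sketch" rather than definitively invalid. `lint_policy` is a hypothetical helper name.

```python
import json

# Keys this sketch expects per direction (assumed, possibly incomplete).
INGRESS_KEYS = {"fromEndpoints", "fromRequires", "fromCIDR", "fromCIDRSet",
                "fromEntities", "toPorts"}
EGRESS_KEYS = {"toEndpoints", "toRequires", "toCIDR", "toCIDRSet", "toEntities",
               "toServices", "toFQDNs", "toGroups", "toPorts"}


def lint_policy(policy_json: str) -> list:
    """Return one message per ingress/egress key that the sketch does not expect."""
    problems = []
    for rule_index, rule in enumerate(json.loads(policy_json)):
        for direction, known in (("ingress", INGRESS_KEYS), ("egress", EGRESS_KEYS)):
            for entry_index, entry in enumerate(rule.get(direction, [])):
                for key in entry:
                    if key not in known:
                        problems.append(
                            f"rule {rule_index} {direction}[{entry_index}]: "
                            f"unexpected key '{key}'"
                        )
    return problems
```

Run against the policy above, this reports the `fromEntities` entry under `egress` in the first rule and the `toEntity` key in the second rule. Whether either of those is related to the DNS failure is not established here.
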

issue-account commented 1 year ago

@protochron @mjohnson9 can you help? https://github.com/hashicorp/nomad/issues/12120#issuecomment-1562918786

mjohnson9 commented 1 year ago

protochron @mjohnson9 can you help? #12120 (comment)

I think you may have me confused with someone else. I’m afraid I don’t have the knowledge to help, as much as I’d like to.

lgfa29 commented 1 year ago

Hi @Hanmask21 👋

I would suggest you open an issue on the Netreap repo. Folks in this thread are unlikely to be able to help you.