hashicorp / levant

An open source templating and deployment tool for HashiCorp Nomad jobs
Mozilla Public License 2.0
825 stars 125 forks

Unable to define namespace within job template #427

Open mabunixda opened 2 years ago

mabunixda commented 2 years ago

Description

When deploying a job template with Levant, the namespace argument is ignored and the job is always placed in the default namespace. When the job template is rendered first and the output applied to Nomad directly, the namespace is handled as expected.

$ levant plan -var-file levant.yaml --log-level=DEBUG automation/zigbee2mqtt.hcl
2021-09-28T09:36:07Z |DEBU| template/render: variable file extension .yaml detected
2021-09-28T09:36:07Z |DEBU| template/render: no command line variables passed
2021-09-28T09:36:07Z |DEBU| template/funcs: renaming "replace" function to "levantReplace"
2021-09-28T09:36:07Z |DEBU| template/funcs: renaming "add" function to "levantAdd"
2021-09-28T09:36:07Z |DEBU| template/funcs: renaming "env" function to "levantEnv"
2021-09-28T09:36:07Z |INFO| helper/variable: using variable with key nomad and value map[datacenter:home] from file
2021-09-28T09:36:07Z |INFO| helper/variable: using variable with key common and value map[dirs:map[shared:/srv/] env:map[pgid:1000 puid:1000]] from file
2021-09-28T09:36:07Z |DEBU| levant/plan: triggering Nomad plan
2021-09-28T09:36:07Z |INFO| levant/plan: job is a new addition to the cluster

Levant would create a new job in the cluster. In contrast, dumping the same configuration to a file and running nomad plan on it shows no changes, as expected:

$ levant render -var-file levant.yaml --log-level=FATAL automation/zigbee2mqtt.hcl > tmp.hcl
$ nomad plan tmp.hcl 
Job: "zigbee2mqtt"
Task Group: "homeautomation" (1 ignore)
  Task: "zigbee2mqtt"

Scheduler dry-run:
- All tasks successfully allocated.

I added some debug output within the plan method:

// Debug: confirm the namespace is set on the parsed job before planning.
log.Info().Msg(fmt.Sprintf("Job namespace: %v", *lp.config.Template.Job.Namespace))
// Run a plan using the rendered job.
resp, _, err := lp.nomad.Jobs().Plan(lp.config.Template.Job, true, nil)
log.Info().Msg(fmt.Sprintf("%#v", resp.Diff))

The output shows that the namespace is set in the job object that is passed to the Nomad plan method:

2021-09-28T10:06:52Z |DEBU| levant/plan: triggering Nomad plan
2021-09-28T10:06:52Z |INFO| Job namespace: automation
2021-09-28T10:06:52Z |INFO| &api.JobDiff{Type:"Added", ID:"zigbee2mqtt", Fields:[]*api.FieldDiff{(*api.FieldDiff)(0xc0000b4ea0), (*api.FieldDiff)(0xc0000b4f00), (*api.FieldDiff)(0xc0000b4f60), (*api.FieldDiff)(0xc0000b4fc0), (*api.FieldDiff)(0xc0000b5020), (*api.FieldDiff)(0xc0000b5080), (*api.FieldDiff)(0xc0000b50e0), (*api.FieldDiff)(0xc0000b5140)}, Objects:[]*api.ObjectDiff{(*api.ObjectDiff)(0xc0000c9450), (*api.ObjectDiff)(0xc0000c94a0)}, TaskGroups:[]*api.TaskGroupDiff{(*api.TaskGroupDiff)(0xc0001dd3b0)}}
2021-09-28T10:06:52Z |INFO| levant/plan: job is a new addition to the cluster

After some research in the Nomad project, the following workaround came up. It works, but I would expect that when the namespace parameter is set in the job specification file, it is also applied to Nomad:

$ NOMAD_NAMESPACE=automation levant plan -var-file levant.yaml --log-level=DEBUG automation/zigbee2mqtt.hcl
2021-09-28T10:32:25Z |DEBU| template/render: variable file extension .yaml detected
2021-09-28T10:32:25Z |DEBU| template/render: no command line variables passed
2021-09-28T10:32:25Z |DEBU| template/funcs: renaming "env" function to "levantEnv"
2021-09-28T10:32:25Z |DEBU| template/funcs: renaming "add" function to "levantAdd"
2021-09-28T10:32:25Z |DEBU| template/funcs: renaming "replace" function to "levantReplace"
2021-09-28T10:32:25Z |INFO| helper/variable: using variable with key app and value map[guacamole:map[db:map[hostname:172.16.20.206 name:guac password:guac user:guac] traefik:map[domain:guacamole.home.nitram.at] volumes:map[config:/srv/guacamole/]]] from file
2021-09-28T10:32:25Z |INFO| helper/variable: using variable with key nomad and value map[datacenter:home] from file
2021-09-28T10:32:25Z |INFO| helper/variable: using variable with key common and value map[dirs:map[shared:/srv/] env:map[pgid:1000 puid:1000]] from file
2021-09-28T10:32:25Z |DEBU| levant/plan: triggering Nomad plan
2021-09-28T10:32:25Z |INFO| levant/plan: no changes detected for job
2021-09-28T10:32:25Z |INFO| levant/plan: no changes found in job

Relevant Nomad job specification file

job "zigbee2mqtt" {
  datacenters = ["[[ .nomad.datacenter ]]"]
  type        = "service"
  namespace   = "automation"

  constraint {
    attribute = "${node.unique.name}"
    value     = "zigbee"
  }

  update {
    max_parallel = 1
    min_healthy_time = "10s"
    healthy_deadline = "3m"
    auto_revert = false
    canary = 0
  }

  group "homeautomation" {
    count = 1
    task "zigbee2mqtt" {
      driver = "docker"
      config {
        image = "docker.io/koenkk/zigbee2mqtt"
        volumes = [
          "[[ .common.dirs.shared ]]zigbee2mqtt:/app/data",
          "/run/udev:/run/udev",
        ]
        devices = [
            {
                host_path = "/dev/ttyACM0"
                container_path = "/dev/ttyACM0"
            }
        ]
      }
      service {
        tags = ["logging"]
        check {
          type = "script"
          name = "zigbee_nodejs"
          command = "pgrep"
          args = ["node"]
          interval  = "60s"
          timeout   = "5s"

          check_restart {
              limit = 3
              grace = "90s"
              ignore_warnings = false
          }
        }
      }
      resources {
        cpu    = 128
        memory = 192
      }
    }
  }
}

Output of levant version:

$ levant version
Levant v0.3.0

Output of consul version:

$ consul version
Consul v1.10.1
Revision db839f18b
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Output of nomad version:

$ nomad version
Nomad v1.1.4 (acd3d7889328ad1df2895eb714e2cbe3dd9c6d82)

Additional environment details:

tmp.hcl used on nomad plan:

$ cat tmp.hcl 
job "zigbee2mqtt" {
  datacenters = ["home"]
  type        = "service"
  namespace   = "automation"

  constraint {
    attribute = "${node.unique.name}"
    value     = "zigbee"
  }

  update {
    max_parallel = 1
    min_healthy_time = "10s"
    healthy_deadline = "3m"
    auto_revert = false
    canary = 0
  }

  group "homeautomation" {
    count = 1
    task "zigbee2mqtt" {
      driver = "docker"
      config {
        image = "docker.io/koenkk/zigbee2mqtt"
        volumes = [
          "/srv/zigbee2mqtt:/app/data",
          "/run/udev:/run/udev",
        ]
        devices = [
            {
                host_path = "/dev/ttyACM0"
                container_path = "/dev/ttyACM0"
            }
        ]
      }
      service {
        tags = ["logging"]
        check {
          type = "script"
          name = "zigbee_nodejs"
          command = "pgrep"
          args = ["node"]
          interval  = "60s"
          timeout   = "5s"

          check_restart {
              limit = 3
              grace = "90s"
              ignore_warnings = false
          }
        }
      }
      resources {
        cpu    = 128
        memory = 192
      }
    }
  }
}
Fuco1 commented 1 year ago

Any ETA on this? We just got bitten by it: we deploy with two different methods (don't ask), and the Levant pipelines have been adding the task to the default namespace. That resulted in two back-end services which Traefik was picking up and serving "randomly", so two users were seeing two different versions :O

mabunixda commented 1 year ago

I have no clue what's going on within the Nomad ecosystem ... all the integrations by HashiCorp (Levant, Nomad Pack) are not moving in any direction :(

When will there be a new version, @hc-github-team-nomad-ecosystem? Or at least some kind of roadmap?

mikenomitch commented 1 year ago

Hey @mabunixda, sorry about this. To be totally honest, we stretched ourselves thin over the last couple of releases and ended up having to push Pack work out further than we would have liked.

We plan to eventually move Levant users over to Pack and provide guides (and maybe tooling?) to make this easy, but since we had to push out Pack work, things have been sitting in an admittedly awkward state for too long.

The good news is, we're in a better spot capacity-wise and have people working on Pack now. The high-level plan is to get Pack into "beta", provide a good path to move Levant users over, and then, once we feel we're providing a strictly better experience on Pack, call it GA. The Pack beta plan is on the new Nomad Public Roadmap.

mabunixda commented 1 year ago

So Levant is being deprecated, and the way to deploy workloads on Nomad will be Nomad Pack?

mikenomitch commented 1 year ago

That is the plan eventually, but still a work in progress.