hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Inconsistent job placement response with constraints on system jobs #19413

Open mr-karan opened 11 months ago

mr-karan commented 11 months ago

Nomad version

1.6.1

Operating system and Environment details

AWS EC2, Ubuntu 22.04

Issue

When deploying a system job with a specific constraint filter, the job is placed correctly on all targeted nodes. However, the CLI and UI indicate a placement failure. This behavior is inconsistent and misleading, as the job is actually running as expected.

(Screenshots: CLI output and the web UI, both reporting a placement failure.)

This issue leads to confusion and potential misinterpretation of the job's deployment status. It also affects automation workflows that rely on CLI exit codes.

Reproduction steps

  1. Define a system type job with a constraint filter.
  2. Deploy the job.
  3. Observe that, despite successful placement on all matching nodes, the CLI exits with a non-zero code and the UI shows a placement failure (see the shell sketch after this list).
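
A minimal shell sketch of these steps, assuming the job file shown under "Job file" below is saved as my-app.nomad.hcl (a hypothetical filename):

# Deploy the system job. The command exits non-zero because the
# scheduler reports a placement failure, even though the job lands
# on every node matching the constraint.
nomad job run my-app.nomad.hcl
echo "exit code: $?"

# Confirm the allocations are actually running on the filtered nodes.
nomad job status my-app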

Expected Result

A system job with a constraint filter should be placed on all nodes that meet the constraint criteria, without any errors or misleading failure messages.

Actual Result

The job is placed on the appropriate nodes, but the CLI and UI report a placement failure, which is inaccurate.

Job file (if appropriate)

Simplified example:

job "my-app" {
  type = "system"

  group "my-app" {
    count = 1

    task "my-app" {
      driver = "exec"

      artifact {
        source = "s3::https://s3-ap-south-1.amazonaws.com/app/path/binary.tar.gz"
      }

      # Run the program.
      config {
        command = "$${NOMAD_TASK_DIR}/app.bin"
      }

      constraint {
        attribute = "${meta.ec2_nomad_client}"
        operator  = "="
        value     = "my-app"
      }
    }
  }
}

The meta.ec2_nomad_client attribute exists on all client nodes, but with a different value per node. We use this tag to determine which job gets placed on which node.
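
For reference, a minimal sketch of how such a meta value is typically set in the Nomad client agent configuration; the key name comes from the job above, and the path and value shown are illustrative:

# Nomad client agent configuration (e.g. /etc/nomad.d/client.hcl)
client {
  enabled = true

  # Static node metadata; the system job's constraint matches on this key.
  meta {
    ec2_nomad_client = "my-app"
  }
}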

tgross commented 10 months ago

Hi @mr-karan! This has been a persistent and hard-to-track-down issue for a while. https://github.com/hashicorp/nomad/issues/13455 is related, and https://github.com/hashicorp/nomad/issues/12366, https://github.com/hashicorp/nomad/issues/12748, and https://github.com/hashicorp/nomad/issues/12016 may be as well.

I'm not going to mark this as a duplicate but I'll try to nudge fixing this one along in roadmapping.