evandam commented 2 years ago

Proposal

Since the Nomad CLI supports monitoring deployments with `nomad job run`, it would be great to see similar behavior with `nomad job dispatch`.

Ideally it would dispatch the job, poll the dispatched job while it is running, and then exit zero or non-zero based on the job's exit status (idempotency tokens may need some consideration).

Use-cases

Use case is mostly around a CI environment where we use dispatched jobs to prepare an environment, seed a database, etc.

Being able to shell out to the Nomad CLI rather than custom scripting with the Nomad API would be great.

Attempted Solutions

Current solution is pretty clunky, unless there's a better way to do it.

1. Dispatch a parameterized job.
2. Get the dispatched job ID from the previous command.
3. Poll the Nomad API until the dispatched job is dead.
4. Check the latest alloc of the dispatched job to see if it completed successfully or failed.
   - `/v1/job/<dispatched_job_id>` only reports "dead", with no indication of success or failure. This gets tricky when the dispatched job has rescheduling, lost nodes, etc., so the logic gets a little dicey.
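
The steps above can be sketched roughly as follows. This is a minimal sketch, not an official client: `fetch_json`, `dispatch_succeeded`, and `wait_for_dispatch` are hypothetical names, the address assumes a local agent on the default port with no ACL token, and "the latest allocation decides" is the simplistic rule from step 4, which is exactly where rescheduling and lost nodes make the logic dicey.

```python
import json
import time
import urllib.parse
import urllib.request

NOMAD_ADDR = "http://127.0.0.1:4646"  # assumption: local agent, no ACL token


def fetch_json(path):
    """GET a Nomad HTTP API path and decode the JSON response."""
    with urllib.request.urlopen(NOMAD_ADDR + path) as resp:
        return json.load(resp)


def dispatch_succeeded(allocs):
    """Judge the job by its most recently created allocation.

    Simplistic on purpose: earlier failed or lost allocations are ignored.
    """
    if not allocs:
        return False
    latest = max(allocs, key=lambda a: a["CreateIndex"])
    return latest["ClientStatus"] == "complete"


def wait_for_dispatch(job_id, fetch=fetch_json, interval=2.0):
    """Poll until the dispatched job is dead, then return a process exit code."""
    # Dispatched job IDs contain "/", so they are URL-encoded in the path.
    quoted = urllib.parse.quote(job_id, safe="")
    while fetch(f"/v1/job/{quoted}")["Status"] != "dead":
        time.sleep(interval)
    allocs = fetch(f"/v1/job/{quoted}/allocations")
    return 0 if dispatch_succeeded(allocs) else 1
```

A CI script could pass the job ID printed by `nomad job dispatch` to `wait_for_dispatch` and hand the result to `sys.exit`. The `fetch` parameter exists only so the polling logic can be exercised without a running cluster.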

tgross commented 2 years ago

Thanks for opening this issue @evandam! That makes sense to me, so we'll look into getting it on the roadmap.

lgfa29 commented 1 year ago

#3753 also mentions tailing log output, which could be a useful feature as well.

lgfa29 commented 1 year ago

#16898 also describes a use case for batch jobs with the `nomad job run` command.

tgross commented 3 months ago

Adding some context for an internal discussion we're having around this issue. Currently we have monitoring on the `nomad job dispatch` command, which monitors the creation of the evaluation and allocations. This looks something like this:

jobspec

```hcl
job "example" {
  type = "batch"

  parameterized {
    payload = "required"
  }

  group "group" {
    task "task" {
      driver = "docker"

      config {
        image   = "busybox:1"
        command = "/bin/sh"
        args    = ["-c", "cat local/payload.txt; sleep 300"]
      }

      dispatch_payload {
        file = "payload.txt"
      }

      resources {
        cpu    = 64
        memory = 128
      }
    }
  }
}
```

$ echo 'foo' | nomad job dispatch example -
Dispatched Job ID = example/dispatch-1723223743-25a6aca7
Evaluation ID     = 00045600

==> 2024-08-09T13:15:43-04:00: Monitoring evaluation "00045600"
    2024-08-09T13:15:43-04:00: Evaluation triggered by job "example/dispatch-1723223743-25a6aca7"
    2024-08-09T13:15:44-04:00: Allocation "647f8c26" created: node "9308bb31", group "group"
    2024-08-09T13:15:44-04:00: Evaluation status changed: "pending" -> "complete"
==> 2024-08-09T13:15:44-04:00: Evaluation "00045600" finished with status "complete"

So effectively here the command hits the Dispatch Job API and then polls the Read Evaluation API (typically only once) to get the allocations. This is less complicated than monitoring a service deployment because we don't have to hit the Deployment API.
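
For readers who want to see that two-call sequence outside the CLI, here is a rough sketch against the HTTP API. It is not how the CLI is implemented (the CLI is Go); `nomad_call`, `dispatch_request`, and `dispatch_and_wait_for_eval` are hypothetical names, and the address assumes a local agent on the default port with no ACL token. It also only waits while the evaluation is "pending", matching the transition shown in the output above.

```python
import base64
import json
import time
import urllib.parse
import urllib.request

NOMAD_ADDR = "http://127.0.0.1:4646"  # assumption: local agent, no ACL token


def nomad_call(method, path, body=None):
    """Send a request to the Nomad HTTP API and decode the JSON response."""
    data = body.encode() if body is not None else None
    req = urllib.request.Request(NOMAD_ADDR + path, data=data, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def dispatch_request(job_id, payload):
    """Build (path, body) for the Dispatch Job API; the payload is base64-encoded."""
    path = f"/v1/job/{urllib.parse.quote(job_id, safe='')}/dispatch"
    body = json.dumps({"Payload": base64.b64encode(payload).decode("ascii")})
    return path, body


def dispatch_and_wait_for_eval(job_id, payload, call=nomad_call, interval=1.0):
    """Dispatch the job, then poll the evaluation until it leaves "pending"."""
    path, body = dispatch_request(job_id, payload)
    resp = call("POST", path, body)
    eval_path = f"/v1/evaluation/{resp['EvalID']}"
    while True:
        status = call("GET", eval_path)["Status"]
        if status != "pending":
            return resp["DispatchedJobID"], status
        time.sleep(interval)
```

The returned dispatched job ID is what any further monitoring would key off, since the evaluation finishing only says the allocations were placed, not how the tasks ended.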

What this issue would hypothetically add is hitting the List Job Allocations API and then presenting the task states in some sensible way. As noted in https://github.com/hashicorp/nomad/issues/11601#issuecomment-1416432567 we could also hit the Stream Logs API but presenting that in a sensible way is a little harder given you may have multiple allocations.
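
As one possible reading of "presenting the task states in some sensible way", the sketch below turns a List Job Allocations response into one line per task. The function name `summarize_task_states` and the output format are assumptions made for illustration, not a proposed CLI design; the fields it reads (`ID`, `TaskGroup`, `CreateIndex`, `TaskStates`, and each task's `State` and `Failed`) are the ones the allocation list returns.

```python
def summarize_task_states(allocs):
    """Return one line per task: short alloc ID, group/task, state, failed flag."""
    lines = []
    # Oldest allocation first, so rescheduled attempts read in order.
    for alloc in sorted(allocs, key=lambda a: a["CreateIndex"]):
        # TaskStates can be null before the client has reported anything.
        task_states = alloc.get("TaskStates") or {}
        for name, state in sorted(task_states.items()):
            failed = str(bool(state.get("Failed"))).lower()
            lines.append(
                f'{alloc["ID"][:8]} {alloc["TaskGroup"]}/{name} '
                f'state={state["State"]} failed={failed}'
            )
    return lines
```

Even this simple view shows the difficulty mentioned above: with several allocations the output grows one block per allocation, and interleaving logs from each would be harder still.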

hashicorp / nomad

Nomad CLI Monitor Dispatched Jobs #11601
