hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Display task events on console while job run #18271

Open EugenKon opened 1 year ago

EugenKon commented 1 year ago

Nomad version

$ nomad --version
Nomad v1.6.1
BuildDate 2023-07-21T13:49:42Z
Revision 515895c7690cdc72278018dc5dc58aca41204ccc

Operating system and Environment details

$ uname -a
Darwin Eugens-MacBook-Pro.local 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:22 PDT 2023; root:xnu-8796.121.3~7/RELEASE_X86_64 x86_64 i386 Darwin

Issue

[Screenshot attached in the original issue]

Reproduction steps

Configure a task with an incorrect auth block, like this:

...
    task "redis" {
      driver = "docker"

      config {
        # image = "<private-registry-url>/<image-name>:<tag>"
        image = "redis:7.2"
        ports = ["db"]

        auth {
          server_address = ""
          username = "dockerhub_user"
          password = "dockerhub_password"
        }
      }

Expected Result

It would be nice to see the errors in the console, the same way the UI shows them.

Actual Result

It is not clear what happened:

$ nomad job run -check-index 0 wi-redis.nomad.hcl
==> 2023-08-21T15:39:03-04:00: Monitoring evaluation "408d706e"
    2023-08-21T15:39:03-04:00: Evaluation triggered by job "nomad-redis"
    2023-08-21T15:39:03-04:00: Evaluation within deployment: "5d9c5a5c"
    2023-08-21T15:39:03-04:00: Allocation "7d3598d6" created: node "4a37d9e8", group "cache"
    2023-08-21T15:39:03-04:00: Evaluation status changed: "pending" -> "complete"
==> 2023-08-21T15:39:03-04:00: Evaluation "408d706e" finished with status "complete"
==> 2023-08-21T15:39:03-04:00: Monitoring deployment "5d9c5a5c"
  ⠦ Deployment "5d9c5a5c" failed

    2023-08-21T15:44:03-04:00
    ID          = 5d9c5a5c
    Job ID      = nomad-redis
    Job Version = 0
    Status      = failed
    Description = Failed due to progress deadline - no stable job version to auto revert to

    Deployed
    Task Group  Auto Revert  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    cache       true         1        3       0        3          2023-08-21T19:44:03Z

Job file (if appropriate)

job "nomad-redis" {
  datacenters = ["dc1"]
  type = "service"

  group "cache" {
    count = 1

    network {
      # mode = "host"
      port "db" {
        to = 6379
      }
    }

    update {
      max_parallel = 1
      min_healthy_time  = "10s"
      healthy_deadline  = "2m"
      progress_deadline = "5m"
      auto_revert = true
      auto_promote = true
      canary = 1
    }

    task "redis" {
      driver = "docker"

      # https://developer.hashicorp.com/nomad/docs/drivers/docker
      config {
        # image = "<private-registry-url>/<image-name>:<tag>"
        image = "redis:7.2"
        ports = ["db"]
        # network_mode = "host"

        # https://developer.hashicorp.com/nomad/docs/drivers/docker#authentication
        # https://developer.hashicorp.com/nomad/docs/drivers/docker#client-requirements
        auth {
          server_address = ""
          username = "dockerhub_user"
          password = "dockerhub_password"
        }
      }

      service {
        tags = ["redis-service"]
        name = "redis"
        port = "db"

        provider = "consul"
        # connect = ??

        #name (string: "<job>-<taskgroup>-<task>")

        meta {
          meta = "Some meta for my service"
        }

        check {
          name     = "host-redis-check"
          type     = "tcp"
          port     = "db"
          interval = "10s"
          timeout  = "2s"
        }

        check {
          name     = "app_health"
          type     = "http"
          path     = "/health"
          interval = "20s"
          timeout  = "5s"

          check_restart {
            limit = 3
            grace = "90s"
            ignore_warnings = false
          }
        }
      }
    }
  }
}

Logs

See screenshot above

lgfa29 commented 1 year ago

Thanks for the suggestion @EugenKon.

Those entries are called task events and they do hold very valuable information that would be helpful for the CLI to display back to users.

I have placed this suggestion for further roadmapping.
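
Until the CLI prints them during `nomad job run`, task events can still be read from the allocation itself: `nomad alloc status <alloc_id>` lists recent events per task, and the HTTP API returns them under `TaskStates` in the response to `GET /v1/allocation/<alloc_id>`. Below is a minimal sketch of turning that payload into console-style lines. It assumes the allocation JSON has already been fetched and decoded into a dict; the function name `format_task_events`, the output layout, and the sample payload (including the event message text) are hypothetical and only illustrate the shape of the data.

```python
from datetime import datetime, timezone


def format_task_events(alloc: dict) -> list[str]:
    """Render the task events of one allocation as console-style lines."""
    lines = []
    # TaskStates maps task name -> task state; it may be null before a task starts.
    for task, state in sorted((alloc.get("TaskStates") or {}).items()):
        for event in state.get("Events") or []:
            # Event times are Unix timestamps in nanoseconds; integer division
            # avoids float rounding on values this large.
            seconds = event["Time"] // 1_000_000_000
            when = datetime.fromtimestamp(seconds, tz=timezone.utc)
            lines.append(
                f'{when:%Y-%m-%dT%H:%M:%SZ}: Task "{task}" '
                f'{event["Type"]}: {event.get("DisplayMessage", "")}'
            )
    return lines


# Hypothetical payload, trimmed to the fields used above.
sample_alloc = {
    "ID": "7d3598d6",
    "TaskStates": {
        "redis": {
            "State": "dead",
            "Failed": True,
            "Events": [
                {
                    "Type": "Received",
                    "Time": 1692646743000000000,
                    "DisplayMessage": "Task received by client",
                },
                {
                    "Type": "Driver Failure",
                    "Time": 1692646745000000000,
                    "DisplayMessage": "Failed to pull image (illustrative message)",
                },
            ],
        }
    },
}

for line in format_task_events(sample_alloc):
    print(line)
```

Keeping the formatter separate from the HTTP call means the same function can be fed from a polling loop or from a saved API response when debugging a failed deployment like the one above.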