hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Troubleshooting blocked evaluation #19827

Closed: suikast42 closed this issue 8 months ago

suikast42 commented 8 months ago

I had a similar issue in the past but I don't understand why my evaluation is blocked.

See https://github.com/hashicorp/nomad/issues/19446

Now I can reproduce the issue.

I have deployed an MSSQL DB with a static port mapping. Then I accidentally deployed a second job with the same static port mapping, with only one worker node available.

It's not a bug that Nomad denies the allocation. But the reason why the allocation is blocked is not listed anywhere.

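To illustrate the conflict, the relevant part of both job specs looks roughly like this (the port number here is illustrative, not my actual MSSQL job). Since the port is static, only one allocation can ever be placed per node:

network {
  port "db" {
    # same static host port requested by both jobs,
    # so the single worker node can only satisfy one of them
    static = 1433
  }
}

Below are the two evaluations Nomad created for the second job (assan_cdc):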

{
  "priority": 50,
  "type": "service",
  "triggeredBy": "job-register",
  "status": "complete",
  "statusDescription": null,
  "failedTGAllocs": [
    {
      "Name": "debezium_server_assan",
      "CoalescedFailures": 0,
      "NodesEvaluated": 1,
      "NodesExhausted": 0,
      "NodesAvailable": {
        "nomadder1": 1
      },
      "ClassFiltered": null,
      "ConstraintFiltered": null,
      "ClassExhausted": null,
      "DimensionExhausted": null,
      "QuotaExhausted": null,
      "Scores": null
    }
  ],
  "previousEval": null,
  "nextEval": null,
  "blockedEval": "4e0cfef8-208a-16c3-f648-f0bdb775b5f9",
  "modifyIndex": 43323,
  "modifyTime": "2024-01-26T08:29:33.664Z",
  "createIndex": 43320,
  "createTime": "2024-01-26T08:29:33.653Z",
  "waitUntil": null,
  "namespace": "default",
  "plainJobId": "assan_cdc",
  "relatedEvals": [
    "4e0cfef8-208a-16c3-f648-f0bdb775b5f9"
  ],
  "job": "[\"assan_cdc\",\"default\"]",
  "node": null
}
{
  "priority": 50,
  "type": "service",
  "triggeredBy": "queued-allocs",
  "status": "blocked",
  "statusDescription": "created to place remaining allocations",
  "failedTGAllocs": [
    {
      "Name": "debezium_server_assan",
      "CoalescedFailures": 0,
      "NodesEvaluated": 1,
      "NodesExhausted": 0,
      "NodesAvailable": {
        "nomadder1": 1
      },
      "ClassFiltered": null,
      "ConstraintFiltered": null,
      "ClassExhausted": null,
      "DimensionExhausted": null,
      "QuotaExhausted": null,
      "Scores": null
    }
  ],
  "previousEval": "2ba2855b-629d-e743-eff3-71fb17b479b4",
  "nextEval": null,
  "blockedEval": null,
  "modifyIndex": 43321,
  "modifyTime": "2024-01-26T08:29:33.657Z",
  "createIndex": 43321,
  "createTime": "2024-01-26T08:29:33.657Z",
  "waitUntil": null,
  "namespace": "default",
  "plainJobId": "assan_cdc",
  "relatedEvals": [
    "2ba2855b-629d-e743-eff3-71fb17b479b4"
  ],
  "job": "[\"assan_cdc\",\"default\"]",
  "node": null
}

nomad job status

ID            = assan_cdc
Name          = assan_cdc
Submit Date   = 2024-01-26T08:29:33Z
Type          = service
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = pending
Periodic      = false
Parameterized = false

Summary
Task Group             Queued  Starting  Running  Failed  Complete  Lost  Unknown
debezium_server_assan  1       0         0        0       0         0     0

Placement Failure
Task Group "debezium_server_assan":

Latest Deployment
ID          = 292a527f
Status      = running
Description = Deployment is running

Deployed
Task Group             Desired  Placed  Healthy  Unhealthy  Progress Deadline
debezium_server_assan  1        0       0        0          N/A

Allocations
No allocations placed

deployment status 292a527f

ID          = 292a527f
Job ID      = assan_cdc
Job Version = 0
Status      = running
Description = Deployment is running

Deployed
Task Group             Desired  Placed  Healthy  Unhealthy  Progress Deadline
debezium_server_assan  1        0       0        0          N/A

Information like 'not enough CPU/memory' or 'port conflict and no more nodes available' would be very handy for troubleshooting.
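For reference, this is how I am inspecting the blocked placement today (eval ID taken from above; the curl call is just a sketch and depends on your address/TLS/ACL setup), and in my case none of it shows the actual reason:

nomad eval list
nomad eval status -verbose 4e0cfef8-208a-16c3-f648-f0bdb775b5f9
# or via the HTTP API:
curl -sk "$NOMAD_ADDR/v1/evaluation/4e0cfef8-208a-16c3-f648-f0bdb775b5f9"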

lgfa29 commented 8 months ago

Hi @suikast42 👋

Which version of Nomad are you running? I just tested on Nomad 1.7.3 and I do get the expected results on port collision:

(screenshots showing the expected placement failure on port collision)
suikast42 commented 8 months ago

This is strange:

nomad --version
Nomad v1.7.3
BuildDate 2024-01-15T16:55:40Z
Revision 60ee328f97d19d2d2d9761251b895b06d82eb1a1

suikast42 commented 8 months ago

OK, I tried it with a simple deployment:

job "whoami" {

  group "whoami" {
    count = 1

    network {
      mode = "bridge"
      port "web" {
        to=8080
        static = 8080
      }
    }

    service {
      name = "${NOMAD_NAMESPACE}-${NOMAD_GROUP_NAME}"
      port = "web"

      tags = [
        "traefik.enable=true",
        "traefik.http.routers.${NOMAD_GROUP_NAME}-${NOMAD_ALLOC_ID}.rule=Host(`${NOMAD_NAMESPACE}.${NOMAD_GROUP_NAME}.cloud.private`)",
        "traefik.http.routers.${NOMAD_GROUP_NAME}-${NOMAD_ALLOC_ID}.tls=true",
      ]

      check {
        type     = "http"
        path     = "/health"
        port     = "web"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "whoami" {
      driver = "docker"
#      driver = "containerd-driver"
      config {
        image = "traefik/whoami"
        ports = ["web"]
        args  = ["--port", "${NOMAD_PORT_web}"]
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

The second time, I deployed the same job under the name whoami2 and left the rest of the definition unchanged.
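In other words, the only difference is the job name (sketch):

job "whoami2" {
  # body identical to the "whoami" spec above,
  # including the static = 8080 port mapping
}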

The result:


 nomad job status whoami2
ID            = whoami2
Name          = whoami2
Submit Date   = 2024-02-08T09:28:05Z
Type          = service
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = pending
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
whoami      1       0         0        0       0         0     0

Placement Failure
Task Group "whoami":

Latest Deployment
ID          = 70630165
Status      = running
Description = Deployment is running

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
whoami      1        0       0        0          N/A

Allocations
No allocations placed
 nomad eval  list
ID        Priority  Triggered By        Job ID         Namespace  Node ID   Status    Placement Failures
96fd4b1b  50        queued-allocs       whoami2        default    <none>    blocked   N/A - In Progress
ca193e9c  50        job-register        whoami2        default    <none>    complete  true
nomad eval status 96fd4b1b
ID                 = 96fd4b1b
Create Time        = 4m45s ago
Modify Time        = 4m45s ago
Status             = blocked
Status Description = created to place remaining allocations
Type               = service
TriggeredBy        = queued-allocs
Job ID             = whoami2
Namespace          = default
Priority           = 50
Placement Failures = N/A - In Progress

Failed Placements
Task Group "whoami" (failed to place 1 allocation):
nomad eval status ca193e9c
ID                 = ca193e9c
Create Time        = 5m59s ago
Modify Time        = 5m59s ago
Status             = complete
Status Description = complete
Type               = service
TriggeredBy        = job-register
Job ID             = whoami2
Namespace          = default
Priority           = 50
Placement Failures = true

Failed Placements
Task Group "whoami" (failed to place 1 allocation):

Evaluation "96fd4b1b" waiting for additional capacity to place remainder
lgfa29 commented 8 months ago

Hmm... sorry, I still can't reproduce the problem 🤔

How many clients do you have? And could you share the full output from when you run nomad job run for the second time?
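For example, something like this (the file name is just a placeholder for wherever you saved the job spec):

nomad job run -verbose whoami2.nomad.hcl 2>&1 | tee whoami2-run.log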

suikast42 commented 8 months ago

I have one worker and one master.

2024-02-09 12:11:08.264 [nomad.service 💻 master-01] [🐞] [] nomad.job.service_sched.binpack: preemption not possible : eval_id=25b134d9-d664-2128-a0d9-6bf682a66539 job_id=whoami2 namespace=default network_resource="&{bridge     0 <nil> [{web 42000 42000 default}] []}"
2024-02-09 12:11:08.264 [nomad.service 💻 master-01] [🐞] [] nomad.job.service_sched: failed to place all allocations, blocked eval created: eval_id=25b134d9-d664-2128-a0d9-6bf682a66539 job_id=whoami2 namespace=default blocked_eval_id=fb84782d-35c5-d3e6-2fb3-85404e7c1a98
2024-02-09 12:11:08.264 [nomad.service 💻 master-01] [🐞] [] nomad.job.service_sched: reconciled current state with desired state: eval_id=25b134d9-d664-2128-a0d9-6bf682a66539 job_id=whoami2 namespace=default
2024-02-09 12:11:08.264 [nomad.service 💻 master-01] [🐞] [] nomad.job.service_sched: setting eval status: eval_id=25b134d9-d664-2128-a0d9-6bf682a66539 job_id=whoami2 namespace=default status=complete
2024-02-09 12:11:08.264 [nomad.service 💻 master-01] [✅] []   | Desired Changes for "whoami2": (place 1) (inplace 0) (destructive 0) (stop 0) (migrate 0) (ignore 0) (canary 0)
2024-02-09 12:11:08.265 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=POST path=/v1/job/whoami2/plan duration=2.186051ms
2024-02-09 12:11:12.063 [nomad.service 💻 master-01] [🐞] [] worker.service_sched.binpack: preemption not possible : eval_id=c4362777-c47e-a814-3dd3-9031a69144d8 job_id=whoami2 namespace=default worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 network_resource="&{bridge     0 <nil> [{web 42000 42000 default}] []}"
2024-02-09 12:11:12.063 [nomad.service 💻 master-01] [🐞] [] worker.service_sched: reconciled current state with desired state: eval_id=c4362777-c47e-a814-3dd3-9031a69144d8 job_id=whoami2 namespace=default worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1
2024-02-09 12:11:12.063 [nomad.service 💻 master-01] [🐞] [] worker: dequeued evaluation: worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 eval_id=c4362777-c47e-a814-3dd3-9031a69144d8 type=service namespace=default job_id=whoami2 node_id="" triggered_by=job-register
2024-02-09 12:11:12.064 [nomad.service 💻 master-01] [✅] []   | Desired Changes for "whoami2": (place 1) (inplace 0) (destructive 0) (stop 0) (migrate 0) (ignore 0) (canary 0)
2024-02-09 12:11:12.068 [nomad.service 💻 master-01] [🐞] [] worker.service_sched: failed to place all allocations, blocked eval created: eval_id=c4362777-c47e-a814-3dd3-9031a69144d8 job_id=whoami2 namespace=default worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 blocked_eval_id=84b2b984-606f-5e1b-7ac9-0cfc0a88debe
2024-02-09 12:11:12.068 [nomad.service 💻 master-01] [🐞] [] worker: created evaluation: worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 eval="<Eval \"84b2b984-606f-5e1b-7ac9-0cfc0a88debe\" JobID: \"whoami2\" Namespace: \"default\">" waitUntil="\"0001-01-01 00:00:00 +0000 UTC\""
2024-02-09 12:11:12.073 [nomad.service 💻 master-01] [🐞] [] worker.service_sched: setting eval status: eval_id=c4362777-c47e-a814-3dd3-9031a69144d8 job_id=whoami2 namespace=default worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 status=complete
2024-02-09 12:11:12.078 [nomad.service 💻 master-01] [🐞] [] worker: ack evaluation: worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 eval_id=c4362777-c47e-a814-3dd3-9031a69144d8 type=service namespace=default job_id=whoami2 node_id="" triggered_by=job-register
2024-02-09 12:11:12.078 [nomad.service 💻 master-01] [🐞] [] worker: updated evaluation: worker_id=3ba79f80-9ba2-cdfe-ba09-9ce8fc0955e1 eval="<Eval \"c4362777-c47e-a814-3dd3-9031a69144d8\" JobID: \"whoami2\" Namespace: \"default\">"
2024-02-09 12:11:12.090 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2 duration="475.382µs"
2024-02-09 12:11:12.106 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/allocations duration="289.615µs"
2024-02-09 12:11:12.110 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/evaluations duration="300.334µs"
2024-02-09 12:11:12.187 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/deployment?index=1 duration="315.753µs"
2024-02-09 12:11:12.188 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/summary?index=1 duration="337.219µs"
2024-02-09 12:11:12.190 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/deployment duration="420.686µs"
2024-02-09 12:11:12.190 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path="/v1/vars?prefix=nomad%2Fjobs%2Fwhoami2" duration="310.217µs"
2024-02-09 12:11:12.197 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/deployment duration="332.207µs"
2024-02-09 12:11:12.207 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2 duration="364.175µs"
2024-02-09 12:11:14.125 [nomad.service 💻 master-01] [🐞] [] http: request complete: method=GET path=/v1/job/whoami2/deployment?index=58495 duration="296.495µs"
suikast42 commented 8 months ago

I tried it with both bridge and host network mode. Same result in both cases.
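For completeness, the host-mode variant only changes the network block, roughly like this (a sketch of what I tested, same static port as above):

network {
  mode = "host"
  port "web" {
    static = 8080
  }
}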

suikast42 commented 8 months ago

Here are my Nomad and Consul configs. Maybe that helps?

Consul server

datacenter = "nomadder1"
data_dir =  "/opt/services/core/consul/data"
log_level = "INFO"
node_name = "master-01"
server = true
bind_addr = "0.0.0.0"
advertise_addr = "172.42.1.10"
client_addr = "0.0.0.0"
encrypt = "G1CHAD7wwu0tU28BlKkirSahTJ/Tqpo9ClOAycQAUwE="
server_rejoin_age_max = "8640h"
# https://developer.hashicorp.com/consul/docs/connect/observability/ui-visualization
ui_config{
   enabled = true
   dashboard_url_templates {
       service = "https://grafana.cloud.private/d/lDlaj-NGz/service-overview?orgId=1&var-service={{Service.Name}}&var-namespace={{Service.Namespace}}&var-partition={{Service.Partition}}&var-dc={{Datacenter}}"
   }
   metrics_provider = "prometheus"
   metrics_proxy {
     base_url = "http://mimir.service.consul:9009/prometheus"

     add_headers = [
 #      {
 #         name = "Authorization"
 #         value = "Bearer <token>"
 #      }
       {
          name = "X-Scope-OrgID"
          value = "1"
       }
     ]
     path_allowlist = ["/prometheus/api/v1/query_range", "/prometheus/api/v1/query"]
   }
}
addresses {
  #  grpc = "127.0.0.1"
    grpc_tls = "127.0.0.1"
}
ports {
    http = -1
    https = 8501
   # grpc = 8502
    grpc_tls = 8503
}
connect {
     enabled = true
}
retry_join =  ["172.42.1.10"]

bootstrap_expect = 1

auto_encrypt{
    allow_tls = true
}
performance{
    raft_multiplier = 1
}

node_meta{
  node_type = "server"
}
tls{
    defaults {
        ca_file = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
        cert_file = "/etc/opt/certs/consul/consul.pem"
        key_file = "/etc/opt/certs/consul/consul-key.pem"
        verify_incoming = true
        verify_outgoing = true
    }
    internal_rpc {
        verify_server_hostname = true
    }
}
#watches = [
#  {
#    type = "checks"
#    handler = "/usr/bin/health-check-handler.sh"
#  }
#]

telemetry {
  disable_hostname = true
  prometheus_retention_time = "72h"
}

nomad server

log_level = "DEBUG"
name = "master-01"
datacenter = "nomadder1"
data_dir =  "/opt/services/core/nomad/data"

#You should only set this value to true on server agents
#if the terminated server will never join the cluster again
#leave_on_interrupt= false

#You should only set this value to true on server agents
#if the terminated server will never join the cluster again
#leave_on_terminate = false

server {
  enabled = true
  job_max_priority = 100 # 100 is the default
  job_default_priority = 50 # 50 is the default
  bootstrap_expect =  1
  encrypt = "4PRfoE6Mj9dHTLpnzmYD1+THdlyAo2Ji4U6ewMumpAw="
  rejoin_after_leave = true
  server_join {
    retry_join =  ["172.42.1.10"]
    retry_max = 0
    retry_interval = "15s"
  }
}

bind_addr = "0.0.0.0" # the default
advertise {
  # Defaults to the first private IP address.
  http = "172.42.1.10"
  rpc  = "172.42.1.10"
  serf = "172.42.1.10"
}

tls {
  http = true
  rpc  = true

  ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  cert_file = "/etc/opt/certs/nomad/nomad.pem"
  key_file  = "/etc/opt/certs/nomad/nomad-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}

ui {
  enabled =  true
  label {
   text =  "💙💛 Fenerbahçe 1907 💛💙"
   background_color = "#163962"
   text_color = "##ffed00"
  }
  consul {
    ui_url = "https://consul.cloud.private"
  }

  vault {
    ui_url = "https://vault.cloud.private"
  }
}

consul{
 ssl= true
 address = "127.0.0.1:8501"
 grpc_address = "127.0.0.1:8503"
 # this works only with ACL enabled
 allow_unauthenticated= true
 ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
 grpc_ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
 cert_file = "/etc/opt/certs/consul/consul.pem"
 key_file  = "/etc/opt/certs/consul/consul-key.pem"
}

telemetry {
  collection_interval = "1s"
  disable_hostname = true
  prometheus_metrics = true
  publish_allocation_metrics = true
  publish_node_metrics = true
}

consul agent

datacenter = "nomadder1"
data_dir =  "/opt/services/core/consul/data"
log_level = "INFO"
node_name = "worker-01"
bind_addr = "0.0.0.0"
advertise_addr = "172.42.1.20"
client_addr = "0.0.0.0"
encrypt = "G1CHAD7wwu0tU28BlKkirSahTJ/Tqpo9ClOAycQAUwE="

addresses {
  #  grpc = "127.0.0.1"
    grpc_tls = "127.0.0.1"
}
ports {
    http = -1
    https = 8501
  #  grpc = 8502
    grpc_tls = 8503
}
connect {
     enabled = true
}
retry_join =  ["172.42.1.10"]

auto_encrypt{
    tls = true
}
performance{
    raft_multiplier = 1
}

node_meta{
  node_type = "worker"
}
tls{
    defaults {
        ca_file = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
        cert_file = "/etc/opt/certs/consul/consul.pem"
        key_file = "/etc/opt/certs/consul/consul-key.pem"
        verify_incoming = false
        verify_outgoing = true
    }
    internal_rpc {
        verify_server_hostname = true
    }
}
#watches = [
#  {
#    type = "checks"
#    handler = "/usr/bin/health-check-handler.sh"
#  }
#]

telemetry {
  disable_hostname = true
}

nomad agent

log_level = "DEBUG"
name = "worker-01"
datacenter = "nomadder1"
data_dir =  "/opt/services/core/nomad/data"
bind_addr = "0.0.0.0" # the default

leave_on_interrupt= true
#https://github.com/hashicorp/nomad/issues/17093
#systemctl kill -s SIGTERM nomad will suppress node drain if
#leave_on_terminate set to false
leave_on_terminate = true

advertise {
  # Defaults to the first private IP address.
  http = "172.42.1.20"
  rpc  = "172.42.1.20"
  serf = "172.42.1.20"
}
client {
  enabled = true
  network_interface = "eth1"
  meta {
    node_type= "worker"
    connect.log_level = "debug"
    connect.sidecar_image= "registry.cloud.private/envoyproxy/envoy:v1.29.0"
  }
  server_join {
    retry_join =  ["172.42.1.10"]
    retry_max = 0
    retry_interval = "15s"
  }
  # Either leave_on_interrupt or leave_on_terminate must be set
  # for this to take effect.
  drain_on_shutdown {
    deadline           = "2m"
    force              = false
    ignore_system_jobs = false
  }
  host_volume "ca_cert" {
    path      = "/usr/local/share/ca-certificates/cloudlocal"
    read_only = true
  }
  host_volume "cert_ingress" {
    path      = "/etc/opt/certs/ingress"
    read_only = true
  }
  ## Cert consul client
  ## Needed for consul_sd_configs
  ## Should be deleted after resolve https://github.com/suikast42/nomadder/issues/100
  host_volume "cert_consul" {
    path      = "/etc/opt/certs/consul"
    read_only = true
  }

  ## Cert consul client
  ## Needed for jenkins
  ## Should be deleted after resolve https://github.com/suikast42/nomadder/issues/100
  host_volume "cert_nomad" {
    path      = "/etc/opt/certs/nomad"
    read_only = true
  }

  ## Cert docker client
  ## Needed for jenkins
  ## Should be deleted after migrating to vault
  host_volume "cert_docker" {
    path      = "/etc/opt/certs/docker"
    read_only = true
  }

  host_network "public" {
    interface = "eth0"
    #cidr = "203.0.113.0/24"
    #reserved_ports = "22,80"
  }
  host_network "default" {
      interface = "eth1"
  }
  host_network "private" {
    interface = "eth1"
  }
  host_network "local" {
    interface = "lo"
  }

  reserved {
  # cpu (int: 0) - Specifies the amount of CPU to reserve, in MHz.
  # cores (int: 0) - Specifies the number of CPU cores to reserve.
  # memory (int: 0) - Specifies the amount of memory to reserve, in MB.
  # disk (int: 0) - Specifies the amount of disk to reserve, in MB.
  # reserved_ports (string: "") - Specifies a comma-separated list of ports to reserve on all fingerprinted network devices. Ranges can be specified by using a hyphen separating the two inclusive ends. See also host_network for reserving ports on specific host networks.
    cpu    = 1000
    memory = 2048
  }
  max_kill_timeout  = "1m"
}

tls {
  http = true
  rpc  = true

  ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  cert_file = "/etc/opt/certs/nomad/nomad.pem"
  key_file  = "/etc/opt/certs/nomad/nomad-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}

consul{
  ssl= true
  address = "127.0.0.1:8501"
  grpc_address = "127.0.0.1:8503"
  # this works only with ACL enabled
  allow_unauthenticated= true
  ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  grpc_ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  cert_file = "/etc/opt/certs/consul/consul.pem"
  key_file  = "/etc/opt/certs/consul/consul-key.pem"
}

telemetry {
  collection_interval = "1s"
  disable_hostname = true
  prometheus_metrics = true
  publish_allocation_metrics = true
  publish_node_metrics = true
}

plugin "docker" {
  config {
    allow_privileged = false
    disable_log_collection  = false
#    volumes {
#      enabled = true
#      selinuxlabel = "z"
#    }
    infra_image = "registry.cloud.private/google_containers/pause-amd64:3.2"
    infra_image_pull_timeout ="30m"
    extra_labels = ["job_name", "job_id", "task_group_name", "task_name", "namespace", "node_name", "node_id"]
    logging {
      type = "journald"
       config {
          labels-regex =".*"
       }
    }
    gc{
      container = true
      dangling_containers{
        enabled = true
      # period = "3m"
      # creation_grace = "5m"
      }
    }

  }
}
lgfa29 commented 8 months ago

Thank you for the extra information @suikast42!

The server logs allowed me to find the problem. I believe you have service job preemption enabled, which triggered a different code path from the default configuration I was using. I opened #19933 to fix this issue.

To confirm that this is the case, could you share the output of the command nomad operator scheduler get-config?

suikast42 commented 8 months ago

Interesting 👌

Here is the output. By the way, I updated to 1.7.4, but nothing changed, of course 😂

Scheduler Algorithm           = spread
Memory Oversubscription       = true
Reject Job Registration       = false
Pause Eval Broker             = false
Preemption System Scheduler   = true
Preemption Service Scheduler  = true
Preemption Batch Scheduler    = true
Preemption SysBatch Scheduler = true
Modify Index                  = 30913
lgfa29 commented 8 months ago

Thanks! Yeah, Preemption Service Scheduler = true would trigger this. The fix will be available in the next Nomad release.

Thank you again for the report!
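In the meantime, if you need the placement details before the release ships, a possible workaround (only if you can live without service preemption) is to turn it off:

nomad operator scheduler set-config -preempt-service-scheduler=false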

suikast42 commented 8 months ago

I can confirm.

After setting nomad operator scheduler set-config -preempt-service-scheduler false I now see the details ;-)

(screenshot showing the placement failure details)

suikast42 commented 8 months ago

Thanks! Yeah, Preemption Service Scheduler = true would trigger this. The fix will be available in the next Nomad release.

Thank you again for the report!

Yes, I did this because I activated MemoryOversubscription. I thought that would be more dynamic for my use case 😁
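To be clear about what I mean (a sketch, not my exact config; the memory_max value is illustrative): memory oversubscription is enabled via the scheduler config, while preemption is a separate setting, and the burst limit is then set per task with memory_max:

nomad operator scheduler set-config -memory-oversubscription=true

resources {
  cpu        = 100
  memory     = 128   # reserved memory
  memory_max = 512   # illustrative burst limit, only honored with oversubscription enabled
}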

lgfa29 commented 8 months ago

Oh yes, preemption is a very nice feature. But it triggers some different code paths that sometimes are not kept up-to-date 😬

But I'm glad we were able to get to the bottom of this. I was really confused about why it wasn't happening to me 😅