hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Nomad schedules more instances of system task per node after node reboot #9846

Open johnzhanghua opened 3 years ago

johnzhanghua commented 3 years ago

Nomad version

Nomad v0.12.0 (8f7fbc8e7b5a4ed0d0209968faf41b238e6d5817)

Operating system and Environment details

CentOS 7.5 VMs on VirtualBox 6.1; 3-node cluster, with server and client on the same node

Issue

Nomad schedules more instances of system task per node after node reboot

Reproduction steps

- Run the system job (job file below) and check the status; each of the 3 nodes starts one allocation:

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
test        0       0         3        0       0         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
0a861162  1d358dc0  test        0        run      running  26m24s ago  24m45s ago
3989bac4  93eba225  test        0        run      running  26m24s ago  24m48s ago
d5f19cff  8af71708  test        0        run      running  26m24s ago  24m44s ago

- Reboot all 3 nodes at the same time.
- Check the status; if there are still 3 active tasks (pending or running), like the output below, reboot all 3 nodes again:

nomad job status test
ID            = test
Name          = test
Submit Date   = 2021-01-18T23:29:56Z
Type          = system
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
test        0       1         2        0       2         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
3cd842f1  1d358dc0  test        0        run      running   4m54s ago   4m42s ago
0a861162  1d358dc0  test        0        run      complete  32m54s ago  4m52s ago
3989bac4  93eba225  test        0        run      running   32m54s ago  4m36s ago
d5f19cff  8af71708  test        0        run      pending   32m54s ago  43s ago


- Eventually it launches more than one task on a single node, like the output below (a quick per-node duplicate check is sketched after this output):

nomad job status test
ID            = test
Name          = test
Submit Date   = 2021-01-18T23:29:56Z
Type          = system
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
test        0       2         3        0       5         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
05d00f19  8af71708  test        0        run      running   27m26s ago  27m26s ago
ae08bf8e  1d358dc0  test        0        run      pending   27m26s ago  27m26s ago
91b19a4b  93eba225  test        0        run      running   30m10s ago  27m10s ago
3cd842f1  1d358dc0  test        0        run      pending   36m26s ago  8s ago
0a861162  1d358dc0  test        0        run      complete  1h4m ago    27m22s ago
3989bac4  93eba225  test        0        run      complete  1h4m ago    27m22s ago
d5f19cff  8af71708  test        0        run      running   1h4m ago    27m26s ago
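A quick way to spot the duplicates is to count non-terminal allocations of the job per node. A minimal sketch via the HTTP API, assuming curl and jq are available and the agent listens on the default address (these tool choices are illustrative, not part of the original report):

# Count pending/running allocations of job "test" per node; any count > 1 indicates the bug.
curl -s http://127.0.0.1:4646/v1/job/test/allocations \
  | jq -r '.[] | select(.ClientStatus == "pending" or .ClientStatus == "running") | .NodeID' \
  | sort | uniq -c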


Job file (if appropriate)

job "test" { datacenters = ["dc1"] type = "system"

group "test" { restart { interval = "6m" attempts = 10 delay = "10s" mode = "delay" }

# add prestart task
task "test-pre" {
  driver = "docker"
  lifecycle {
    hook = "prestart"
    sidecar = false
  }

  config {
    image = "alpine:3.8"
    command = "sh"

    args = ["-c", "echo test > /alloc/test_file"]
  }
}

task "test" {
  driver = "docker"

  config {
    image = "alpine:3.8"
    command = "sh"

    args = ["-c", "if [ ! -s /alloc/test_file ]; then sleep 5; exit 1; else while sleep 3600; do :; done; fi"]
  }
}

} }

cgbaker commented 3 years ago

thank you, @johnzhanghua, we'll look into it.

Just to be sure, the nodes that you are restarting are just clients, not servers? And do you see this problem if you remove the gc_max_allocs config? (Can you say why you're using that config?)

johnzhanghua commented 3 years ago

The nodes each run both a client and a server; the restart is sudo reboot now, which reboots the whole node.

The gc_max_allocs config should not be related. I saw the problem in our environment with the default value, which is 50.
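For reference, gc_max_allocs is a client-level agent setting; a minimal sketch of where it lives in the agent config (the value shown is the default, everything else here is illustrative):

# Client stanza of the agent config (illustrative).
# gc_max_allocs caps how many allocations a client tracks before
# garbage-collecting terminal ones; the default is 50.
client {
  enabled       = true
  gc_max_allocs = 50
}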

johnzhanghua commented 3 years ago

It's easier to reproduce the issue with more system jobs. I tried duplicating the job file, changing only the name, into test1, test2, and test3; after several node reboots it ends up something like the output below (a sketch of generating and submitting the duplicates follows the output).

nomad job status test2
ID            = test2
Name          = test2
Submit Date   = 2021-01-20T06:43:33Z
Type          = system
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
test2       0       1         1        0       3         0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
147a9998  1d358dc0  test2       0        run      pending   25m54s ago  1s ago
4e970e97  8af71708  test2       0        run      running   27m46s ago  6m36s ago
65f0ea98  1d358dc0  test2       0        run      complete  27m46s ago  6m54s ago
c9c75f7b  93eba225  test2       0        run      complete  27m46s ago  6m41s ago
nomad job status test
ID            = test
Name          = test
Submit Date   = 2021-01-18T23:29:56Z
Type          = system
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
test        0       0         2        0       13        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
e3170151  93eba225  test        0        run      complete  47m13s ago  9m34s ago
f17b4b18  93eba225  test        0        stop     complete  48m48s ago  9m51s ago
4e0ac9a8  8af71708  test        0        run      running   4h4m ago    9m34s ago
027c4532  1d358dc0  test        0        run      running   1d5h ago    9m51s ago
91b19a4b  93eba225  test        0        stop     complete  1d7h ago    9m51s ago
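For reference, a minimal way those duplicates can be generated and submitted, assuming the job file above is saved as test.nomad (the filenames and the sed-based rename are illustrative):

# Create test1..test3 by renaming the job/group/task, then submit each copy.
for i in 1 2 3; do
  sed "s/\"test\"/\"test${i}\"/g; s/\"test-pre\"/\"test${i}-pre\"/g" test.nomad > "test${i}.nomad"
  nomad job run "test${i}.nomad"
done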
krishicks commented 3 years ago

As an extra data point, I brought up a 1-server, 2-client VirtualBox cluster with https://github.com/krishicks/vagrant-nomad, deployed the above test job as test1, test2, etc., and tried rebooting the cluster repeatedly via sudo reboot. I rebooted just the clients, as well as clients and server, and was not able to reproduce the above issue.

johnzhanghua commented 3 years ago

@krishicks Thanks. Looks like you're using Nomad version 1.0.2.

We will try version 1.0.3. If you bring Nomad down to 0.12.0, can you reproduce it?

krishicks commented 3 years ago

I can try doing this, but you can also do it if you want to try reproducing it: just download whichever Nomad binary you want and put it in the root folder before vagrant up, and it will replace the Nomad binary with the given one.
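A rough sketch of what that looks like, assuming the Linux amd64 build and the official release URL (the version and paths are illustrative):

# Fetch a specific Nomad build and drop the binary in the repo root before vagrant up,
# per the replacement mechanism described above.
cd vagrant-nomad
curl -sLO https://releases.hashicorp.com/nomad/0.12.0/nomad_0.12.0_linux_amd64.zip
unzip nomad_0.12.0_linux_amd64.zip   # extracts a `nomad` binary into the current directory
vagrant up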

It would be really great if you could find a reproduction in that environment!

johnzhanghua commented 3 years ago

Found a similar open issue: https://github.com/hashicorp/nomad/issues/2419