hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Modifying job count kills existing containers #156

Closed jweissig closed 9 years ago

jweissig commented 9 years ago

Congrats on the launch (looks very cool so far)!!

Working through the Vagrant / Docker example tutorial at https://www.nomadproject.io/intro/getting-started/jobs.html and found a strange issue.

The docs mention, "It is idempotent to run the same job specification again and no new allocations will be created", but this does not appear to be working.

I scaled to three Redis instances, then re-ran "nomad run example.nomad". Rather than doing nothing, Nomad killed and replaced the instances. I noticed something similar when scaling from 1 to 3 Redis instances: the first instance was killed and replaced (rather than just adding two new instances to reach the desired three). Here are the logs. Notice that the Docker container names are replaced, and so are the Nomad allocation IDs.

vagrant@nomad:~$ nomad version
Nomad v0.1.0

vagrant@nomad:~$ nomad run example.nomad 
==> Monitoring evaluation "241d5762-8cea-72f1-be25-ddc65a8312d5"
    Evaluation triggered by job "example"
    Allocation "7da01329-9b06-6b90-4562-0209ccb0cb23" created: node "fa365198-e53c-4049-8eb1-d7435ab154bf", group "cache"
    Allocation "20f54896-a5f6-7283-d7b3-1570099297c5" created: node "fa365198-e53c-4049-8eb1-d7435ab154bf", group "cache"
    Allocation "56f62650-b58a-c37b-8687-6838a3272f7f" created: node "fa365198-e53c-4049-8eb1-d7435ab154bf", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "241d5762-8cea-72f1-be25-ddc65a8312d5" finished with status "complete"

vagrant@nomad:~$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
fa2e3a983f7d        redis:latest        "/entrypoint.sh redis"   14 seconds ago      Up 12 seconds       6379/tcp            reverent_bardeen
48bbe1485a62        redis:latest        "/entrypoint.sh redis"   14 seconds ago      Up 12 seconds       6379/tcp            boring_panini
7756c718ef5b        redis:latest        "/entrypoint.sh redis"   14 seconds ago      Up 12 seconds       6379/tcp            suspicious_ramanujan

vagrant@nomad:~$ nomad status example
ID          = example
Name        = example
Type        = service
Priority    = 50
Datacenters = dc1
Status      = <none>

==> Evaluations
ID                                    Priority  TriggeredBy     Status
241d5762-8cea-72f1-be25-ddc65a8312d5  50        job-register    complete

==> Allocations
ID                                    EvalID                                NodeID                                TaskGroup  Desired  Status
56f62650-b58a-c37b-8687-6838a3272f7f  241d5762-8cea-72f1-be25-ddc65a8312d5  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      run      running
7da01329-9b06-6b90-4562-0209ccb0cb23  241d5762-8cea-72f1-be25-ddc65a8312d5  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      run      running
20f54896-a5f6-7283-d7b3-1570099297c5  241d5762-8cea-72f1-be25-ddc65a8312d5  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      run      running

vagrant@nomad:~$ nomad run example.nomad 
==> Monitoring evaluation "890a453b-4db2-4206-08dd-709925b7994f"
    Evaluation triggered by job "example"
    Allocation "17cbe659-1688-8d54-8f54-0f04549c32e3" created: node "fa365198-e53c-4049-8eb1-d7435ab154bf", group "cache"
    Allocation "2e067500-9084-e62f-ebe9-1a9010b00acc" created: node "fa365198-e53c-4049-8eb1-d7435ab154bf", group "cache"
    Allocation "cf6a6165-53d2-4126-0132-7a986be06468" created: node "fa365198-e53c-4049-8eb1-d7435ab154bf", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "890a453b-4db2-4206-08dd-709925b7994f" finished with status "complete"

vagrant@nomad:~$ nomad status example
ID          = example
Name        = example
Type        = service
Priority    = 50
Datacenters = dc1
Status      = <none>

==> Evaluations
ID                                    Priority  TriggeredBy     Status
890a453b-4db2-4206-08dd-709925b7994f  50        job-register    complete
241d5762-8cea-72f1-be25-ddc65a8312d5  50        job-register    complete

==> Allocations
ID                                    EvalID                                NodeID                                TaskGroup  Desired  Status
2e067500-9084-e62f-ebe9-1a9010b00acc  890a453b-4db2-4206-08dd-709925b7994f  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      run      running
17cbe659-1688-8d54-8f54-0f04549c32e3  890a453b-4db2-4206-08dd-709925b7994f  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      run      running
cf6a6165-53d2-4126-0132-7a986be06468  890a453b-4db2-4206-08dd-709925b7994f  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      run      running
56f62650-b58a-c37b-8687-6838a3272f7f  241d5762-8cea-72f1-be25-ddc65a8312d5  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      stop     dead
20f54896-a5f6-7283-d7b3-1570099297c5  241d5762-8cea-72f1-be25-ddc65a8312d5  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      stop     dead
7da01329-9b06-6b90-4562-0209ccb0cb23  241d5762-8cea-72f1-be25-ddc65a8312d5  fa365198-e53c-4049-8eb1-d7435ab154bf  cache      stop     dead

vagrant@nomad:~$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
aee5788539b4        redis:latest        "/entrypoint.sh redis"   2 minutes ago       Up 2 minutes        6379/tcp            fervent_ptolemy
f18817f06dd4        redis:latest        "/entrypoint.sh redis"   2 minutes ago       Up 2 minutes        6379/tcp            happy_yalow
ac8f83e474ff        redis:latest        "/entrypoint.sh redis"   2 minutes ago       Up 2 minutes        6379/tcp            desperate_bardeen
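
The idempotent behavior the docs describe can be sketched as follows. This is a minimal illustration only, not Nomad's actual scheduler code: the function name `reconcile` and the data shapes are hypothetical, and it assumes the task definition itself is unchanged between runs (only the count may differ).

```python
def reconcile(existing, desired_count):
    """Decide what to do with a task group's allocations.

    existing: list of IDs of allocations currently running (hypothetical shape)
    desired_count: the count requested in the job specification

    Returns (keep, stop, place_count).
    """
    # Keep as many running allocations as the job still wants.
    keep = existing[:desired_count]
    # Stop only the surplus when scaling down.
    stop = existing[desired_count:]
    # Place new allocations only to cover the shortfall when scaling up.
    place_count = max(desired_count - len(existing), 0)
    return keep, stop, place_count


# Re-running an unchanged job with count = 3: nothing is stopped or placed.
print(reconcile(["a", "b", "c"], 3))  # (['a', 'b', 'c'], [], 0)

# Scaling from 1 to 3: the existing allocation stays, two more are placed.
print(reconcile(["a"], 3))  # (['a'], [], 2)
```

In the logs above, the second "nomad run" instead stopped all three existing allocations and created three new ones, which is the opposite of what this sketch would produce.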

mzupan commented 9 years ago

I see this too. Not only that, but once the containers are killed, it takes 19 seconds to load new ones.

https://gist.github.com/mzupan/53a1a7dcb8c4882df20d

mauilion commented 9 years ago

This appears to be behavior specific to -dev mode, not prod. In prod, we see that the existing container is left running. https://gist.github.com/mauilion/8af36109a466b58e39db

ryanuber commented 9 years ago

@catsby noticed the same - seems isolated to dev mode currently. Will take a look, thanks!

armon commented 9 years ago

Fixed by 69e7d21
