hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

template: Allocation unexpectedly restarted when scaling another group of the Nomad job #16846

Closed cyrilgdn closed 1 year ago

cyrilgdn commented 1 year ago

Nomad version

Nomad v1.5.3
BuildDate 2023-04-04T20:09:50Z
Revision 434f7a1745c6304d607562daa9a4a635def7153f

We have encountered this problem since Nomad 1.4.

Issue

On a Nomad job with two groups, if a task in the first group has a template that uses Vault dynamic credentials, scaling the second group causes the first group's allocation to be unexpectedly restarted (more precisely, the template is re-rendered, and what happens next depends on the template's change_mode).

This is a big issue for us because we use the Nomad autoscaler with the scaling stanza, so every time it automatically scales a job with multiple groups, allocations that are not linked to the scaling policy are restarted.

My 2 cents

After debugging a bit and checking the diff between Nomad 1.3.8 and Nomad 1.4 or above, I believe it could be linked to workload identity and the new NomadToken added in https://github.com/hashicorp/nomad/pull/13000/ (cc @tgross)

In template_hook.go https://github.com/hashicorp/nomad/blob/fc75e9d117f38ad2729fb55a865a296b8517f9ae/client/allocrunner/taskrunner/template_hook.go#L200-L205

The update hook now checks whether this new token has changed, in addition to the Vault one, and that token seems to change at every scaling event (every deployment, I guess). The hook therefore restarts the consul-template manager, and since we use dynamic credentials the rendered template changes and the task is restarted.
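
Roughly, the check boils down to something like the following Go sketch (type and field names are simplified stand-ins, not the actual template_hook.go code): the hook caches the tokens it last saw and rebuilds the consul-template runner whenever either one differs.

package main

import "sync"

// Simplified stand-in for the task runner's template hook: it remembers the
// Vault and workload-identity (Nomad) tokens it was last given.
type templateHook struct {
	mu         sync.Mutex
	vaultToken string
	nomadToken string
	restart    func() error // stops and recreates the consul-template runner
}

type updateRequest struct {
	VaultToken string
	NomadToken string
}

// Update rebuilds the template manager whenever either token differs from the
// previous one. Because the Nomad token changes on every scaling event, this
// path now runs even when the task itself was not touched.
func (h *templateHook) Update(req updateRequest) error {
	h.mu.Lock()
	defer h.mu.Unlock()

	if req.VaultToken == h.vaultToken && req.NomadToken == h.nomadToken {
		return nil // nothing changed, keep the existing runner
	}

	h.vaultToken = req.VaultToken
	h.nomadToken = req.NomadToken
	return h.restart()
}

func main() {
	h := &templateHook{vaultToken: "v1", nomadToken: "n1", restart: func() error { return nil }}
	_ = h.Update(updateRequest{VaultToken: "v1", NomadToken: "n2"}) // Nomad token changed → runner rebuilt
}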

(I've also tested with static secrets from Vault; in this case the template manager is restarted, but since the rendered template is the same as the previous one, the task is not restarted.)
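
That matches how the recreated runner decides whether to act: change_mode only fires when the newly rendered output differs from the file already on disk, so static secrets survive the runner restart while dynamic credentials do not. A minimal sketch of that comparison (illustrative only, not consul-template's actual code):

package main

import (
	"bytes"
	"fmt"
)

// change_mode fires only when the rendered bytes differ from what is on disk.
func needsChangeMode(onDisk, rendered []byte) bool {
	return !bytes.Equal(onDisk, rendered)
}

func main() {
	onDisk := []byte("static-secret-value")
	fmt.Println(needsChangeMode(onDisk, []byte("static-secret-value"))) // false: static secret, task untouched
	fmt.Println(needsChangeMode(onDisk, []byte("fresh-dynamic-cred")))  // true: dynamic credential, restart fires
}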

I'm not sure, but I hope this helps.

Reproduction steps

We need a Nomad cluster that is already set up and linked to a Vault cluster with at least one dynamic secrets engine, running Nomad 1.4.0 or later.

We deploy the following Nomad job:

job "test_cyril" {
  datacenters = ["aws-eu-central-1"]

  type = "service"

  vault {
    env = false
    policies = [
      "nomad/job-exec",
    ]
  }

  group "nginx" {
    task "nginx" {
      driver = "docker"

      identity {
        file = true
      }

      config {
        image = "nginx:alpine"
      }

      template {
        destination = "secrets/test.conf"

        data = "{{ with secret \"nomad/creds/job-exec\" }}{{ .Data.secret_id }}{{ end }}"
      }
    }
  }

  group "nginx2" {
    task "nginx2" {
      driver = "docker"

      config {
        image = "nginx:alpine"
      }
    }
  }
}

Note: In this example we use a Nomad secrets engine (vault read nomad/creds/job-exec) to avoid adding another piece to the test stack. In production we actually hit the problem with all our database secrets engines.

Once the deployment is successful, we have the following allocations:

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
3ae57a27  a18eb888  nginx       0        run      running  4m39s ago  4m28s ago
e8c4775f  a18eb888  nginx2      0        run      running  4m39s ago  4m28s ago

Then we scale the second group nginx2:

$ nomad job scale test_cyril nginx2 2

The deployment is successful and we have our new allocations:

$ nomad status d57f1419
[...]
Task Events:
Started At     = 2023-04-11T16:04:11Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
ba7f52ba  a18eb888  nginx2      1        run      running  5m22s ago   5m11s ago
3ae57a27  a18eb888  nginx       1        run      running  10m19s ago  5m10s ago
e8c4775f  a18eb888  nginx2      1        run      running  10m19s ago  5m12s ago

But depending on the Nomad client version, we observe different behavior for the allocation of group nginx (not the one we scaled).

Expected Result

This is the behavior we have if the Nomad client version is <= 1.3.8

The allocation of the first group nginx shouldn't have been impacted.

This is an example of another deployment of the same job on a Nomad 1.3.8 client:

Recent Events:
Time                       Type        Description
2023-04-11T18:04:11+02:00  Started     Task started by client
2023-04-11T18:04:11+02:00  Task Setup  Building Task Directory
2023-04-11T18:04:10+02:00  Received    Task received by client

Actual Result

This is the behavior we have with a Nomad client >= 1.4.0

The allocation of the group nginx has been restarted because the template has been re-rendered:

$ nomad status 3ae57a27 
[...]
Task Events:
Started At     = 2023-04-11T15:40:18Z
Finished At    = N/A
Total Restarts = 1
Last Restart   = 2023-04-11T15:40:18Z

Recent Events:
Time                       Type              Description
2023-04-11T17:40:18+02:00  Started           Task started by client
2023-04-11T17:40:18+02:00  Restarting        Task restarting in 0s
2023-04-11T17:40:18+02:00  Terminated        Exit Code: 0
2023-04-11T17:40:17+02:00  Restart Signaled  Template with change_mode restart re-rendered
2023-04-11T17:35:19+02:00  Started           Task started by client
2023-04-11T17:35:18+02:00  Task Setup        Building Task Directory
2023-04-11T17:35:18+02:00  Received          Task received by client

Identity token

This is the (decoded) token I got from secrets/nomad_token (available thanks to the identity stanza), using:

nomad alloc exec 3ae57a27 cat /secrets/nomad_token | xargs -I TOKEN python -c "import jwt; import json; print(json.dumps(jwt.decode('TOKEN', options={'verify_signature': False})))" | jq .

Before job scaling:

{
  "nomad_namespace": "default",
  "nomad_job_id": "test_cyril",
  "nomad_allocation_id": "3ae57a27-51fb-15d4-973f-90ad3e30d44e",
  "nomad_task": "nginx",
  "nbf": 1681227318,
  "iat": 1681227318
}

After job scaling:

{
  "nomad_namespace": "default",
  "nomad_job_id": "test_cyril",
  "nomad_allocation_id": "3ae57a27-51fb-15d4-973f-90ad3e30d44e",
  "nomad_task": "nginx",
  "nbf": 1681227615,
  "iat": 1681227615
}

We can see that nbf and iat changed.
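
For anyone reproducing this without PyJWT installed, the same decode step can be done with a short Go program (file name and usage are illustrative); it assumes the identity token is a standard three-part, base64url-encoded JWT and, like the one-liner above, skips signature verification:

package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// Reads a JWT on stdin and prints its claims without verifying the signature.
// Usage: nomad alloc exec <alloc-id> cat /secrets/nomad_token | go run decode.go
func main() {
	raw, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	parts := strings.Split(strings.TrimSpace(string(raw)), ".")
	if len(parts) != 3 {
		fmt.Fprintln(os.Stderr, "input does not look like a JWT")
		os.Exit(1)
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	var claims map[string]any
	if err := json.Unmarshal(payload, &claims); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	out, _ := json.MarshalIndent(claims, "", "  ")
	fmt.Println(string(out))
}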

Nomad Client logs (if appropriate)

2023-04-11T15:40:15.821Z [TRACE] client: next heartbeat: period=15.459675068s
2023-04-11T15:40:15.836Z [DEBUG] client: allocation updates: added=1 removed=0 updated=2 ignored=1
2023-04-11T15:40:15.838Z [TRACE] client: next heartbeat: period=10.256109816s
2023-04-11T15:40:15.839Z [TRACE] client.alloc_runner: running update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e start="2023-04-11 15:40:15.838987518 +0000 UTC m=+8527.211879260"
2023-04-11T15:40:15.839Z [TRACE] client.alloc_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=alloc_health_watcher start="2023-04-11 15:40:15.8390903 +0000 UTC m=+8527.211982042"
2023-04-11T15:40:15.840Z [TRACE] client.alloc_runner: running update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc start="2023-04-11 15:40:15.840657988 +0000 UTC m=+8527.213549700"
2023-04-11T15:40:15.840Z [TRACE] client.alloc_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=alloc_health_watcher start="2023-04-11 15:40:15.840687408 +0000 UTC m=+8527.213579110"
2023-04-11T15:40:15.842Z [TRACE] client.alloc_runner.task_coordinator: state transition: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b from=init to=init
2023-04-11T15:40:15.843Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 type=Received msg="Task received by client" failed=false
2023-04-11T15:40:15.843Z [DEBUG] client: allocation updates applied: added=1 removed=0 updated=2 ignored=1 errors=0
2023-04-11T15:40:15.843Z [TRACE] client.alloc_runner: running pre-run hooks: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b start="2023-04-11 15:40:15.843183623 +0000 UTC m=+8527.216075334"
2023-04-11T15:40:15.843Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=alloc_dir start="2023-04-11 15:40:15.843211793 +0000 UTC m=+8527.216103495"
2023-04-11T15:40:15.843Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=alloc_dir end="2023-04-11 15:40:15.843358716 +0000 UTC m=+8527.216250428" duration="146.933µs"
2023-04-11T15:40:15.843Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=cgroup start="2023-04-11 15:40:15.843375966 +0000 UTC m=+8527.216267678"
2023-04-11T15:40:15.843Z [TRACE] client.cpuset.v2: add allocation: name=test_cyril.nginx2[1] id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner.runner_hook.alloc_health_watcher: watching: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e deadline="2023-04-11 15:45:15.844115609 +0000 UTC m=+8827.217007311" checks=true min_healthy_time=10s
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=alloc_health_watcher end="2023-04-11 15:40:15.844208701 +0000 UTC m=+8527.217100413" duration=5.118371ms
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=group_services start="2023-04-11 15:40:15.844223731 +0000 UTC m=+8527.217115433"
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=group_services end="2023-04-11 15:40:15.844276192 +0000 UTC m=+8527.217167894" duration="52.461µs"
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=consul_grpc_socket start="2023-04-11 15:40:15.844285142 +0000 UTC m=+8527.217176844"
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=consul_grpc_socket end="2023-04-11 15:40:15.844289852 +0000 UTC m=+8527.217181564" duration="4.72µs"
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=consul_http_socket start="2023-04-11 15:40:15.844295742 +0000 UTC m=+8527.217187454"
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=consul_http_socket end="2023-04-11 15:40:15.844300412 +0000 UTC m=+8527.217192114" duration="4.66µs"
2023-04-11T15:40:15.844Z [TRACE] client.alloc_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=checks_hook start="2023-04-11 15:40:15.844305712 +0000 UTC m=+8527.217197425"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner.runner_hook.alloc_health_watcher: watching: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc deadline="2023-04-11 15:45:15.845634356 +0000 UTC m=+8827.218526058" checks=true min_healthy_time=10s
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=alloc_health_watcher end="2023-04-11 15:40:15.845709957 +0000 UTC m=+8527.218601659" duration=5.022549ms
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=group_services start="2023-04-11 15:40:15.845724388 +0000 UTC m=+8527.218616090"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=group_services end="2023-04-11 15:40:15.845791889 +0000 UTC m=+8527.218683601" duration="67.511µs"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=consul_grpc_socket start="2023-04-11 15:40:15.845807679 +0000 UTC m=+8527.218699421"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=consul_grpc_socket end="2023-04-11 15:40:15.845818319 +0000 UTC m=+8527.218710021" duration="10.6µs"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=consul_http_socket start="2023-04-11 15:40:15.845828239 +0000 UTC m=+8527.218719952"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=consul_http_socket end="2023-04-11 15:40:15.845837499 +0000 UTC m=+8527.218729212" duration="9.26µs"
2023-04-11T15:40:15.845Z [TRACE] client.alloc_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=checks_hook start="2023-04-11 15:40:15.84584796 +0000 UTC m=+8527.218739672"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e name=checks_hook end="2023-04-11 15:40:15.84701954 +0000 UTC m=+8527.219911242" duration=2.713817ms
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e end="2023-04-11 15:40:15.847040691 +0000 UTC m=+8527.219932393" duration=8.053133ms
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: running update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx start="2023-04-11 15:40:15.847051071 +0000 UTC m=+8527.219942773"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=identity start="2023-04-11 15:40:15.847095042 +0000 UTC m=+8527.219986754"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=identity end="2023-04-11 15:40:15.847129813 +0000 UTC m=+8527.220021515" duration="34.761µs"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=template start="2023-04-11 15:40:15.847159063 +0000 UTC m=+8527.220050765"
2023-04-11T15:40:15.847Z [INFO]  agent: (runner) stopping
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) stopping watcher
2023-04-11T15:40:15.847Z [DEBUG] agent: (watcher) stopping all views
2023-04-11T15:40:15.847Z [TRACE] agent: (watcher) stopping vault.read(nomad/creds/job-exec)
2023-04-11T15:40:15.847Z [INFO]  agent: (runner) creating new runner (dry: false, once: false)
2023-04-11T15:40:15.847Z [TRACE] agent: (view) vault.read(nomad/creds/job-exec) stopping poll (received on view stopCh)
2023-04-11T15:40:15.847Z [INFO]  agent: (runner) received finish
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) final config: {"Consul":{"Address":"127.0.0.1:8500","Namespace":"","Auth":{"Enabled":false,"Username":""},"Retry":{"Attempts":12,"Backoff":250000000,"MaxBackoff":60000000000,"Enabled":true},"SSL":{"CaCert":"","CaPath":"","Cert":"","Enabled":fal>
2023-04-11T15:40:15.847Z [INFO]  agent: (runner) creating watcher
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=template end="2023-04-11 15:40:15.847476368 +0000 UTC m=+8527.220368070" duration="317.305µs"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=task_services start="2023-04-11 15:40:15.847515179 +0000 UTC m=+8527.220406891"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=task_services end="2023-04-11 15:40:15.847527209 +0000 UTC m=+8527.220418922" duration="12.031µs"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=script_checks start="2023-04-11 15:40:15.84756151 +0000 UTC m=+8527.220453222"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx name=script_checks end="2023-04-11 15:40:15.84757247 +0000 UTC m=+8527.220464182" duration="10.96µs"
2023-04-11T15:40:15.847Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=3ae57a27-51fb-15d4-973f-90ad3e30d44e task=nginx end="2023-04-11 15:40:15.84758233 +0000 UTC m=+8527.220474043" duration="531.27µs"
2023-04-11T15:40:15.847Z [INFO]  agent: (runner) starting
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) running initial templates
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) initiating run
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) checking template b961933a0a2c540239dcda3a283604e0
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) missing data for 1 dependencies
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) missing dependency: vault.read(nomad/creds/job-exec)
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) add used dependency vault.read(nomad/creds/job-exec) to missing since isLeader but do not have a watcher
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) was not watching 1 dependencies
2023-04-11T15:40:15.847Z [DEBUG] agent: (watcher) adding vault.read(nomad/creds/job-exec)
2023-04-11T15:40:15.847Z [TRACE] agent: (watcher) vault.read(nomad/creds/job-exec) starting
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) diffing and updating dependencies
2023-04-11T15:40:15.847Z [DEBUG] agent: (runner) watching 1 dependencies
2023-04-11T15:40:15.847Z [TRACE] agent: (view) vault.read(nomad/creds/job-exec) starting fetch
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc name=checks_hook end="2023-04-11 15:40:15.848435625 +0000 UTC m=+8527.221327327" duration=2.587655ms
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc end="2023-04-11 15:40:15.848454456 +0000 UTC m=+8527.221346168" duration=7.796468ms
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: running update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 start="2023-04-11 15:40:15.848476136 +0000 UTC m=+8527.221367838"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 name=identity start="2023-04-11 15:40:15.848530197 +0000 UTC m=+8527.221421909"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 name=identity end="2023-04-11 15:40:15.848542298 +0000 UTC m=+8527.221434000" duration="12.091µs"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 name=task_services start="2023-04-11 15:40:15.848577988 +0000 UTC m=+8527.221469690"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 name=task_services end="2023-04-11 15:40:15.848588738 +0000 UTC m=+8527.221480450" duration="10.76µs"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: running update hook: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 name=script_checks start="2023-04-11 15:40:15.848620039 +0000 UTC m=+8527.221511741"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 name=script_checks end="2023-04-11 15:40:15.848630379 +0000 UTC m=+8527.221522091" duration="10.35µs"
2023-04-11T15:40:15.848Z [TRACE] client.alloc_runner.task_runner: finished update hooks: alloc_id=e8c4775f-7b87-96d6-c93a-2a2bcf8c80cc task=nginx2 end="2023-04-11 15:40:15.848640319 +0000 UTC m=+8527.221532031" duration="164.193µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=cgroup end="2023-04-11 15:40:15.87359643 +0000 UTC m=+8527.246488163" duration=30.220485ms
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=await_previous_allocations start="2023-04-11 15:40:15.873620371 +0000 UTC m=+8527.246512083"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=await_previous_allocations end="2023-04-11 15:40:15.873628991 +0000 UTC m=+8527.246520693" duration="8.61µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=migrate_disk start="2023-04-11 15:40:15.873636881 +0000 UTC m=+8527.246528593"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=migrate_disk end="2023-04-11 15:40:15.873644221 +0000 UTC m=+8527.246535933" duration="7.34µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=alloc_health_watcher start="2023-04-11 15:40:15.873651331 +0000 UTC m=+8527.246543043"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner.runner_hook.alloc_health_watcher: watching: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b deadline="2023-04-11 15:45:15.873657802 +0000 UTC m=+8827.246549513" checks=true min_healthy_time=10s
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=alloc_health_watcher end="2023-04-11 15:40:15.873717943 +0000 UTC m=+8527.246609654" duration="66.611µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=network start="2023-04-11 15:40:15.873727063 +0000 UTC m=+8527.246618765"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=network end="2023-04-11 15:40:15.873734053 +0000 UTC m=+8527.246625755" duration="6.99µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=group_services start="2023-04-11 15:40:15.873741273 +0000 UTC m=+8527.246632985"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=group_services end="2023-04-11 15:40:15.873748303 +0000 UTC m=+8527.246640005" duration="7.02µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=consul_grpc_socket start="2023-04-11 15:40:15.873758793 +0000 UTC m=+8527.246650505"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=consul_grpc_socket end="2023-04-11 15:40:15.873765573 +0000 UTC m=+8527.246657286" duration="6.781µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=consul_http_socket start="2023-04-11 15:40:15.873773213 +0000 UTC m=+8527.246664915"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=consul_http_socket end="2023-04-11 15:40:15.873779414 +0000 UTC m=+8527.246671126" duration="6.211µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=csi_hook start="2023-04-11 15:40:15.873786814 +0000 UTC m=+8527.246678526"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=csi_hook end="2023-04-11 15:40:15.873793354 +0000 UTC m=+8527.246685066" duration="6.54µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: running pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=checks_hook start="2023-04-11 15:40:15.873800694 +0000 UTC m=+8527.246692406"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b name=checks_hook end="2023-04-11 15:40:15.873807374 +0000 UTC m=+8527.246699086" duration="6.68µs"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: finished pre-run hooks: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b end="2023-04-11 15:40:15.873814414 +0000 UTC m=+8527.246706116" duration=30.630782ms
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: handling task state update: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b done=false
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner.task_coordinator: state transition: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b from=init to=prestart
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner.task_coordinator: state transition: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b from=prestart to=main
2023-04-11T15:40:15.873Z [DEBUG] client.alloc_runner.task_runner: lifecycle start condition has been met, proceeding: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner.task_runner: running prestart hooks: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 start="2023-04-11 15:40:15.873882506 +0000 UTC m=+8527.246774208"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=validate start="2023-04-11 15:40:15.873923216 +0000 UTC m=+8527.246814918"
2023-04-11T15:40:15.873Z [TRACE] client.alloc_runner: sending updated alloc: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b client_status=pending desired_status=""
2023-04-11T15:40:15.875Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=validate end="2023-04-11 15:40:15.875596726 +0000 UTC m=+8527.248488438" duration=1.67352ms
2023-04-11T15:40:15.875Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=task_dir start="2023-04-11 15:40:15.875636946 +0000 UTC m=+8527.248528648"
2023-04-11T15:40:15.875Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 type="Task Setup" msg="Building Task Directory" failed=false
2023-04-11T15:40:15.876Z [TRACE] client.alloc_runner: handling task state update: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b done=false
2023-04-11T15:40:15.876Z [TRACE] client.alloc_runner: sending updated alloc: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b client_status=pending desired_status=""
2023-04-11T15:40:15.878Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=task_dir end="2023-04-11 15:40:15.878424126 +0000 UTC m=+8527.251315838" duration=2.78719ms
2023-04-11T15:40:15.878Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=identity start="2023-04-11 15:40:15.878481867 +0000 UTC m=+8527.251373609"
2023-04-11T15:40:15.879Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=identity end="2023-04-11 15:40:15.879857101 +0000 UTC m=+8527.252748833" duration=1.375224ms
2023-04-11T15:40:15.879Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=logmon start="2023-04-11 15:40:15.879968573 +0000 UTC m=+8527.252860315"
2023-04-11T15:40:15.880Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: starting plugin: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 path=/usr/local/bin/nomad args=["/usr/local/bin/nomad", "logmon"]
2023-04-11T15:40:15.880Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin started: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 path=/usr/local/bin/nomad pid=66127
2023-04-11T15:40:15.880Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: waiting for RPC address: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 path=/usr/local/bin/nomad
2023-04-11T15:40:15.881Z [TRACE] agent: vault.read(nomad/creds/job-exec): GET /v1/nomad/creds/job-exec
2023-04-11T15:40:15.894Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: plugin address: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 @module=logmon address=/tmp/plugin4284073372 network=unix timestamp=2023-04-11T15:40:15.894Z
2023-04-11T15:40:15.894Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: using plugin: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 version=2
2023-04-11T15:40:15.895Z [TRACE] client.alloc_runner.task_runner.task_hook.logmon.stdio: waiting for stdio data: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2
2023-04-11T15:40:15.895Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 @module=logmon path=/srv/nomad/data/alloc/ba7f52ba-e773-2d67-6b1c-d0ce446bf83b/alloc/logs/.nginx2.stdout.fifo timestamp=2023-04-11T1>
2023-04-11T15:40:15.895Z [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 path=/srv/nomad/data/alloc/ba7f52ba-e773-2d67-6b1c-d0ce446bf83b/alloc/logs/.nginx2.stderr.fifo @module=logmon timestamp=2023-04-11T1>
2023-04-11T15:40:15.897Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=logmon end="2023-04-11 15:40:15.897619155 +0000 UTC m=+8527.270510857" duration=17.650542ms
2023-04-11T15:40:15.897Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=dispatch_payload start="2023-04-11 15:40:15.897668796 +0000 UTC m=+8527.270560498"
2023-04-11T15:40:15.899Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=dispatch_payload end="2023-04-11 15:40:15.89959796 +0000 UTC m=+8527.272489702" duration=1.929204ms
2023-04-11T15:40:15.899Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=volumes start="2023-04-11 15:40:15.899693702 +0000 UTC m=+8527.272585434"
2023-04-11T15:40:15.901Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=volumes end="2023-04-11 15:40:15.901403782 +0000 UTC m=+8527.274295484" duration=1.71005ms
2023-04-11T15:40:15.901Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=artifacts start="2023-04-11 15:40:15.901439213 +0000 UTC m=+8527.274330925"
2023-04-11T15:40:15.902Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=artifacts end="2023-04-11 15:40:15.902918929 +0000 UTC m=+8527.275810631" duration=1.479706ms
2023-04-11T15:40:15.902Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=devices start="2023-04-11 15:40:15.90295332 +0000 UTC m=+8527.275845022"
2023-04-11T15:40:15.904Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=devices end="2023-04-11 15:40:15.904572098 +0000 UTC m=+8527.277463800" duration=1.618778ms
2023-04-11T15:40:15.904Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=api start="2023-04-11 15:40:15.904606559 +0000 UTC m=+8527.277498271"
2023-04-11T15:40:15.906Z [TRACE] client.alloc_runner.task_runner: finished prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=api end="2023-04-11 15:40:15.906189577 +0000 UTC m=+8527.279081279" duration=1.583008ms
2023-04-11T15:40:15.906Z [TRACE] client.alloc_runner.task_runner: running prestart hook: alloc_id=ba7f52ba-e773-2d67-6b1c-d0ce446bf83b task=nginx2 name=vault start="2023-04-11 15:40:15.906226607 +0000 UTC m=+8527.279118309"
2023-04-11T15:40:15.914Z [TRACE] agent: (view) vault.read(nomad/creds/job-exec) marking successful data response
2023-04-11T15:40:15.914Z [TRACE] agent: (view) vault.read(nomad/creds/job-exec) successful contact, resetting retries
2023-04-11T15:40:15.914Z [TRACE] agent: (view) vault.read(nomad/creds/job-exec) received data
2023-04-11T15:40:15.914Z [TRACE] agent: (view) vault.read(nomad/creds/job-exec) starting fetch
2023-04-11T15:40:15.914Z [TRACE] agent: vault.read(nomad/creds/job-exec): starting renewer
2023-04-11T15:40:15.914Z [DEBUG] agent: (runner) receiving dependency vault.read(nomad/creds/job-exec)
2023-04-11T15:40:15.914Z [DEBUG] agent: (runner) initiating run
2023-04-11T15:40:15.914Z [DEBUG] agent: (runner) checking template b961933a0a2c540239dcda3a283604e0
2023-04-11T15:40:15.915Z [DEBUG] agent: (runner) rendering "(dynamic)" => "/srv/nomad/data/alloc/3ae57a27-51fb-15d4-973f-90ad3e30d44e/nginx/secrets/test.conf"
2023-04-11T15:40:15.915Z [INFO]  agent: (runner) rendered "(dynamic)" => "/srv/nomad/data/alloc/3ae57a27-51fb-15d4-973f-90ad3e30d44e/nginx/secrets/test.conf"
2023-04-11T15:40:15.915Z [DEBUG] agent: (runner) diffing and updating dependencies
2023-04-11T15:40:15.915Z [DEBUG] agent: (runner) vault.read(nomad/creds/job-exec) is still needed
2023-04-11T15:40:15.915Z [DEBUG] agent: (runner) watching 1 dependencies
2023-04-11T15:40:15.915Z [DEBUG] agent: (runner) all templates rendered
2023-04-11T15:40:15.930Z [TRACE] agent: vault.read(nomad/creds/job-exec): successfully renewed
lgfa29 commented 1 year ago

Thanks for the detailed description @cyrilgdn!

I think your analysis makes sense. I have placed this issue in our backlog for further triaging.

mgeggie commented 1 year ago

Hi @cyrilgdn and @lgfa29, I wanted to add some additional feedback on this issue. We attempted an upgrade from Nomad 1.3.5 to 1.4.8 in production this week and also encountered this behavior in jobs on our cluster with templated Vault secrets, even for jobs with only a single group.

In our case, a change to the count of that single group resulted in all allocations of the job being restarted, regardless of whether the change was an increase or a decrease.

We've downgraded our Nomad clients back to 1.3.5, and job count changes no longer restart all allocations.

karelorigin commented 1 year ago

I've been racking my brain over this for a couple of days until I figured out, mostly by chance, the correlation between restarts and scaling actions. It's a bit of a shame to have to downgrade to Nomad 1.3.5 again. Hopefully there'll be a solution to this soon 🤞

albertollamaso commented 1 year ago

Hello @lgfa29 @cyrilgdn

I wanted to provide some feedback on this issue. We recently made a change to the update stanza of a Nomad job, and after applying it we noticed that all allocations of the job were restarted, which was unexpected.

Here is the relevant plan:

+/- Job: "test"
+/- Task Group: "nginx" (2 in-place update)
  +/- Update {
        AutoPromote:      "false"
        AutoRevert:       "true"
        Canary:           "0"
        HealthCheck:      "checks"
        HealthyDeadline:  "300000000000"
        MaxParallel:      "1"
    +/- MinHealthyTime:   "6000000000" => "5000000000"
        ProgressDeadline: "600000000000"
      }
      Task: "nginx"

Scheduler dry-run:
- All tasks successfully allocated.

This change alone resulted in all allocations of the job being restarted.

Nomad version: 1.5.3

tgross commented 1 year ago

Another example: https://github.com/hashicorp/nomad/issues/17398

komapa commented 1 year ago

We have hit this problem after upgrading to 1.4.11; all it takes for templates with Vault dynamic credentials to be re-rendered is changing the count of a job.

gulducat commented 1 year ago

Hi folks, and thanks again @cyrilgdn for the detailed write-up!

I was able to reproduce this on our release/1.6.x branch with this minimal job specification:

job "multi" {
  group "one" {
    count = 1
    task "t-1" {
      driver = "docker"
      config {
        image = "nginx:alpine"
      }
      # identity{} block not actually required,
      # because every task gets an identity.
      # this just exposes it to the task.
      #identity {
      #  file = true
      #}
      template {
        destination = "local/stamp"
        data        = "{{ timestamp }}" # anything that is dynamic
      }
    }
  }
  group "two" {
    count = 1
    task "t-2" {
      driver = "docker"
      config {
        image = "nginx:alpine"
      }
    }
  }
}

Scaling group two (with nomad job scale multi two 1) results in task t-1 in group one being restarted, for precisely the reason described in the OP: the task gets a new NomadToken, so the template manager is recreated, which re-evaluates the dynamic template function, which produces a changed render and therefore a restart (the default change_mode).

There are a couple of ways to address this

Personally I'm leaning towards the first option, so that the server and the client would agree on which token is in use, but I'll confer with the team to see if there are implications I'm not considering, or other architectural changes that may be preferable.