hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Orphan Allocation without running process #20116

Open crystalin opened 6 months ago

crystalin commented 6 months ago

Nomad version

Nomad v1.7.5
BuildDate 2024-02-13T15:10:13Z
Revision 5f5d4646198d09b8f4f6cb90fb5d50b53fa328b8

Operating system and Environment details

Ubuntu 22

Issue

On a machine running both a Nomad server and a client: after a hard power cycle, some allocations still appear as running even though their processes are not running. The service did not start because it uses a static port, which Nomad still treats as reserved by the stale allocation.
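One way to confirm that nothing is actually listening on the supposedly reserved port is a plain TCP connect test from the host. This is only a minimal sketch: the helper name `port_in_use` is made up for illustration, and it assumes the address and ports shown in the allocation status further down (192.168.0.3 with ports 81, 444 and 5432).

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper, not part of Nomad. A refused or timed-out
    connection is treated as "nothing is listening".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return s.connect_ex((host, port)) == 0
```

Calling `port_in_use("192.168.0.3", 81)` on the affected node would be expected to return False if the process behind the allocation is really gone, even though the scheduler still reports a reserved port collision.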

Trying to stop the allocation doesn't work:

nomad alloc stop --namespace default -no-shutdown-delay -verbose 22434531-6a4e-1103-37f4-0f302b2b2549

==> 2024-03-11T22:09:04+01:00: Monitoring evaluation "e2847548-36a7-8604-05f9-8f7edd7dd483"
    2024-03-11T22:09:04+01:00: Evaluation triggered by job "traefik-ingress"
    2024-03-11T22:09:05+01:00: Evaluation within deployment: "12a02c6d-2c1b-62f3-e4cd-344efa115417"
    2024-03-11T22:09:05+01:00: Evaluation status changed: "pending" -> "complete"
==> 2024-03-11T22:09:05+01:00: Evaluation "e2847548-36a7-8604-05f9-8f7edd7dd483" finished with status "complete" but failed to place all allocations:
    2024-03-11T22:09:05+01:00: Task Group "traefik-proxy" (failed to place 1 allocation):
      * Constraint "${meta.gateway} = 1": 1 nodes excluded by filter
      * Resources exhausted on 1 nodes
      * Dimension "network: reserved port collision web=81" exhausted on 1 nodes
    2024-03-11T22:09:05+01:00: Evaluation "8defc7ce-bce3-99a5-5b35-f5e210d05e6f" waiting for additional capacity to place remainder
==> 2024-03-11T22:09:05+01:00: Monitoring deployment "12a02c6d-2c1b-62f3-e4cd-344efa115417"
  ⠏ Deployment "12a02c6d-2c1b-62f3-e4cd-344efa115417" in progress...

    2024-03-11T22:09:05+01:00
    ID          = 12a02c6d-2c1b-62f3-e4cd-344efa115417
    Job ID      = traefik-ingress
    Job Version = 31
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group     Desired  Placed  Healthy  Unhealthy  Progress Deadline
    traefik-proxy  1        0       0        0          N/A

    Allocations
    No allocations placed

nomad alloc status --namespace default 22434531-6a4e-1103-37f4-0f302b2b2549

ID                  = 22434531-6a4e-1103-37f4-0f302b2b2549
Eval ID             = cdf352fe
Name                = traefik-ingress.traefik-proxy[0]
Node ID             = 9dbbef59
Node Name           = Server-Nomad
Job ID              = traefik-ingress
Job Version         = 24
Client Status       = running
Client Description  = Tasks are running
Desired Status      = stop
Desired Description = alloc is being migrated
Created             = 27d4h ago
Modified            = 6h38m ago
Deployment ID       = 1b134312
Deployment Health   = healthy

Allocation Addresses (mode = "host"):
Label        Dynamic  Address
*web         yes      192.168.0.3:81
*web-secure  yes      192.168.0.3:444
*postgres    yes      192.168.0.3:5432

Task "server" is "running"
Task Resources:
CPU       Memory   Disk     Addresses
2000 MHz  2.0 GiB  300 MiB

Host Volumes:
ID           Read Only
logs         false
letsencrypt  false

Task Events:
Started At     = 2024-02-29T08:22:05Z
Finished At    = N/A
Total Restarts = 6
Last Restart   = 2024-02-29T08:21:45Z

Recent Events:
Time                       Type        Description
2024-03-11T15:19:29+01:00  Killing     Sent interrupt. Waiting 5s before force killing
2024-02-29T09:22:05+01:00  Started     Task started by client
2024-02-29T09:21:45+01:00  Restarting  Task restarting in 18.554530257s
2024-02-29T09:21:45+01:00  Terminated  Exit Code: 137, Exit Message: "Docker container exited with non-zero exit code: 137"
2024-02-27T10:49:00+01:00  Started     Task started by client
2024-02-27T10:48:43+01:00  Restarting  Task restarting in 16.934377428s
2024-02-27T10:48:43+01:00  Terminated  Exit Code: 0
2024-02-23T10:49:27+01:00  Started     Task started by client
2024-02-23T10:49:09+01:00  Restarting  Task restarting in 17.58111394s
2024-02-23T10:49:09+01:00  Terminated  Exit Code: 137, Exit Message: "Docker container exited with non-zero exit code: 137"

Services

nomad service list

Service Name               Tags
prometheus                 [urlprefix-/]
vector-vector              [prometheus.io/path=/metrics,prometheus.io/scrape=true]

Reproduction steps

Run many services on a combined server/client node, then power it off and on without a clean shutdown. Afterwards, run nomad alloc status --namespace default <alloc_id>

Expected Result

The allocation should be removed

Actual Result

The allocation stays forever, preventing the service from actually launching
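To list every allocation in this state, rather than checking them one by one, the allocation list from the HTTP API can be filtered for the same combination seen above: Desired Status "stop" together with Client Status "running". The following is a rough sketch, not a definitive tool. The function names are made up for illustration, and it assumes the agent is reachable at http://192.168.0.3:4646 without an ACL token.

```python
import json
import urllib.request


def fetch_allocations(addr: str, namespace: str = "default") -> list:
    """Fetch the allocation list from the Nomad HTTP API (/v1/allocations).

    Assumes no ACL token is required; addr is e.g. "http://192.168.0.3:4646".
    """
    url = f"{addr}/v1/allocations?namespace={namespace}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def stuck_allocations(allocs: list) -> list:
    """Return allocations the server wants stopped but the client reports as running."""
    return [
        a for a in allocs
        if a.get("DesiredStatus") == "stop" and a.get("ClientStatus") == "running"
    ]
```

For example, `stuck_allocations(fetch_allocations("http://192.168.0.3:4646"))` should include allocation 22434531 on the affected node. An allocation that is in the middle of a normal shutdown matches the same filter for a short time, so only entries that stay in the list are suspicious.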

Job file (if appropriate)

Nomad Server logs (if appropriate)

Mar 11 21:45:40 shinkandia systemd[1]: Started Nomad.
Mar 11 21:45:40 shinkandia nomad[34194]: ==> WARNING: mTLS is not configured - Nomad is not secure without mTLS!
Mar 11 21:45:40 shinkandia nomad[34194]: ==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
Mar 11 21:45:40 shinkandia nomad[34194]: ==> Loaded configuration from /etc/nomad.d/nomad.hcl
Mar 11 21:45:40 shinkandia nomad[34194]: ==> Starting Nomad agent...
Mar 11 21:45:50 shinkandia nomad[34194]: ==> Nomad agent configuration:
Mar 11 21:45:50 shinkandia nomad[34194]:        Advertise Addrs: HTTP: 192.168.0.3:4646; RPC: 192.168.0.3:4647; Serf: 192.168.0.3:4648
Mar 11 21:45:50 shinkandia nomad[34194]:             Bind Addrs: HTTP: [0.0.0.0:4646]; RPC: 0.0.0.0:4647; Serf: 0.0.0.0:4648
Mar 11 21:45:50 shinkandia nomad[34194]:                 Client: true
Mar 11 21:45:50 shinkandia nomad[34194]:              Log Level: INFO
Mar 11 21:45:50 shinkandia nomad[34194]:                Node Id: b7825dfe-721f-55e8-8886-8cab0b218999
Mar 11 21:45:50 shinkandia nomad[34194]:                 Region: global (DC: dc1)
Mar 11 21:45:50 shinkandia nomad[34194]:                 Server: true
Mar 11 21:45:50 shinkandia nomad[34194]:                Version: 1.7.5
Mar 11 21:45:50 shinkandia nomad[34194]: ==> Nomad agent started! Log data will stream in below:
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.125Z [INFO]  nomad: setting up raft bolt store: no_freelist_sync=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.126Z [INFO]  nomad.raft: starting restore from snapshot: id=44-549340-1710086188855 last-index=549340 last-term=44 size-in-bytes=2632513
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.148Z [INFO]  nomad.raft: snapshot restore progress: id=44-549340-1710086188855 last-index=549340 last-term=44 size-in-bytes=2632513 read-bytes=2632513 percent-complete="100.00%"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.148Z [INFO]  nomad.raft: restored from snapshot: id=44-549340-1710086188855 last-index=549340 last-term=44 size-in-bytes=2632513
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [INFO]  nomad.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:b0f85298-cbef-a498-5a7e-9d02221f627e Address:192.168.0.3:4647}]"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [INFO]  nomad.raft: entering follower state: follower="Node at 192.168.0.3:4647 [Follower]" leader-address= leader-id=
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [INFO]  nomad: serf: EventMemberJoin: Shinkandia-Nomad.global 192.168.0.3
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [INFO]  nomad: starting scheduling worker(s): num_workers=16 schedulers=["service", "batch", "system", "sysbatch", "_core"]
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [INFO]  nomad: started scheduling worker(s): num_workers=16 schedulers=["service", "batch", "system", "sysbatch", "_core"]
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [WARN]  nomad: serf: Failed to re-join any previously known node
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [WARN]  agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/opt/nomad/data/plugins
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.186Z [INFO]  nomad: adding server: server="Shinkandia-Nomad.global (Addr: 192.168.0.3:4647) (DC: dc1)"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  agent: detected plugin: name=java type=driver plugin_version=0.1.0
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  agent: detected plugin: name=docker type=driver plugin_version=0.1.0
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  agent: detected plugin: name=exec type=driver plugin_version=0.1.0
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  client: using state directory: state_dir=/opt/nomad/data/client
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  client: using alloc directory: alloc_dir=/opt/nomad/data/alloc
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.187Z [INFO]  client: using dynamic ports: min=20000 max=32000 reserved=""
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.218Z [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=lo
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:40.252Z [WARN]  client.fingerprint_mgr.cni_plugins: failed to read CNI plugins directory: cni_path=/opt/cni/bin error="open /opt/cni/bin: no such file or directory"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.029Z [WARN]  nomad.raft: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.029Z [INFO]  nomad.raft: entering candidate state: node="Node at 192.168.0.3:4647 [Candidate]" term=52
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.041Z [INFO]  nomad.raft: election won: term=52 tally=1
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.041Z [INFO]  nomad.raft: entering leader state: leader="Node at 192.168.0.3:4647 [Leader]"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.041Z [INFO]  nomad: cluster leadership acquired
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.361Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.453Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.453Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.453Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.453Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.454Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.454Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.454Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.454Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.454Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.616Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.690Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.719Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.720Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.720Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.720Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.720Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.759Z [INFO]  nomad: eval broker status modified: paused=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:42.759Z [INFO]  nomad: blocked evals status modified: paused=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.258Z [INFO]  client.proclib.cg2: initializing nomad cgroups: cores=0-15
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.258Z [INFO]  client.plugin: starting plugin manager: plugin-type=csi
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.258Z [INFO]  client.plugin: starting plugin manager: plugin-type=driver
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.259Z [INFO]  client.plugin: starting plugin manager: plugin-type=device
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.282Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=154dcb82-6a96-5a95-2151-8721cabb3107 task=moondata-postgres type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.283Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=154dcb82-6a96-5a95-2151-8721cabb3107 task=pgadmin4 type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.295Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.298Z [INFO]  client: node registration complete
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.300Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=22434531-6a4e-1103-37f4-0f302b2b2549 task=server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.305Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=29adb06b-4109-b8bf-ddae-17344d65452b task=server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.308Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=2a8a60f3-3369-20c5-8164-6cab484bc178 task=loki type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.312Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=2b06f23e-2f10-71f9-184a-d541fd0fcb05 task=server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.315Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=2e543265-5d8f-7cd1-1370-544e936c3a1a task=server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.317Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=32c47a99-999c-8b90-f3a1-0ff282d5365b task=homeassistant_core type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.321Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=3d60140a-f852-000b-4fd8-3844dfaade68 task=vector type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.323Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=47a144a5-655f-35a7-7f18-cbd2baefc291 task=server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.325Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=5c12b0e9-9400-2750-0d0d-de0b0d07b691 task=vector type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.329Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=73cc812c-cd83-8856-cfd8-cef0c65ea4fe task=moondata-postgres type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.330Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=73cc812c-cd83-8856-cfd8-cef0c65ea4fe task=pgadmin4 type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.332Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=8fdb3e77-f0db-85e0-cd46-45c91dda5b6b task=moondata-postgres type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.333Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=8fdb3e77-f0db-85e0-cd46-45c91dda5b6b task=pgadmin4 type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.335Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=a4394179-bc26-8f2a-e3b0-644a2ee164dd task=vector type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.337Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3c1f655-4013-74fe-e504-c4505afe41af task=moondata-postgres type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.338Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=b3c1f655-4013-74fe-e504-c4505afe41af task=pgadmin4 type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.340Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=bf7c2b6e-58d0-57d5-3b4c-1f3d997e70f7 task=moonscope-server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.344Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=c1a89e46-edcd-ba52-3e66-414085f26ab4 task=grafana type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.348Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=d4afb853-4e30-b796-2aeb-fba3177ed181 task=loki type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.351Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=e1f3e9a5-8c47-97ce-93de-6ea537493d94 task=vector type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.353Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=e2716afc-a920-f3f6-fe66-23ab3fc22e88 task=loki type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.356Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=e7eaf6c1-64b5-417d-46b9-18bbb3938735 task=cadvisor type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.359Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=e9251b24-a970-e7da-897a-998d930d9e62 task=server type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.360Z [INFO]  client: started client: node_id=9dbbef59-2ce6-3a8d-2c53-9808f6ee1b90
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.361Z [INFO]  client.gc: marking allocation for GC: alloc_id=a4394179-bc26-8f2a-e3b0-644a2ee164dd
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.361Z [INFO]  client.gc: marking allocation for GC: alloc_id=5c12b0e9-9400-2750-0d0d-de0b0d07b691
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.362Z [INFO]  client.gc: marking allocation for GC: alloc_id=bf7c2b6e-58d0-57d5-3b4c-1f3d997e70f7
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.361Z [INFO]  client.gc: marking allocation for GC: alloc_id=2b06f23e-2f10-71f9-184a-d541fd0fcb05
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.362Z [INFO]  client.gc: marking allocation for GC: alloc_id=8fdb3e77-f0db-85e0-cd46-45c91dda5b6b
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.362Z [INFO]  client.gc: marking allocation for GC: alloc_id=e7eaf6c1-64b5-417d-46b9-18bbb3938735
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.362Z [INFO]  client.gc: marking allocation for GC: alloc_id=29adb06b-4109-b8bf-ddae-17344d65452b
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.362Z [INFO]  client.gc: marking allocation for GC: alloc_id=32c47a99-999c-8b90-f3a1-0ff282d5365b
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.363Z [INFO]  client.gc: marking allocation for GC: alloc_id=2e543265-5d8f-7cd1-1370-544e936c3a1a
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.363Z [INFO]  client.gc: marking allocation for GC: alloc_id=22434531-6a4e-1103-37f4-0f302b2b2549
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.363Z [INFO]  client.gc: marking allocation for GC: alloc_id=e9251b24-a970-e7da-897a-998d930d9e62
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.364Z [INFO]  client.gc: marking allocation for GC: alloc_id=2a8a60f3-3369-20c5-8164-6cab484bc178
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.364Z [INFO]  client.gc: marking allocation for GC: alloc_id=c1a89e46-edcd-ba52-3e66-414085f26ab4
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.364Z [INFO]  client.gc: marking allocation for GC: alloc_id=e2716afc-a920-f3f6-fe66-23ab3fc22e88
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.365Z [INFO]  client.gc: marking allocation for GC: alloc_id=b3c1f655-4013-74fe-e504-c4505afe41af
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.365Z [INFO]  client.gc: marking allocation for GC: alloc_id=47a144a5-655f-35a7-7f18-cbd2baefc291
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.365Z [INFO]  client.gc: marking allocation for GC: alloc_id=d4afb853-4e30-b796-2aeb-fba3177ed181
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.365Z [INFO]  client.gc: marking allocation for GC: alloc_id=e1f3e9a5-8c47-97ce-93de-6ea537493d94
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.366Z [INFO]  client.gc: marking allocation for GC: alloc_id=73cc812c-cd83-8856-cfd8-cef0c65ea4fe
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.366Z [INFO]  client.gc: marking allocation for GC: alloc_id=3d60140a-f852-000b-4fd8-3844dfaade68
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.366Z [INFO]  client.gc: marking allocation for GC: alloc_id=154dcb82-6a96-5a95-2151-8721cabb3107
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.370Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-a4394179-bc26-8f2a-e3b0-644a2ee164dd-group-vector-vector-vector-metrics namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-a4394179-bc26-8f2a-e3b0-644a2ee164dd-group-vector-vector-vector-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-32c47a99-999c-8b90-f3a1-0ff282d5365b-group-home-assistant-home-assistant-core-homeassistant_core namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-2e543265-5d8f-7cd1-1370-544e936c3a1a-group-traefik-proxy-nomad-web-web-secure namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-32c47a99-999c-8b90-f3a1-0ff282d5365b-group-home-assistant-home-assistant-sonos-homeassistant_sonos namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-5c12b0e9-9400-2750-0d0d-de0b0d07b691-group-vector-vector-vector-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-2e543265-5d8f-7cd1-1370-544e936c3a1a-group-traefik-proxy-traefik-web namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-bf7c2b6e-58d0-57d5-3b4c-1f3d997e70f7-group-moonscope-group-moonscope-web-web-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-2b06f23e-2f10-71f9-184a-d541fd0fcb05-group-traefik-proxy-traefik-web namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-bf7c2b6e-58d0-57d5-3b4c-1f3d997e70f7-group-moonscope-group-moonscope-web-api-api namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-2b06f23e-2f10-71f9-184a-d541fd0fcb05-group-traefik-proxy-nomad-web-web-secure namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-29adb06b-4109-b8bf-ddae-17344d65452b-group-traefik-proxy-nomad-web-web-secure namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-5c12b0e9-9400-2750-0d0d-de0b0d07b691-group-vector-vector-vector-metrics namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-22434531-6a4e-1103-37f4-0f302b2b2549-group-traefik-proxy-nomad-web-web-secure namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-29adb06b-4109-b8bf-ddae-17344d65452b-group-traefik-proxy-traefik-web namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-22434531-6a4e-1103-37f4-0f302b2b2549-group-traefik-proxy-traefik-web namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-e9251b24-a970-e7da-897a-998d930d9e62-group-traefik-proxy-nomad-web-web-secure namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-2a8a60f3-3369-20c5-8164-6cab484bc178-group-loki-loki-http-loki-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-e2716afc-a920-f3f6-fe66-23ab3fc22e88-group-loki-loki-http-loki-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-e9251b24-a970-e7da-897a-998d930d9e62-group-traefik-proxy-traefik-web namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-d4afb853-4e30-b796-2aeb-fba3177ed181-group-loki-loki-http-loki-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-47a144a5-655f-35a7-7f18-cbd2baefc291-group-traefik-proxy-nomad-web-web-secure namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-47a144a5-655f-35a7-7f18-cbd2baefc291-group-traefik-proxy-traefik-web namespace=default
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-e1f3e9a5-8c47-97ce-93de-6ea537493d94-group-vector-vector-vector-metrics namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] nomad.fsm: DeleteServiceRegistrationByID failed: error="service registration not found"
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-e1f3e9a5-8c47-97ce-93de-6ea537493d94-group-vector-vector-vector-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-3d60140a-f852-000b-4fd8-3844dfaade68-group-vector-vector-vector-http namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [ERROR] client.rpc: error performing RPC to server which is not safe to automatically retry: error="rpc error: service registration not found" rpc=ServiceRegistration.DeleteByID server=192.168.0.3:4647
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.371Z [INFO]  client.service_registration.nomad: attempted to delete non-existent service registration: service_id=_nomad-task-3d60140a-f852-000b-4fd8-3844dfaade68-group-vector-vector-vector-metrics namespace=msl
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.415Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=92e0c7285fa04f4b35bd0e177c774fb777b0e1c517f5c47922032ea4ba17e002
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.417Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type=Received msg="Task received by client" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.433Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type="Task Setup" msg="Building Task Directory" failed=false
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.469Z [INFO]  agent: (runner) creating new runner (dry: false, once: false)
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.469Z [INFO]  agent: (runner) creating watcher
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.470Z [INFO]  agent: (runner) starting
Mar 11 21:45:50 shinkandia nomad[34194]:     2024-03-11T21:45:50.496Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=80742d513a824b9a44762de32751f5b3d13b0c36abcaed44393bb31f61a1b661
Mar 11 21:45:55 shinkandia nomad[34194]:     2024-03-11T21:45:55.435Z [INFO]  client: node registration complete
Mar 11 21:45:56 shinkandia nomad[34194]:     2024-03-11T21:45:56.526Z [ERROR] client.driver_mgr.docker: failed to start container: driver=docker container_id=92e0c7285fa04f4b35bd0e177c774fb777b0e1c517f5c47922032ea4ba17e002 error="API error (500): error while creating mount source path '/opt/nomad/data/alloc/1a50e773-4dde-02b3-32f6-16645c7c2f21/cadvisor/local': mkdir /opt/nomad: read-only file system"
Mar 11 21:45:56 shinkandia nomad[34194]:     2024-03-11T21:45:56.538Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type="Driver Failure" msg="Failed to start container 92e0c7285fa04f4b35bd0e177c774fb777b0e1c517f5c47922032ea4ba17e002: API error (500): error while creating mount source path '/opt/nomad/data/alloc/1a50e773-4dde-02b3-32f6-16645c7c2f21/cadvisor/local': mkdir /opt/nomad: read-only file system" failed=false
Mar 11 21:45:56 shinkandia nomad[34194]:     2024-03-11T21:45:56.540Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor error="Failed to start container 92e0c7285fa04f4b35bd0e177c774fb777b0e1c517f5c47922032ea4ba17e002: API error (500): error while creating mount source path '/opt/nomad/data/alloc/1a50e773-4dde-02b3-32f6-16645c7c2f21/cadvisor/local': mkdir /opt/nomad: read-only file system"
Mar 11 21:45:56 shinkandia nomad[34194]:     2024-03-11T21:45:56.540Z [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor reason="Restart within policy" delay=20.446831592s
Mar 11 21:45:56 shinkandia nomad[34194]:     2024-03-11T21:45:56.540Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type=Restarting msg="Task restarting in 20.446831592s" failed=false
Mar 11 21:45:57 shinkandia nomad[34194]:     2024-03-11T21:45:57.690Z [ERROR] client.driver_mgr.docker: failed to start container: driver=docker container_id=80742d513a824b9a44762de32751f5b3d13b0c36abcaed44393bb31f61a1b661 error="API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:45:57 shinkandia nomad[34194]:     2024-03-11T21:45:57.705Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type="Driver Failure" msg="Failed to start container 80742d513a824b9a44762de32751f5b3d13b0c36abcaed44393bb31f61a1b661: API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system" failed=false
Mar 11 21:45:57 shinkandia nomad[34194]:     2024-03-11T21:45:57.706Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector error="Failed to start container 80742d513a824b9a44762de32751f5b3d13b0c36abcaed44393bb31f61a1b661: API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:45:57 shinkandia nomad[34194]:     2024-03-11T21:45:57.706Z [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector reason="Restart within policy" delay=16.696831592s
Mar 11 21:45:57 shinkandia nomad[34194]:     2024-03-11T21:45:57.706Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type=Restarting msg="Task restarting in 16.696831592s" failed=false
Mar 11 21:46:14 shinkandia nomad[34194]:     2024-03-11T21:46:14.448Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=58232b4999a7bac46f3094ac24aca175145f95a4959e274c35ac3dc2073b1514
Mar 11 21:46:17 shinkandia nomad[34194]:     2024-03-11T21:46:17.001Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=f50d7120327fa4592fae1114e4ccf1534054842d2e575dc42707866a05e1d60e
Mar 11 21:46:21 shinkandia nomad[34194]:     2024-03-11T21:46:21.401Z [ERROR] client.driver_mgr.docker: failed to start container: driver=docker container_id=58232b4999a7bac46f3094ac24aca175145f95a4959e274c35ac3dc2073b1514 error="API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:46:21 shinkandia nomad[34194]:     2024-03-11T21:46:21.412Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type="Driver Failure" msg="Failed to start container 58232b4999a7bac46f3094ac24aca175145f95a4959e274c35ac3dc2073b1514: API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system" failed=false
Mar 11 21:46:21 shinkandia nomad[34194]:     2024-03-11T21:46:21.413Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector error="Failed to start container 58232b4999a7bac46f3094ac24aca175145f95a4959e274c35ac3dc2073b1514: API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:46:21 shinkandia nomad[34194]:     2024-03-11T21:46:21.413Z [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector reason="Restart within policy" delay=17.263446488s
Mar 11 21:46:21 shinkandia nomad[34194]:     2024-03-11T21:46:21.413Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type=Restarting msg="Task restarting in 17.263446488s" failed=false
Mar 11 21:46:22 shinkandia nomad[34194]:     2024-03-11T21:46:22.731Z [ERROR] client.driver_mgr.docker: failed to start container: driver=docker container_id=f50d7120327fa4592fae1114e4ccf1534054842d2e575dc42707866a05e1d60e error="API error (500): error while creating mount source path '/var/lib/docker': mkdir /var/lib/docker: read-only file system"
Mar 11 21:46:22 shinkandia nomad[34194]:     2024-03-11T21:46:22.744Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type="Driver Failure" msg="Failed to start container f50d7120327fa4592fae1114e4ccf1534054842d2e575dc42707866a05e1d60e: API error (500): error while creating mount source path '/var/lib/docker': mkdir /var/lib/docker: read-only file system" failed=false
Mar 11 21:46:22 shinkandia nomad[34194]:     2024-03-11T21:46:22.745Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor error="Failed to start container f50d7120327fa4592fae1114e4ccf1534054842d2e575dc42707866a05e1d60e: API error (500): error while creating mount source path '/var/lib/docker': mkdir /var/lib/docker: read-only file system"
Mar 11 21:46:22 shinkandia nomad[34194]:     2024-03-11T21:46:22.745Z [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor reason="Restart within policy" delay=24.763446488s
Mar 11 21:46:22 shinkandia nomad[34194]:     2024-03-11T21:46:22.745Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type=Restarting msg="Task restarting in 24.763446488s" failed=false
Mar 11 21:46:38 shinkandia nomad[34194]:     2024-03-11T21:46:38.710Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=385ad6d4ae8ca5cf53e6ac90a2f8977dd37c4c3fecf801475af4690157636dc3
Mar 11 21:46:45 shinkandia nomad[34194]:     2024-03-11T21:46:45.277Z [ERROR] client.driver_mgr.docker: failed to start container: driver=docker container_id=385ad6d4ae8ca5cf53e6ac90a2f8977dd37c4c3fecf801475af4690157636dc3 error="API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:46:45 shinkandia nomad[34194]:     2024-03-11T21:46:45.288Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type="Driver Failure" msg="Failed to start container 385ad6d4ae8ca5cf53e6ac90a2f8977dd37c4c3fecf801475af4690157636dc3: API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system" failed=false
Mar 11 21:46:45 shinkandia nomad[34194]:     2024-03-11T21:46:45.289Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector error="Failed to start container 385ad6d4ae8ca5cf53e6ac90a2f8977dd37c4c3fecf801475af4690157636dc3: API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:46:45 shinkandia nomad[34194]:     2024-03-11T21:46:45.289Z [INFO]  client.alloc_runner.task_runner: not restarting task: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector reason="Exceeded allowed attempts 2 in interval 30m0s and mode is \"fail\""
Mar 11 21:46:45 shinkandia nomad[34194]:     2024-03-11T21:46:45.289Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector type="Not Restarting" msg="Exceeded allowed attempts 2 in interval 30m0s and mode is \"fail\"" failed=true
Mar 11 21:46:47 shinkandia nomad[34194]:     2024-03-11T21:46:47.552Z [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=1e274013cb079222b390cfed7dc2ab60f0f9b58172894e5f6643e108294bfdf1
Mar 11 21:46:49 shinkandia nomad[34194]:     2024-03-11T21:46:49.295Z [WARN]  client.alloc_runner.task_runner.task_hook.logmon.nomad: timed out waiting for read-side of process output pipe to close: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector @module=logmon timestamp=2024-03-11T21:46:49.295Z
Mar 11 21:46:49 shinkandia nomad[34194]:     2024-03-11T21:46:49.295Z [WARN]  client.alloc_runner.task_runner.task_hook.logmon.nomad: timed out waiting for read-side of process output pipe to close: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector @module=logmon timestamp=2024-03-11T21:46:49.295Z
Mar 11 21:46:49 shinkandia nomad[34194]:     2024-03-11T21:46:49.301Z [INFO]  client.alloc_runner.task_runner.task_hook.logmon: plugin process exited: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b task=vector plugin=/usr/bin/nomad id=34706
Mar 11 21:46:49 shinkandia nomad[34194]:     2024-03-11T21:46:49.302Z [INFO]  agent: (runner) stopping
Mar 11 21:46:49 shinkandia nomad[34194]:     2024-03-11T21:46:49.302Z [INFO]  client.gc: marking allocation for GC: alloc_id=ba67fcb1-8da8-073c-ccbb-16570a00e17b
Mar 11 21:46:49 shinkandia nomad[34194]:     2024-03-11T21:46:49.302Z [INFO]  agent: (runner) received finish
Mar 11 21:46:52 shinkandia nomad[34194]:     2024-03-11T21:46:52.882Z [ERROR] client.driver_mgr.docker: failed to start container: driver=docker container_id=1e274013cb079222b390cfed7dc2ab60f0f9b58172894e5f6643e108294bfdf1 error="API error (500): error while creating mount source path '/opt/nomad/data/alloc/1a50e773-4dde-02b3-32f6-16645c7c2f21/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:46:52 shinkandia nomad[34194]:     2024-03-11T21:46:52.893Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type="Driver Failure" msg="Failed to start container 1e274013cb079222b390cfed7dc2ab60f0f9b58172894e5f6643e108294bfdf1: API error (500): error while creating mount source path '/opt/nomad/data/alloc/1a50e773-4dde-02b3-32f6-16645c7c2f21/alloc': mkdir /opt/nomad: read-only file system" failed=false
Mar 11 21:46:52 shinkandia nomad[34194]:     2024-03-11T21:46:52.894Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor error="Failed to start container 1e274013cb079222b390cfed7dc2ab60f0f9b58172894e5f6643e108294bfdf1: API error (500): error while creating mount source path '/opt/nomad/data/alloc/1a50e773-4dde-02b3-32f6-16645c7c2f21/alloc': mkdir /opt/nomad: read-only file system"
Mar 11 21:46:52 shinkandia nomad[34194]:     2024-03-11T21:46:52.894Z [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor reason="Restart within policy" delay=24.7775437s
Mar 11 21:46:52 shinkandia nomad[34194]:     2024-03-11T21:46:52.894Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1a50e773-4dde-02b3-32f6-16645c7c2f21 task=cadvisor type=Restarting msg="Task restarting in 24.7775437s" failed=false

Nomad Client logs (if appropriate)

lgfa29 commented 6 months ago

Hi @crystalin 👋

Thanks for the report!

Was the client draining when the shutdown happened? The status description Desired Description = alloc is being migrated may indicate so, but I wanted to double-check that with you.

crystalin commented 6 months ago

This has happened multiple times. Sometimes I had drained the node first (and waited for the drain to finish before powering down), but most of the time there was no drain, just a power cut.

lgfa29 commented 6 months ago

Ah ok, thanks for the extra info.

And if you do a docker ps do you see the containers listed?
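To make that comparison systematic rather than eyeballing it, here is a minimal sketch that cross-checks allocations Nomad reports as running against the container names Docker lists. It makes several assumptions: the allocation dicts use the ID, ClientStatus and TaskStates fields as returned by the Nomad HTTP API, the Docker driver names containers `<task>-<alloc_id>`, and every task in the allocation uses the Docker driver. The function name `find_orphans` and the sample IDs are hypothetical, not taken from this issue.

```python
from typing import Any, Iterable, List, Mapping


def find_orphans(
    allocs: Iterable[Mapping[str, Any]],
    container_names: Iterable[str],
) -> List[str]:
    """Return IDs of allocations marked running that have no matching container.

    Assumption: Docker containers are named "<task>-<alloc_id>".
    """
    # The Docker API prefixes names with "/", `docker ps` does not; accept both.
    names = {name.lstrip("/") for name in container_names}
    orphans = []
    for alloc in allocs:
        if alloc.get("ClientStatus") != "running":
            continue
        # TaskStates can be null for allocations that never started a task.
        tasks = alloc.get("TaskStates") or {}
        expected = {f"{task}-{alloc['ID']}" for task in tasks}
        # No overlap means no container backs this "running" allocation.
        if expected.isdisjoint(names):
            orphans.append(alloc["ID"])
    return orphans


if __name__ == "__main__":
    # Hypothetical data: two allocations claimed as running, one container alive.
    sample_allocs = [
        {"ID": "aaaa", "ClientStatus": "running", "TaskStates": {"web": {}}},
        {"ID": "bbbb", "ClientStatus": "running", "TaskStates": {"proxy": {}}},
        {"ID": "cccc", "ClientStatus": "failed", "TaskStates": {"vector": {}}},
    ]
    print(find_orphans(sample_allocs, ["web-aaaa"]))
```

The inputs could come from the node's allocation list and from `docker ps --format '{{.Names}}'`. An allocation flagged here is only a candidate: tasks on other drivers, or a second Docker daemon on the host, would produce false positives.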

crystalin commented 6 months ago

No, the process was not running. The way I see it is:

lgfa29 commented 6 months ago

Thank you for the confirmation.

Under normal circumstances a Nomad client will attempt to reattach to any running process it spawned earlier and resume managing its lifecycle.

But I noticed that your Nomad agent is running as both client and server, which may have caused a situation where the recovery process did not complete successfully.

The errors that read API error (500): error while creating mount source path '/opt/nomad/data/alloc/ba67fcb1-8da8-073c-ccbb-16570a00e17b/alloc': mkdir /opt/nomad: read-only file system are also a little weird.

Is /opt/nomad mounting an external volume, or misconfigured somehow?

crystalin commented 6 months ago

This error happened because I had two Docker versions installed, and at reboot both of them started. That might have triggered the issue this time by preventing Nomad from re-attaching to the allocation.

lgfa29 commented 6 months ago

Oh that's interesting. I can see how connecting to a different Docker daemon could cause problems on reboot.

Would you be able to uninstall one of the versions and check if another reboot causes the problem again?

crystalin commented 6 months ago

I'll try that. Do you know a good way to clean up the orphan allocations? Right now I have to manually open the db and delete the allocation bucket.

crystalin commented 5 months ago

The issue is still happening. This time there was no drain; I just restarted the server with shutdown -P. It also happens on the machine that is not running the Nomad server: I restarted both machines, and I now see a duplicate allocation for a service on the second client.

What is the best way to power the machine down and back up without ending up with these orphan allocations?