hashicorp / nomad-driver-podman

A nomad task driver plugin for sandboxing workloads in podman containers
https://developer.hashicorp.com/nomad/plugins/drivers/podman
Mozilla Public License 2.0

deps: update nomad dependency to 1.7.3 #309

Closed (shoenig closed 5 months ago)

shoenig commented 6 months ago

Requires a bit of refactoring due to the use of non-api packages that got refactored in Nomad itself.

Closes #308 Closes #287

zyclonite commented 6 months ago

This should also fix an incompatibility for ARM64/AArch64 deployments: since 1.7.x, deployments on ARM fail because of a CPU architecture mismatch.

shoenig commented 6 months ago

> Can a task created in a previous version still be recovered with these changes?

Oh good catch; it works as long as the Nomad agent is 1.7+. Otherwise, retrieving stats about the old alloc produces an error. TBH I'm not sure what our upgrade guidance is in a case like this ... probably just tag the driver as v0.6.0 and note that it requires Nomad 1.7?

➜ nomad alloc status 04
ID                  = 042098f0-2385-5c0d-a18c-01c3bd22e555
Eval ID             = db430a05
Name                = redis.cache[0]
Node ID             = abc08a9e
Node Name           = diablo
Job ID              = redis
Job Version         = 0
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 2m10s ago
Modified            = 1m59s ago
Deployment ID       = 2a367b27
Deployment Health   = healthy

Allocation Addresses:
Label  Dynamic  Address
*db    yes      192.168.88.189:31968 -> 6379

Couldn't retrieve stats: invalid character 'N' looking for beginning of value

Task "redis" is "running"
Task Resources:
CPU      Memory   Disk     Addresses
500 MHz  256 MiB  300 MiB

Task Events:
Started At     = 2023-12-15T13:40:15Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2023-12-15T13:40:15Z  Started     Task started by client
2023-12-15T13:40:15Z  Task Setup  Building Task Directory
2023-12-15T13:40:15Z  Received    Task received by client

shoenig commented 6 months ago

@zyclonite can you post the exact error you're seeing (but in a separate GH issue)? Even if it gets fixed with this update, I'd like to better understand what happened.

zyclonite commented 6 months ago

@shoenig I've now had time to investigate a bit more and created https://github.com/hashicorp/nomad-driver-podman/issues/310 (there is a chance that this is a general Nomad issue).

lgfa29 commented 6 months ago

> Oh good catch; it works as long as the Nomad agent is 1.7+. Otherwise, retrieving stats about the old alloc produces an error. TBH I'm not sure what our upgrade guidance is in a case like this ... probably just tag the driver as v0.6.0 and note that it requires Nomad 1.7?

I tried thinking of something, but I can't find another way around it. I'm not even sure where the error is coming from 🤔

The task CpuStats is just a bunch of float64s, right? https://github.com/hashicorp/nomad/blob/f18d5c7c326de9904a1e1d9fecdc3c3a84cb965c/client/structs/structs.go#L263-L274
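
For reference, the message "invalid character 'N' looking for beginning of value" has the shape of a Go encoding/json syntax error. The sketch below shows one way that exact text can be produced: decoding a payload in which a float64 field carries a bare NaN token. This is only an illustration, not a diagnosis. The `cpuStats` struct, its `Percent` field, and the `decodeStats` helper are hypothetical stand-ins, and nothing here establishes that NaN is what the agent actually received.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cpuStats is a hypothetical stand-in for a stats struct made of float64
// fields; it is not the real Nomad type.
type cpuStats struct {
	Percent float64
}

// decodeStats decodes a JSON payload into cpuStats with the standard library.
func decodeStats(payload string) (cpuStats, error) {
	var s cpuStats
	err := json.Unmarshal([]byte(payload), &s)
	return s, err
}

func main() {
	// NaN is not a valid JSON value, so the decoder rejects it at the 'N'.
	_, err := decodeStats(`{"Percent": NaN}`)
	fmt.Println(err)
	// prints: invalid character 'N' looking for beginning of value
}
```

Any non-JSON payload that starts a value with an uppercase N would yield the same message, so the error text alone does not pin down the source.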

Maybe worth raising with the rest of the team to get some brainstorming going? 😄

shoenig commented 5 months ago

Looked into the alloc status -stats failure: it only happens when running Nomad 1.6.x with a Podman driver built with Nomad 1.7 as a dependency. Basically we just need to note that Nomad and the driver should be upgraded together. We'll make this v0.6.0 just to emphasize that fact.

Will also bump to Nomad 1.7.3