Open jamesooo opened 6 months ago
Hi @jamesooo! That `job plan` doesn't return the expected error code but `job run` works as expected points to a problem in how we're reporting the diff, rather than a scheduler bug (fortunately!). Going back through the unsupported versions changelog, I find a potential culprit in https://github.com/hashicorp/nomad/pull/14492 where the exit code was changed, but I would not expect the diff type to be `None` in the case you've described either.
I'm going to edit the title on this slightly and mark it for further investigation.
Nomad version
Operating system and Environment details
Ubuntu Focal
Issue
It seems that the behavior of `nomad plan` for system jobs has changed between 1.3 and 1.5, so that allocations for system jobs which are stopped individually do not register that a change in allocations is required.
Reproduction steps
Start a Nomad job
Stop a single allocation of that job
Confirm allocation is stopped
Note: because the job is `type = "system"`, no new allocation is started to take the place of the stopped one.
Now run plan against the HCL file and observe the exit status.
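As a rough sketch, the steps above could be scripted as follows. The function names `describe_plan_exit` and `plan_after_stop` are hypothetical helpers, not part of Nomad, and the job file path and allocation ID are placeholders you would supply yourself. It assumes the system job has already been started with `nomad job run` and that the allocation ID was taken from `nomad job status`. The exit-code meanings follow the documented behavior of `nomad job plan`.

```shell
# Map the exit status of `nomad job plan` to its documented meaning.
describe_plan_exit() {
  case "$1" in
    0)   echo "no allocations created or destroyed" ;;
    1)   echo "allocations created or destroyed" ;;
    255) echo "error determining plan results" ;;
    *)   echo "unexpected exit code: $1" ;;
  esac
}

# Stop one allocation of an already-running system job, then plan the
# same job file and report the exit status.
#   $1: path to the job file (placeholder)
#   $2: ID of the allocation to stop (placeholder)
plan_after_stop() {
  local job_file="$1"
  local alloc_id="$2"

  nomad alloc stop "$alloc_id"      # stop a single allocation
  nomad alloc status "$alloc_id"    # confirm the allocation is stopped

  nomad job plan "$job_file"        # plan against the same job file
  local status=$?
  echo "plan exit status: $status ($(describe_plan_exit "$status"))"
}
```

For example, `plan_after_stop example.nomad.hcl <alloc-id>` would print the plan output followed by the exit status and its meaning.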
Previous Nomad versions have returned 1 here, indicating that running the job would place a new allocation. However, `run` will still place the new allocation.
Expected Result
Plan should return the exit code `1` to indicate that `run` will create an allocation.
Actual Result
Plan instead outputs exit code `0`, indicating that no changes are required to meet the jobspec.
Job file (if appropriate)