Open diptanu opened 8 years ago
Perhaps it could also be valuable to do a node-drain on a list of node ids.
nomad node-drain -enable -yes abcdefg1 abcdefg2 abcdefg3
If you run each one individually, workloads from abcdefg1 may be pushed to abcdefg2, which makes abcdefg2 take longer to clear out its workload.
I just want to be able to drain everything.
nomad node-drain -enable -yes -all
or nomad node-drain -enable -yes *
Operators could drain clusters easily if Nomad allowed selecting nodes by metadata values when draining. For example, operators could drain all nodes in an AWS ASG, or all nodes running a particular version of the Nomad client.
The implementation can be broken down into 2 phases which can be implemented independently:
/v1/node/:nodeid/drain that accepts multiple Node IDs and updates those nodes atomically in Raft.
While many users have made their own solution to #2, without #1 being solved it risks races between marking nodes as draining and rescheduling work, causing lots of churn.
Imagine the hypothetical command:
Without a batching API, that could produce the following timeline:
As you can see, we may end up rescheduling allocations multiple times. While the cluster always stabilizes, there will be an unnecessary amount of work, with lots of allocations created and replaced in a short amount of time.
With a batching API, all of the nodes would be atomically updated and allocations would be rescheduled exactly once.