hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/
Other
14.56k stars 1.92k forks

allow node drain to block on batch allocs #1523

Open dvusboy opened 7 years ago

dvusboy commented 7 years ago

It would be useful to support letting batch tasks finish when node-drain is enabled on a node. This can be handled with a new flag to nomad node-drain. When present it would:

olenm commented 7 years ago

Do you mean to have 3 flags for node-drain which would allow the manual triggering of the bullet-points you listed?

dvusboy commented 7 years ago

Not really. I believe node-drain already performs all 3 actions (I assume it does not "relocate" system tasks) and also relocates batch tasks. I'd like the flag to make node-drain perform only these 3 actions, suppressing the relocation of batch tasks.

stongo commented 7 years ago

Is this being worked on? We are very motivated to add this feature but don't want to duplicate any efforts if this is already a work in progress. If not already in progress, we would most likely be willing to dedicate some engineering time to add this feature.

dadgar commented 7 years ago

@stongo No, it is not currently being tackled. I would add a string DrainType that could be used to control the behavior of the drain. The next place to tackle would then be generic_sched.go.
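
To make the DrainType idea concrete, here is a rough sketch of how a string-typed drain mode could gate which allocations get migrated. This is not Nomad's actual code: the constant names (`DrainAll`, `DrainSkipBatch`) and the `ShouldMigrate` helper are hypothetical and invented for illustration. Only the job type names ("service", "batch", "system") are Nomad's own. Leaving system allocations in place follows the assumption stated earlier in this thread.

```go
package main

import "fmt"

// DrainType is a hypothetical string knob that controls drain behavior.
type DrainType string

const (
	// DrainAll migrates every non-system allocation off the node.
	DrainAll DrainType = "all"
	// DrainSkipBatch leaves batch allocations in place so they can finish.
	DrainSkipBatch DrainType = "skip-batch"
)

// Job type names as used by Nomad job specifications.
const (
	JobTypeService = "service"
	JobTypeBatch   = "batch"
	JobTypeSystem  = "system"
)

// ShouldMigrate reports whether an allocation belonging to a job of the given
// type should be moved off a draining node under the given drain type.
func ShouldMigrate(jobType string, drain DrainType) bool {
	switch jobType {
	case JobTypeSystem:
		// Assumption from this thread: system tasks are never relocated.
		return false
	case JobTypeBatch:
		// Batch allocations stay put only when the drain asks to skip them.
		return drain != DrainSkipBatch
	default:
		// Service (and any other) allocations are always migrated.
		return true
	}
}

func main() {
	for _, jt := range []string{JobTypeService, JobTypeBatch, JobTypeSystem} {
		fmt.Printf("%-7s all=%v skip-batch=%v\n",
			jt, ShouldMigrate(jt, DrainAll), ShouldMigrate(jt, DrainSkipBatch))
	}
}
```

A scheduler change along these lines would consult something like `ShouldMigrate` at the point where it decides which allocations on a draining node to stop and replace. Where exactly that decision lives in generic_sched.go is not shown here.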

endocrimes commented 6 years ago

Hey @dadgar,

I have a WIP patch that implements this feature for our deployment needs, but I'd like some opinions on how best to implement the UI for this in a way that works for the project itself. How would you prefer to expose this to users: only expose draining batch jobs, or also allow waiting for services?

dadgar commented 6 years ago

@DanToml Hey! So this is pretty good timing as @schmichael will be working on our next generation node draining strategies as part of Nomad 0.8! I am sure he would love to see your implementation so maybe linking the branch here would be a good start.

endocrimes commented 6 years ago

@dadgar Hey, the configurable version of our original hack patch is here: https://github.com/circleci/nomad/commit/ddb347a2d37751ee38dff2b2342a8a863039a856. I'm happy to either rework this patch into something mergeable, and/or contribute towards the 0.8 draining strategy work.

endocrimes commented 6 years ago

Relatedly, I'm not sure which drain options make sense beyond batch/(none|all). I haven't used the services component of Nomad, so I'm not sure of their mechanics.