hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

node drains can result in 0 allocs running during migration #11879

Status: Open · roman-vynar opened this issue 2 years ago

roman-vynar commented 2 years ago

Hello,

Let's say we have a service with count=1. When I set its node to drain, Nomad stops the allocs on that node immediately, so my single service alloc is gone. Only afterwards does Nomad start re-allocating it (the migrate phase).

I would expect it to bring up a new alloc on another node and only then kill the old one, rather than leaving us with count=0. In fact, it does not matter whether count is 1 or not; the point is that the effective count drops below the desired count during "migration".
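
For reference, a minimal sketch of the kind of job spec in question (job and task names are illustrative; the migrate block shows Nomad's documented defaults):

```hcl
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "app" {
    count = 1 # a single instance: during a drain this briefly drops to 0

    # Controls how allocs are migrated off a draining node. None of these
    # options currently guarantee that the replacement alloc starts before
    # the old one is stopped.
    migrate {
      max_parallel     = 1
      health_check     = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }

    task "server" {
      driver = "docker"
      config {
        image = "nginx:alpine"
      }
    }
  }
}
```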

Related https://github.com/hashicorp/nomad/issues/8538

As I understand it, this is essentially about oversubscription: temporarily running more allocs than the desired count. I think it would be nice to have an option to guarantee the desired count of allocs during the "update" and/or "migrate" (drain) phases.

Lastly, can you suggest how this can be handled without scaling the count by +1 before a service update or node drain and back by -1 once we are done, just so the guaranteed count of allocs is satisfied? That's really important to us.
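
For concreteness, the workaround being asked about would look roughly like this (job name and node ID are placeholders):

```sh
# Temporarily raise the count so the service never drops below one alloc.
nomad job scale web 2

# Drain the node; the extra alloc keeps the service available meanwhile.
nomad node drain -enable -yes <node-id>

# Restore the original count once the drain completes.
nomad job scale web 1
```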

Thanks!

lgfa29 commented 2 years ago

Hi @roman-vynar, thanks for the suggestion. I think it would be a nice enhancement to have.

But is there anything specific from this issue that is not in #8538?

rhuddleston commented 2 years ago

#8538 doesn't cover single instances of a service going down to zero, which is what happens in this drain example.

roman-vynar commented 2 years ago

> But is there anything specific from this issue that is not in #8538?

@lgfa29 yes, it's a bit different: this one is about node drain, not about a service/task update.

Any suggestions are welcome. Thanks!

lgfa29 commented 2 years ago

Got it, thanks for the extra details @rhuddleston and @roman-vynar!

I think when a node is set to drain, Nomad should place the replacement allocs first and only then kill the existing ones.

I can see some scenarios where you may not want this to happen. For example, for services that use a single-writer volume, you can't start the new alloc before stopping the old one.
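
To illustrate, a group that claims a CSI volume in single-writer mode can never have two allocs holding the volume at once (a sketch; volume and image names are illustrative):

```hcl
group "db" {
  count = 1

  volume "data" {
    type            = "csi"
    source          = "db-data"
    attachment_mode = "file-system"
    # Only one alloc may mount this volume for writing at a time, so the
    # old alloc must release it before the replacement can start.
    access_mode     = "single-node-writer"
  }

  task "postgres" {
    driver = "docker"

    volume_mount {
      volume      = "data"
      destination = "/var/lib/postgresql/data"
    }

    config {
      image = "postgres:14"
    }
  }
}
```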

So I think this may need to be an opt-in behaviour, either at drain time or per group? 🤔
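
As a strawman for the per-group variant, it could be a flag on the existing migrate block. The option below is hypothetical and does not exist in Nomad today:

```hcl
migrate {
  max_parallel = 1

  # Hypothetical opt-in flag: place and health-check the replacement alloc
  # before stopping the one on the draining node.
  start_before_stop = true
}
```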

I've set this for our community triage process so we can discuss it a bit more.

Thanks for the idea!

tgross commented 2 years ago

Dropping a note here for us to consider the impact on ephemeral_disk.migrate for any changes we make here.
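
For context, ephemeral_disk.migrate is the group-level setting that asks Nomad to copy an alloc's ephemeral data to its replacement, which inherently ties the new alloc's startup to the old alloc's data (a sketch):

```hcl
group "app" {
  ephemeral_disk {
    # Copy data from the alloc's local/ and alloc/data directories to the
    # replacement alloc. This transfer assumes the old alloc's data is still
    # available, which interacts with any start-before-stop ordering.
    migrate = true
    size    = 300 # MB
    sticky  = true
  }
}
```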