hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

node drains can result in 0 allocs running during migration #11879

Status: Open · roman-vynar opened this issue 2 years ago

roman-vynar commented 2 years ago

Hello,

Let's say we have a service with count=1. When I set its node to drain, Nomad stops the allocs on that node immediately, so my single service alloc is gone. Only afterwards does Nomad start re-allocating it (the migrate phase).

I would expect it to bring up a new alloc on another node and only then kill the old one, rather than leaving us with count=0. In fact, it does not matter whether count is 1 or not; the point is that the effective count drops below the desired count during "migration".
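
For reference, a minimal sketch of the kind of job spec in question (job and task names are illustrative; the migrate block shows Nomad's documented defaults):

```hcl
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "app" {
    count = 1 # a single instance: during a drain this briefly drops to 0

    # Controls how allocs are migrated off a draining node. None of these
    # options currently guarantee that the replacement alloc starts before
    # the old one is stopped.
    migrate {
      max_parallel     = 1
      health_check     = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }

    task "server" {
      driver = "docker"
      config {
        image = "nginx:alpine"
      }
    }
  }
}
```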

Related https://github.com/hashicorp/nomad/issues/8538

As I understand it, this is essentially about oversubscription: temporarily running more allocs than the desired count. I think it would be nice to have an option to guarantee the desired count of allocs during the "update" and/or "migrate" (drain) phases.

Lastly, can you suggest how this can be handled without scaling the count by +1 before a service update or node drain and back by -1 once we are done, just so the guaranteed count of allocs is satisfied? That's really important to us.
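
For concreteness, the workaround being asked about would look roughly like this (job name and node ID are placeholders):

```sh
# Temporarily raise the count so the service never drops below one alloc.
nomad job scale web 2

# Drain the node; the extra alloc keeps the service available meanwhile.
nomad node drain -enable -yes <node-id>

# Restore the original count once the drain completes.
nomad job scale web 1
```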

Thanks!

lgfa29 commented 2 years ago

Hi @roman-vynar, thanks for the suggestion. I think it would be a nice enhancement to have.

But is there anything specific from this issue that is not in #8538?

rhuddleston commented 2 years ago

#8538 doesn't cover single instances of a service going down to zero, which is what happens in this drain example.

roman-vynar commented 2 years ago

> But is there anything specific from this issue that is not in #8538?

@lgfa29 yes, it's a bit different: this one is about node drain, not about a service/task update.

Any suggestions are welcome. Thanks!

lgfa29 commented 2 years ago

Got it, thanks for the extra details @rhuddleston and @roman-vynar!

I think when a node is set to drain, Nomad should place the replacement allocs first and only then kill the existing ones.

I can see some scenarios where you may not want this to happen. For example, for services that use a single-writer volume, you can't start the new alloc before stopping the old one.
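
To illustrate, a group that claims a CSI volume in single-writer mode can never have two allocs holding the volume at once (a sketch; volume and image names are illustrative):

```hcl
group "db" {
  count = 1

  volume "data" {
    type            = "csi"
    source          = "db-data"
    attachment_mode = "file-system"
    # Only one alloc may mount this volume for writing at a time, so the
    # old alloc must release it before the replacement can start.
    access_mode     = "single-node-writer"
  }

  task "postgres" {
    driver = "docker"

    volume_mount {
      volume      = "data"
      destination = "/var/lib/postgresql/data"
    }

    config {
      image = "postgres:14"
    }
  }
}
```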

So I think this may need to be an opt-in behaviour, either at drain time or per group? 🤔
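
As a strawman for the per-group variant, it could be a flag on the existing migrate block. The option below is hypothetical and does not exist in Nomad today:

```hcl
migrate {
  max_parallel = 1

  # Hypothetical opt-in flag: place and health-check the replacement alloc
  # before stopping the one on the draining node.
  start_before_stop = true
}
```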

I've set this for our community triage process so we can discuss it a bit more.

Thanks for the idea!

tgross commented 2 years ago

Dropping a note here for us to consider the impact on ephemeral_disk.migrate for any changes we make here.
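
For context, ephemeral_disk.migrate is the group-level setting that asks Nomad to copy an alloc's ephemeral data to its replacement, which inherently ties the new alloc's startup to the old alloc's data (a sketch):

```hcl
group "app" {
  ephemeral_disk {
    # Copy data from the alloc's local/ and alloc/data directories to the
    # replacement alloc. This transfer assumes the old alloc's data is still
    # available, which interacts with any start-before-stop ordering.
    migrate = true
    size    = 300 # MB
    sticky  = true
  }
}
```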