hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Changing affinity or spreads prevents in-place upgrade #6988

Closed: michaeldwan closed this issue 4 years ago

michaeldwan commented 4 years ago

Nomad version

0.10.2

Operating system and Environment details

ubuntu + custom firecracker task driver

Issue

Nomad 0.10.2 stopped doing in-place updates when only spreads, affinities, or constraints change. The PR (#6703) and issue (#6334) behind the change make sense, but for our use case it's a regression.

We’re using spread + affinity + counts to place allocs in regions near traffic, adjusting them as traffic changes. For example, a task group with a count of 100 is spread 50/50 between us-east and us-west. If we change the spread to 49/51, the old behavior would update all 100 allocs in-place and prefer us-west for the next placement; now all 100 allocs are stopped only to start 98 new ones in likely the same places as before.
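For illustration, our setup looks roughly like this (a simplified sketch; the job, group, and task names, the docker driver, and the image are placeholders, since we actually run a custom firecracker task driver):

```hcl
job "web" {
  datacenters = ["us-east", "us-west"]

  group "app" {
    count = 100

    # Adjusting these percentages (e.g. 50/50 -> 49/51) is the only
    # change we make as traffic shifts.
    spread {
      attribute = "${node.datacenter}"
      target "us-east" {
        percent = 50
      }
      target "us-west" {
        percent = 50
      }
    }

    task "server" {
      # Placeholder driver; we use a custom firecracker task driver.
      driver = "docker"
      config {
        image = "example/app:latest"
      }
    }
  }
}
```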

I’m fairly sure we’re an edge case, but the new behavior doesn’t seem ideal. Would you be open to making this behavior configurable? Are custom scheduler plugins possible?

scalp42 commented 4 years ago

@michaeldwan not an edge case, we have the same usage :)

michaeldwan commented 4 years ago

Do you have any thoughts on this issue? We're considering a downgrade to 0.10.1 or running a fork, but if you're open to a fix I'm happy to help. Thanks!

drewbailey commented 4 years ago

Hey @michaeldwan, sorry for the delay.

The change that makes updates to spreads/affinities and constraints no longer in-place was intentional. Currently, when a spread/affinity changes, we recompute scores, and we don't correlate those scores with existing allocs, so we can't simply compute the diff and avoid rescheduling running allocs. In 0.10.4 we are removing the penalty for an allocation's previous node, which will bias placement toward keeping new allocs on the same node, but each will unfortunately still be a new alloc.

I think a future improvement is to treat changes to spreads/affinities similarly to count changes, so that allocs which already satisfy the new criteria stay as in-place updates and only the few remaining allocs are rebalanced to meet the job's new goal.

Our team plans to discuss alternative placement algorithm design in the near future and we'll keep this ticket updated.

mrkurt commented 4 years ago

@drewbailey This seems like a pretty big philosophical regression. One of the things that makes Nomad easy to work with is its bias toward not disrupting jobs.

drewbailey commented 4 years ago

Hi All,

After discussing with the team, we have decided not to revert this behavior for 0.10.4. We understand that, in its current form, having changes to spreads/affinities reschedule all allocations is not ideal. Prior to #6703, changes to spreads/affinities completely ignored running allocations, which was incorrect behavior. Ideally, Nomad's scheduling would take into account that a certain number of allocations already satisfy a change to spread/affinity and make only the minimal changes to running allocations necessary to be correct and complete. This is something the team is currently investigating and researching for a future release.

We understand that this leaves a very valid use case in the unfortunate position of having all allocations rescheduled. In the meantime, would it be possible to spread the east/west groups manually by using two separate jobs? This is far from ideal, but it should let you tune the count of each job without rescheduling all allocations when making small adjustments.
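As a rough sketch of what that split could look like (the job, group, and task names, the datacenters, the docker driver, and the image are placeholders), with each job in its own file:

```hcl
# app-us-east.nomad
job "app-us-east" {
  datacenters = ["us-east"]

  group "app" {
    # Tune this count as traffic shifts; count-only changes don't
    # reschedule the allocations that are already running.
    count = 49

    task "server" {
      driver = "docker" # placeholder driver
      config {
        image = "example/app:latest"
      }
    }
  }
}

# app-us-west.nomad
job "app-us-west" {
  datacenters = ["us-west"]

  group "app" {
    count = 51

    task "server" {
      driver = "docker" # placeholder driver
      config {
        image = "example/app:latest"
      }
    }
  }
}
```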

We would love to hear from the community about what changes to affinities and spreads you would like to see in a future release as we continue to think of an ideal solution for Nomad. Please comment below with your ideas and use cases.

stale[bot] commented 4 years ago

Hey there

Since this issue hasn't had any activity in a while - we're going to automatically close it in 30 days. If you're still seeing this issue with the latest version of Nomad, please respond here and we'll keep this open and take another look at this.

Thanks!

stale[bot] commented 4 years ago

This issue will be auto-closed because there hasn't been any activity for a few months. Feel free to open a new one if you still experience this problem :+1:

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.