Open linluxiang opened 2 years ago
Thanks for the idea @linluxiang.
I think it's a very interesting concept, but given how Nomad internals work today, it may be tricky to implement.
Yes please. One example of where bin-packing becomes a problem is IP rate-limiting on API requests, or limited network throughput on clients. Workarounds such as using the spread stanza to target multiple datacenters, or favoring a specific client, can still be problematic with numerous jobs: either they take manual effort (unless you build a job generator), or the jobs still suffer from bin-packing to some degree.
The Nomad scheduler uses a bin-packing algorithm when making job placements on nodes to optimize resource utilization and density of applications. Although bin packing ensures optimal resource utilization, it can lead to some nodes carrying a majority of allocations for a given job. This can cause cascading failures where the failure of a single node or a single data center can lead to application unavailability.
Source: https://learn.hashicorp.com/tutorials/nomad/spread
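To make the concentration effect described in that quote concrete, here is a toy sketch in Python. It is not Nomad's actual scoring code: the `place` function, the node names, and the CPU numbers are all made up for illustration, and the only resource considered is CPU.

```python
def place(nodes, demand, count, strategy):
    """Place `count` allocations that each need `demand` CPU.

    `nodes` maps a node name to its free CPU. Returns a dict mapping
    node name to the number of allocations placed on it.
    Toy model only; real schedulers score many more dimensions.
    """
    free = dict(nodes)  # copy so the caller's dict is not modified
    placed = {}
    for _ in range(count):
        candidates = [n for n in free if free[n] >= demand]
        if not candidates:
            break  # nothing fits any more
        if strategy == "binpack":
            # Prefer the fullest node that still fits (least free CPU).
            chosen = min(candidates, key=lambda n: (free[n], n))
        else:
            # "spread": prefer the emptiest node (most free CPU).
            chosen = min(candidates, key=lambda n: (-free[n], n))
        free[chosen] -= demand
        placed[chosen] = placed.get(chosen, 0) + 1
    return placed


nodes = {"node-a": 4000, "node-b": 4000, "node-c": 4000}
print(place(nodes, 500, 6, "binpack"))  # {'node-a': 6}
print(place(nodes, 500, 6, "spread"))   # {'node-a': 2, 'node-b': 2, 'node-c': 2}
```

With bin packing, all six allocations end up on one node, so losing that node takes the whole job down. With the spread-style picker, the same job survives a single node failure with four of its six allocations.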
Is that the official reason Nomad uses bin-packing? It makes sense, but with fixed costs and no autoscaling for clients regardless of resource utilization, is it still optimal? I'm probably missing some nuance here.
Hi there,
Proposal
I'd like to have a feature for using different dispatching policies for different sets of servers. For example, for nodeclass 1 the allocations are dispatched evenly across all the machines, and for nodeclass 2 the allocations are dispatched using the bin-packing algorithm.
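Roughly, the behaviour being asked for could be sketched like this in Python. This is only an illustration of the idea, not an existing Nomad setting or API: `POLICY_BY_CLASS`, `pick_node`, and the node fields (`name`, `node_class`, `allocs`, `capacity`) are hypothetical names.

```python
# Hypothetical per-node-class policy table (not a real Nomad option).
POLICY_BY_CLASS = {"class1": "spread", "class2": "binpack"}


def pick_node(nodes, node_class, policy_by_class=POLICY_BY_CLASS):
    """Pick a node of `node_class` for one new allocation.

    `nodes` is a list of dicts with keys: name, node_class, allocs, capacity.
    Returns the chosen node's name, or None if no node has room.
    """
    candidates = [
        n for n in nodes
        if n["node_class"] == node_class and n["allocs"] < n["capacity"]
    ]
    if not candidates:
        return None
    # Assumption for this sketch: fall back to bin packing for unknown classes.
    policy = policy_by_class.get(node_class, "binpack")
    if policy == "spread":
        # Evenly dispatch: choose the node with the fewest allocations.
        chosen = min(candidates, key=lambda n: (n["allocs"], n["name"]))
    else:
        # Bin-pack: choose the busiest node that still has room.
        chosen = min(candidates, key=lambda n: (-n["allocs"], n["name"]))
    return chosen["name"]


nodes = [
    {"name": "a", "node_class": "class1", "allocs": 2, "capacity": 4},
    {"name": "b", "node_class": "class1", "allocs": 0, "capacity": 4},
    {"name": "c", "node_class": "class2", "allocs": 2, "capacity": 4},
    {"name": "d", "node_class": "class2", "allocs": 0, "capacity": 4},
]
print(pick_node(nodes, "class1"))  # b (emptiest class1 node)
print(pick_node(nodes, "class2"))  # c (busiest class2 node with room)
```

The point of the sketch is only that the placement policy is looked up per node class instead of being a single cluster-wide choice.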
This is a description of my use case.
Use-case
I have two datacenters, dc1 and dc2. I’d like the system to work like this:
I created a parameterized job and used this setting of affinity:
The 1st goal seems to be achieved, but the 2nd one isn't. I searched Google and found someone saying “nomad has an automatic anti-affinity to prevent a node from running too many jobs.” But when I used nomad alloc status to check the ranking, the anti-affinity score was always 0, not negative.
I guess anti-affinity is automatically calculated, thus I cannot control it. So to achieve the goals I'd like to request a new feature.
Would you mind taking a look at it, please?
Thank you