cloudfoundry / diego-notes

Diego Notes
Apache License 2.0

How to choose which Tasks/LRPs to schedule when capacity is low? #3

Closed: Amit-PivotalLabs closed this issue 9 years ago

Amit-PivotalLabs commented 9 years ago

The Auctioneer receives a batch of LRPs and Tasks to start. It gives priority to LRPs over Tasks. Among LRPs, it sorts by requested memory, giving priority to those requiring the most memory (boulders before grains). It makes no distinction among Tasks.
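The current ordering described above could be sketched roughly as follows. This is an illustrative sketch only, not the actual auctioneer code; the `work` type and field names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// work is a hypothetical, simplified stand-in for an item in an
// auction batch: either an LRP instance or a Task.
type work struct {
	name     string
	isLRP    bool
	memoryMB int
}

// prioritize orders a batch the way described above: LRPs before
// Tasks, LRPs by largest memory request first, Tasks in arrival order.
func prioritize(batch []work) {
	sort.SliceStable(batch, func(i, j int) bool {
		// LRPs always outrank Tasks.
		if batch[i].isLRP != batch[j].isLRP {
			return batch[i].isLRP
		}
		// Among LRPs, largest memory request first ("boulders before grains").
		if batch[i].isLRP {
			return batch[i].memoryMB > batch[j].memoryMB
		}
		// No distinction among Tasks.
		return false
	})
}

func main() {
	batch := []work{
		{"task-a", false, 64},
		{"lrp-small", true, 128},
		{"lrp-big", true, 1024},
	}
	prioritize(batch)
	for _, w := range batch {
		fmt.Println(w.name)
	}
}
```

Note that under this ordering, a single-instance small-memory LRP always sorts behind every instance of a larger LRP, which is exactly the starvation risk discussed next.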

This raises a potential problem: if a large-memory app requests a large number of instances, and a small-memory app requests just one, that single instance might never run when capacity is low, while several redundant instances of the other app are running. We could instead rank LRPs and Tasks in the following hierarchy:

And within each tier of the hierarchy, we can sort by memory required.

At any rate, we should document more thoroughly how we rank pending work to be auctioned off, and why.

Tracker story reflecting this suggestion here.

Amit-PivotalLabs commented 9 years ago

It would also be nice to implement things such that the sorting strategy is pluggable; that way we can run simulations with different sorting orders and scoring algorithms.
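A pluggable strategy could be as simple as a function type that orders a batch in place, so simulations swap implementations without touching the auction loop. This is a hedged sketch under assumed names (`workItem`, `SortStrategy`, `runAuction`), not the actual auctioneer API.

```go
package main

import (
	"fmt"
	"sort"
)

// workItem is a simplified, hypothetical auction batch entry.
type workItem struct {
	name     string
	memoryMB int
}

// SortStrategy orders a batch in place. Swapping implementations lets a
// simulation compare scheduling outcomes under different orderings.
type SortStrategy func(batch []workItem)

// byMemoryDescending schedules boulders before grains.
func byMemoryDescending(batch []workItem) {
	sort.SliceStable(batch, func(i, j int) bool {
		return batch[i].memoryMB > batch[j].memoryMB
	})
}

// byMemoryAscending schedules grains before boulders.
func byMemoryAscending(batch []workItem) {
	sort.SliceStable(batch, func(i, j int) bool {
		return batch[i].memoryMB < batch[j].memoryMB
	})
}

// runAuction applies a strategy and returns the resulting start order.
func runAuction(batch []workItem, strategy SortStrategy) []string {
	strategy(batch)
	order := make([]string, len(batch))
	for i, w := range batch {
		order[i] = w.name
	}
	return order
}

func main() {
	batch := []workItem{{"small", 128}, {"big", 1024}}
	fmt.Println(runAuction(batch, byMemoryDescending))
}
```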

emalm commented 9 years ago

Both excellent suggestions. I'm excited about the idea of plugging in different sorting and scoring strategies. Also, if tasks are too low a priority for the system, we could end up being unable to stage new apps in favor of running redundant instances of existing apps. That would seem especially unfortunate if the restage is to decrease the amount of memory to allocate to an app's instances, since we have to restage when that changes.

fraenkel commented 9 years ago

I think we should take a lesson from HM9000 here. HM9000 employs a dual strategy for prioritization: time pending and percentage of instances running. It seems to work well and balances what really needs to start. As you have more and more equally weighted LRPs, time needs to become the overriding factor. We can get more complicated with the weighting.

onsi commented 9 years ago

Also: this is prioritization within a batch in the auctioneer. There are no strong guarantees about what will be in that batch. If we want to put all the relative prioritization logic in this batch, then we'll want the converger to emit one big bulk "start all these things" request instead of what it does today (emitting individual starts). This will also be vastly more performant.

onsi commented 9 years ago

This has been implemented in

https://www.pivotaltracker.com/story/show/84173326