framed-data / overseer

Overseer is a library for building and running data pipelines in Clojure.
Eclipse Public License 1.0

First pass at basic "lottery-based" job selection #39

Closed by andrewberls 9 years ago

andrewberls commented 9 years ago

We have been seeing workers start very long (hour-long or multi-hour) jobs, only for other workers to randomly select the same job, since started jobs are fair game as part of our resiliency strategy. We therefore want workers to prefer unstarted jobs, without imposing an absolute priority/sort ordering that would risk starving out the started jobs.

This introduces a very basic lottery selection algorithm: each job gets a number of tickets determined by its status (unstarted jobs get more tickets than started jobs), and the winner is drawn at random with probability proportional to its ticket count, so unstarted jobs are more likely to be chosen. The ticket ratios for unstarted/started/_ (any other status) are currently 3:2:1, chosen arbitrarily and open to tuning.
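
For readers skimming the thread, here is a minimal sketch of the idea, not the code in this PR. The `:job/id` and `:job/status` keys, the status keywords, and every function name below are assumptions made for illustration; only the 3:2:1 ratio comes from the description above.

```clojure
;; Hypothetical ticket weights mirroring the 3:2:1 ratio described above.
(def ticket-counts
  {:unstarted 3
   :started   2})

;; Any status not listed above gets a single ticket (the `_` case).
(def default-tickets 1)

(defn tickets-for
  "Number of tickets a job holds, based on its (assumed) :job/status key."
  [job]
  (get ticket-counts (:job/status job) default-tickets))

(defn total-tickets
  "Sum of tickets across all candidate jobs."
  [jobs]
  (reduce + (map tickets-for jobs)))

(defn pick-winner
  "Return the job holding ticket number `draw`, where tickets are numbered
  from 0 and handed out to jobs in order. Returns nil if `draw` is out of range."
  [jobs draw]
  (loop [[job & more] jobs
         remaining    draw]
    (when job
      (let [n (tickets-for job)]
        (if (< remaining n)
          job
          (recur more (- remaining n)))))))

(defn lottery-select
  "Draw one ticket uniformly at random and return its holder, or nil when
  there are no jobs."
  [jobs]
  (when (seq jobs)
    (pick-winner jobs (rand-int (total-tickets jobs)))))
```

With one job of each kind, the six tickets split 3/2/1, so the unstarted job wins half the time and the started job a third of the time. Started jobs are less likely to be picked, but they are never starved outright.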

itsthomson commented 9 years ago

lotto

elliot42 commented 9 years ago

:+1: okay that's pretty well played sir