Goal:
Create locality for tasks of similar type to improve task efficiency and
throughput.
Use cases:
For example, tests may require a specific firmware version, and switching the
firmware on the device takes a non-trivial amount of time. It's better to assign
tasks to the bots representing devices already in a particular state. In this
case, a single string would likely be enough to describe the bot's state to
assign its affinity.
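A minimal sketch of the single-string case, assuming each pending task carries the state it wants and each bot reports its current state when polling. The `wanted_state` field and the `pick_task_for_bot` helper are hypothetical names used only for illustration:

```python
def pick_task_for_bot(bot_state, pending_tasks):
    """Prefer a task that wants the state the bot is already in.

    pending_tasks is assumed to be ordered oldest first. If no task matches
    the bot's state, fall back to the oldest task so the bot is never idle.
    """
    for task in pending_tasks:
        if task["wanted_state"] == bot_state:
            return task
    return pending_tasks[0] if pending_tasks else None


# Hypothetical queue: two tasks that need different firmware versions.
pending = [
    {"name": "test_a", "wanted_state": "firmware-1.2"},
    {"name": "test_b", "wanted_state": "firmware-2.0"},
]
```

A bot already on firmware-2.0 would get test_b and skip the reflash, while a bot in any other state would get the oldest task.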
Similarly for isolated testing, when >10k >500mb files need to be mapped, the
latency can become higher than the actual task duration when the task is not
triggered often and the pool of bots is large. So it's better to trigger these
tasks on a bot that has already run this test. In practice it's too slow for the
server to count the exact hit rate of 10k files on every bot task poll, so a
heuristic has to be used. In this case, the bot could list a few strings
describing the previous tasks that were run, which is a heuristic to implicitly
describe the cache content.
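One way the bot side of this heuristic could look, as a sketch only. The `BotCacheHint` class and its `max_entries` limit are assumptions; the idea is that the bot keeps a short, bounded list of recently run task names and sends it along with each poll in place of the real cache content:

```python
from collections import deque


class BotCacheHint:
    """Bounded list of recently run task names, most recent last."""

    def __init__(self, max_entries=3):
        # A deque with maxlen silently drops the oldest entry when full.
        self._recent = deque(maxlen=max_entries)

    def record(self, task_name):
        # Move an already known task to the most-recent position.
        if task_name in self._recent:
            self._recent.remove(task_name)
        self._recent.append(task_name)

    def as_strings(self):
        """The few strings the bot would report when polling."""
        return list(self._recent)

    def probably_cached(self, task_name):
        # Only a guess: a task that ran recently likely left its files cached.
        return task_name in self._recent
```

The server then only compares a handful of strings per poll and never has to look at the 10k files themselves.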
Implementation:
Using 'tags' matching would likely be the fastest implementation; the server
only understands tags and affinity is calculated from the "distance of tags".
Tags will be implemented as part of issue 123. That's the task request part of
tags, for example describing the task name (e.g. browser_tests,
base_unittests). The thing is that only a few select tags will be useful for
affinity, and historical data may (run_isolated cache hit rate) or may not
(firmware on the device) be useful. The implementation needs to support both
use cases efficiently.
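The "distance of tags" is not defined in this issue. A minimal sketch of one possible definition, assuming the task has one value per tag, the bot reports a set of values per tag (one value for current state, several for history), and only a short list of tags counts toward affinity. `tag_distance` and `AFFINITY_KEYS` are hypothetical names:

```python
def tag_distance(task_tags, bot_tags, affinity_keys):
    """Count the affinity-relevant task tags that the bot does not match.

    0 means the bot matches every relevant tag; larger means less affine.
    Tags outside affinity_keys are ignored.
    """
    distance = 0
    for key in affinity_keys:
        if key not in task_tags:
            continue
        if task_tags[key] not in bot_tags.get(key, ()):
            distance += 1
    return distance


AFFINITY_KEYS = ("name", "firmware")
task = {"name": "browser_tests", "firmware": "1.2", "user": "alice"}
# "name" holds history (several values), "firmware" holds current state.
warm_bot = {"name": {"browser_tests", "base_unittests"}, "firmware": {"1.2"}}
cold_bot = {"name": {"base_unittests"}, "firmware": {"2.0"}}
```

With this definition the historical case (task names previously run) and the current-state case (firmware) go through the same code path, and the cost per comparison is bounded by the number of affinity tags.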
In practice, this is tricky to implement at polling time, because a poll may
not hand a task out to the polling bot when another bot is known to be more apt
to run the task. The server has to "guess" that the other more affine bot will
poll soon.
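A sketch of one way to bound that guess, assuming the server knows the best affinity distance among bots seen recently and holds a task back only for a limited time. The `should_hand_out` helper and the `max_hold_secs` default are illustrative assumptions, not a proposed value:

```python
def should_hand_out(polling_bot_distance, best_known_distance,
                    task_age_secs, max_hold_secs=30.0):
    """Decide whether the polling bot gets the task right now.

    Smaller distance means more affine. The task is held for a better bot
    only while it is younger than max_hold_secs, so a wrong guess costs at
    most that much extra latency.
    """
    if polling_bot_distance <= best_known_distance:
        # No known bot is a better fit than the one asking.
        return True
    # A better bot exists; wait for it, but not forever.
    return task_age_secs >= max_hold_secs
```

The hold time trades cache or state reuse against queueing delay; setting it to zero disables affinity entirely.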
Also, behavior at 100% utilization needs to be clearly stated: when given a
higher priority task that is not affine and a lower priority task that is
affine, which one should be selected? It's easy to make the task scheduler
overly complicated in that case, and the search space needs to stay linear for
DB operation efficiency reasons.
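One possible answer, shown only to make the trade-off concrete and not as a decision: priority always wins and affinity only breaks ties. The sketch assumes a lower `priority` number means more urgent and that `distance` was already computed for the polling bot; both field names are hypothetical. It is a single pass over the candidates, so the search stays linear:

```python
def select_task(pending_tasks):
    """Pick the most urgent task; among equals, the most affine one."""
    if not pending_tasks:
        return None
    # Tuple ordering: compare priority first, then affinity distance.
    return min(pending_tasks, key=lambda t: (t["priority"], t["distance"]))


candidates = [
    {"name": "urgent_cold", "priority": 10, "distance": 2},
    {"name": "background_warm", "priority": 50, "distance": 0},
    {"name": "urgent_warm", "priority": 10, "distance": 0},
]
```

Under this rule a non-affine high priority task still beats an affine low priority one; the opposite policy would swap the two fields in the sort key.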
Original issue reported on code.google.com by maruel@chromium.org on 15 Aug 2014 at 4:24