We want to introduce test load balancing to better parallelise and speed up our CI/CD tests. Part of this task is determining the most efficient number of shards to use: the fewest parallel runners that still give the highest rate of test completion. This is a form of the bin packing problem, which is NP-hard, but there are approximation algorithms that make it tractable.
Beyond choosing the number of shards, we want the shards themselves to be evenly balanced so that every test runner finishes at approximately the same time. We can use cached timing information from previous test runs to inform this decision.
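One possible way to balance shards from cached timings is the greedy "longest processing time first" heuristic: sort tests by duration, then always hand the next test to the shard with the smallest total so far. The sketch below is only an illustration under assumptions; the function name `balance_shards` and the shape of the timing cache (a mapping from test file to seconds) are hypothetical and not part of any existing tooling.

```python
import heapq


def balance_shards(timings: dict[str, float], num_shards: int) -> list[list[str]]:
    """Greedy LPT: give the longest remaining test to the lightest shard.

    `timings` is an assumed cache format: test file name -> seconds
    taken in a previous run.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")

    # Min-heap of (total_seconds, shard_index); ties go to the lower index.
    heap = [(0.0, i) for i in range(num_shards)]
    shards: list[list[str]] = [[] for _ in range(num_shards)]

    # Longest tests first; the name is a tie-breaker for deterministic output.
    for name, seconds in sorted(timings.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (load + seconds, idx))

    return shards
```

This heuristic is an approximation, not an optimal solver: it can leave shards slightly uneven, but it runs in O(n log n) and needs nothing beyond the cached timings.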
https://github.com/kamilkisiela/split-tests - Very simple algorithm for keeping shards approximately the same size (no calculation for the most efficient number of shards though)
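For the missing piece, the shard count, one option is to fix a target wall-clock time per shard and let a bin packing approximation such as first-fit decreasing report how many shards are needed. This is a rough sketch under that assumption; `shards_needed` and `target_seconds` are hypothetical names, and the linked project does not provide this.

```python
def shards_needed(durations: list[float], target_seconds: float) -> int:
    """First-fit decreasing: count shards so that none exceeds target_seconds."""
    loads: list[float] = []  # running total of seconds per shard

    for duration in sorted(durations, reverse=True):
        if duration > target_seconds:
            raise ValueError("a single test exceeds the target shard duration")
        # Place the test in the first shard that still has room.
        for i, load in enumerate(loads):
            if load + duration <= target_seconds:
                loads[i] = load + duration
                break
        else:
            # No existing shard can take it, so open a new one.
            loads.append(duration)

    return len(loads)
```

The result is an upper bound on the true minimum rather than a guaranteed optimum, which is usually acceptable when choosing a runner count.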
Specification
Additional context
Tasks