Add Celery Distributed Computing to Long Running Processes

Round Conclusion (tiebreakers, scorekeeping) and Matchmaking are the two most processing-intensive operations.

During a large tournament, or on a busy day, these operations could take several minutes, leaving the user staring at a spinning disk while they wait for the round to be generated.

There are a couple of ways to handle this. The preferred method is to make the call asynchronous: a JavaScript call kicks off "Conclude Round" and updates the GUI once the endpoint returns. That, however, requires a redesign of the view template. We will get to that before too long, but this PR takes another approach.

By adding distributed background processes, we can offload the scorekeeping to multiple workers and spread the workload across them.
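The fan-out idea can be sketched with nothing but the standard library. This is not the code in this PR: a thread pool stands in for the Celery workers, the `RESULTS` data and the function names are hypothetical, and Strength of Schedule is assumed here to be the average score of a player's opponents. In Celery the same shape would typically be built from tasks dispatched as a `group`.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: player -> list of (opponent, points earned against them).
RESULTS = {
    "alice": [("bob", 3), ("carol", 3)],
    "bob": [("alice", 0), ("carol", 3)],
    "carol": [("alice", 0), ("bob", 0)],
}

def player_score(player):
    """One independent task: total points for a single player."""
    return player, sum(points for _, points in RESULTS[player])

def strength_of_schedule(player, scores):
    """One independent task: average score of this player's opponents (assumed definition)."""
    opponents = [opponent for opponent, _ in RESULTS[player]]
    return player, sum(scores[o] for o in opponents) / len(opponents)

def conclude_round(max_workers=8):
    """Fan out one task per player, gather the results, then repeat for the next phase."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Phase 1: every player's score can be computed independently.
        scores = dict(pool.map(player_score, RESULTS))
        # Phase 2: needs all phase 1 results, but is again independent per player.
        sos = dict(pool.map(lambda p: strength_of_schedule(p, scores), RESULTS))
    return scores, sos
```

Calling `conclude_round()` returns two dictionaries keyed by player. The point of the sketch is only that each phase splits into one small task per player, which is what makes the work easy to hand to a pool of workers.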

The first commit in this PR, the Round Conclusion portion, is the first pass at adding distributed computing. Previously, for a 150-person tournament at round 12, it took about 4 minutes to compute the Player Scores, Strength of Schedule, and Extended Strength of Schedule.

After this commit, it takes between 20 and 25 seconds with 2 workers of 4 forks each (8 effective workers).

The cluster that I would eventually propose we deploy this to, a low-power wide compute cluster, would be able to run 8 workers with 4-8 forks each, giving us an effective 32-64 workers.

Given that the test simulation required 300 unique tasks to complete (150 Player Score calculations and 150 Strength of Schedule calculations), the compute workload could be divided up very quickly for even the largest tournaments, at minimal cost and delay.
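As a rough back-of-the-envelope check on those numbers, the snippet below works out how many "waves" of parallel execution 300 tasks need at each worker count. It is a simplification, not a benchmark: it assumes every task takes about the same time and ignores queueing and messaging overhead, and the helper names are made up for illustration.

```python
import math

def effective_workers(workers, forks_per_worker):
    """Total parallel slots: each worker runs this many forked processes."""
    return workers * forks_per_worker

def waves(tasks, slots):
    """Rounds of parallel execution needed if each slot handles one task at a time."""
    return math.ceil(tasks / slots)

TASKS = 300  # 150 Player Score + 150 Strength of Schedule calculations

test_setup = effective_workers(2, 4)     # 8 slots, as in the test run
cluster_low = effective_workers(8, 4)    # 32 slots
cluster_high = effective_workers(8, 8)   # 64 slots

for slots in (test_setup, cluster_low, cluster_high):
    print(f"{slots} slots -> {waves(TASKS, slots)} waves of tasks")
```

Under these assumptions the proposed cluster would need roughly a quarter to an eighth as many waves as the 8-slot test setup.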

Chemscribbler / aesopstables, PR #18