icecc / icecream

Distributed compiler with a central scheduler to share build load
GNU General Public License v2.0

Refactor pick_server and add new scheduler algorithms #595

Closed (deriamis closed this 2 years ago)

deriamis commented 2 years ago

The classic Icecream scheduler doesn't work well in all environments, especially when deployed into the cloud. To remedy that, pick_server is refactored, and several new scheduling algorithms are added as options:

The fastest algorithm is the "classic" algorithm.
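To make the idea of a selectable algorithm concrete, here is a minimal sketch in Python rather than the project's C++. It is not the code in this PR. Only the name "fastest" comes from the description above. The `Server` fields, the `random` and `least_busy` strategies, and the `ALGORITHMS` table are hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    # All fields are illustrative assumptions, not Icecream's real data model.
    name: str
    speed: float      # relative speed score, higher is faster
    active_jobs: int  # jobs currently running on the node
    max_jobs: int     # job slots the node offers


def eligible(servers):
    """Keep only nodes that still have a free job slot."""
    return [s for s in servers if s.active_jobs < s.max_jobs]


def pick_fastest(servers, rng):
    # "Classic" behaviour in this sketch: prefer the highest speed score.
    return max(servers, key=lambda s: s.speed)


def pick_random(servers, rng):
    # Hypothetical alternative: spread jobs uniformly at random.
    return rng.choice(servers)


def pick_least_busy(servers, rng):
    # Hypothetical alternative: prefer the lowest slot utilisation.
    return min(servers, key=lambda s: s.active_jobs / s.max_jobs)


# Hypothetical option table: algorithm name -> strategy function.
ALGORITHMS = {
    "fastest": pick_fastest,
    "random": pick_random,
    "least_busy": pick_least_busy,
}


def pick_server(servers, algorithm="fastest", rng=None):
    """Filter out full nodes, then delegate to the chosen strategy."""
    rng = rng or random.Random()
    candidates = eligible(servers)
    if not candidates:
        return None  # no node has a free slot
    return ALGORITHMS[algorithm](candidates, rng)
```

Keeping the eligibility filter separate from the strategy table means each algorithm only has to rank nodes that can actually take a job.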

I've also fixed up a few parts of the fastest algorithm and server selection in general to be a bit more obvious in their function. In particular, the original weight assumed that there were at least 1000 nodes in the cluster. This has been changed to scale along with the size of the cluster.
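As a rough illustration of why a fixed constant of 1000 matters on small clusters, consider a hypothetical rank-based weighting. This is an assumed formula for the sake of the example, not the actual weight calculation in the scheduler. Each node gets `scale - rank` as its weight, where rank 0 is the fastest node.

```python
def rank_weights(num_nodes, scale=None):
    """Weight per speed rank (0 = fastest).

    Hypothetical formula: weight = scale - rank. When scale is omitted it
    defaults to the cluster size. scale must be >= num_nodes so that every
    weight stays positive.
    """
    if scale is None:
        scale = num_nodes
    if scale < num_nodes:
        raise ValueError("scale must be at least the number of nodes")
    return [scale - rank for rank in range(num_nodes)]


def selection_share(weights):
    """Fraction of jobs each rank would receive under weighted selection."""
    total = sum(weights)
    return [w / total for w in weights]


# Four-node cluster, fixed constant versus cluster-sized scale.
fixed = selection_share(rank_weights(4, scale=1000))
scaled = selection_share(rank_weights(4))
```

With the fixed constant, the four weights are 1000, 999, 998, and 997, so the fastest node receives barely more than a quarter of the jobs and the ranking has almost no effect. With the scale tied to the cluster size, the weights are 4, 3, 2, and 1, and the fastest node receives 40% of the jobs.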

HenryMiller1 commented 2 years ago

This looks good overall. I'm interested to see how/if these make a difference in our systems.

deriamis commented 2 years ago

> I don't see documentation. This new option will be very important for users to play with, so it needs to be clearly documented.
>
> Make sure that you note that "Fastest" may not be the fastest in the real world and that users should try all options to see what works best. I'm interested in making a different algorithm the default in the future, after we get more experience with them in the real world.

Documentation was my next step after getting some feedback on how it would operate. Given your positive initial response, I'll work on the documentation next.

deriamis commented 2 years ago

@HenryMiller1 I've added the documentation you requested. PTAL. Please note, this PR also has changes from #596 and #597 in it, which were needed for testing, so if you would not mind taking a look at those as well, I would appreciate it. Thanks!

HenryMiller1 commented 2 years ago

Does anyone understand the circle_ci build error? It looks like it is happening everywhere, so it is probably not caused by this PR, but I still don't like accepting code that doesn't pass CI.

deriamis commented 2 years ago

> Does anyone understand the circle_ci build error? It looks like it is happening everywhere, so it is probably not caused by this PR, but I still don't like accepting code that doesn't pass CI.

@HenryMiller1 I took a quick look, and from what I can tell it's happening because the version of gm4 installed from pkg is no longer compatible with FreeBSD 12 (which is EOL). I don't know anything beyond that, nor do I have the access to fix it.