How is distributed training implemented in KataGo?

lightvector / KataGo

GTP engine and self-play learning in Go


Are you familiar with AlphaZero, or Expert Iteration, or similar methods? The original papers are pretty good background reading if you're not: https://arxiv.org/pdf/1705.08439.pdf https://discovery.ucl.ac.uk/id/eprint/10045895/1/agz_unformatted_nature.pdf

The main thing about these methods is that almost all of the compute cost comes from the self-play portion, which uses search as the policy improvement mechanism. So that's the part that needs to be distributed. Training the model on that data doesn't need as much compute: right now a single machine with only one (strong) GPU is enough to keep up.
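To make that split concrete, here is a minimal toy sketch of the shape of the loop: many workers generate self-play data in parallel, and a single trainer consumes the pooled data to produce the next model. This is not KataGo's actual code. Every name here (`self_play_game`, `run_self_play_workers`, `train_step`, `training_loop`) is hypothetical, the "game" is random data rather than search, the "model" is just a version counter, and the thread pool only stands in for what would really be separate machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def self_play_game(model_version, seed):
    """Hypothetical stand-in for one self-play game.

    A real worker would run search with the current network on every move;
    here we just emit random training rows tagged with the model version.
    """
    rng = random.Random(seed)
    num_moves = rng.randint(5, 10)
    return [
        {"model_version": model_version, "move": i, "value_target": rng.choice([-1, 1])}
        for i in range(num_moves)
    ]


def run_self_play_workers(model_version, num_games, num_workers):
    """The expensive, distributed part: many workers play games in parallel.

    Assumption: a thread pool stands in for a fleet of remote machines.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        games = list(pool.map(lambda s: self_play_game(model_version, s), range(num_games)))
    # Flatten all games into one pool of training rows.
    return [row for game in games for row in game]


def train_step(model_version, rows):
    """The cheap, centralized part: one trainer consumes the pooled rows.

    Assumption: "training" is reduced to bumping a version counter.
    """
    if not rows:
        raise ValueError("no self-play data to train on")
    return model_version + 1


def training_loop(iterations, games_per_iteration, num_workers):
    """Alternate distributed self-play with single-machine training."""
    model_version = 0
    total_rows = 0
    for _ in range(iterations):
        rows = run_self_play_workers(model_version, games_per_iteration, num_workers)
        total_rows += len(rows)
        model_version = train_step(model_version, rows)
    return model_version, total_rows


if __name__ == "__main__":
    version, rows_seen = training_loop(iterations=3, games_per_iteration=8, num_workers=4)
    print(f"final model version: {version}, training rows consumed: {rows_seen}")
```

The point of the sketch is only the division of labor: scaling up means adding more self-play workers, while the training side stays a single consumer of their output.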


How is distributed training implemented in KataGo? #915