Closed by killerducky 6 years ago
https://github.com/pytorch/ELF/blob/e3f407226056da9c8a1861cd25e9dbf9dac0d62e/scripts/elfgames/go/start_selfplay.sh#L39 Maybe self-play uses 1.5 as well.
Some comparisons here: https://www.reddit.com/r/cbaduk/comments/8j5x3w/first_play_urgency_fpu_parameter_in_alpha_zero/dz1ipi7/ 1.5/2=0.75 is close to what LZ uses (0.8).
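For anyone wondering where the division by 2 could come from, here is a minimal sketch. It assumes the factor reflects a difference in value range (Q in [-1, 1] in one engine, Q in [0, 1] in the other); that is an assumption on my part, not something confirmed in this thread. It uses the PUCT selection rule as given in the AGZ paper, Q + c_puct * P * sqrt(sum N) / (1 + N). The function name and the node statistics are made up for illustration and are not taken from ELF or LZ.

```python
import math

def puct_select(q, prior, visits, c_puct):
    """Return the index of the child maximizing Q + U (AGZ-style PUCT)."""
    total = sum(visits)
    scores = [
        q[i] + c_puct * prior[i] * math.sqrt(total) / (1 + visits[i])
        for i in range(len(q))
    ]
    return max(range(len(q)), key=scores.__getitem__)

# Hypothetical node statistics, chosen only for illustration.
q_pm1 = [0.10, -0.20, 0.30]   # mean values on a [-1, 1] scale
prior = [0.5, 0.3, 0.2]       # policy priors
visits = [10, 3, 5]           # visit counts

# The same values mapped onto a [0, 1] scale.
q_01 = [(v + 1) / 2 for v in q_pm1]

# Halving the value range and halving c_puct together leave the ranking unchanged,
# because (Q + 1) / 2 + (c / 2) * U = 0.5 + 0.5 * (Q + c * U).
choice_pm1 = puct_select(q_pm1, prior, visits, c_puct=1.5)
choice_01 = puct_select(q_01, prior, visits, c_puct=0.75)
assert choice_pm1 == choice_01
```

Under that assumption, 1.5 on a [-1, 1] scale and 0.75 on a [0, 1] scale pick the same child, which is why the two numbers are comparable at all. Other differences between the engines (FPU handling, for example) are not modeled here.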
Hi @killerducky, 1.50 is used everywhere. (The script included in this repository has some of our older training parameter values.)
How did you tune the --mcts_puct values? Is it true that different values are used for generating self-play games for training vs. match play?
I think self-play for training uses --mcts_puct 0.85 https://github.com/pytorch/ELF/blob/113aba73ec0bc9d60bdb00b3c439bc60fecabc89/scripts/elfgames/go/start_client.sh#L17
And match play uses --mcts_puct 1.50 https://github.com/pytorch/ELF/blob/a4edc96e8bf94aa1a84134431ce3758a6ade27c7/README.rst#running-a-go-bot
Edit: BTW I think this is the relevant part of the AGZ paper:
It doesn't really clarify whether this tuning is done for self-play only, or involves something more expensive that covers the entire training feedback loop.