shindavid / AlphaZeroArcade


Near-Uniform Random Openings #44

Open shindavid opened 1 year ago

shindavid commented 1 year ago

Implement the following idea from Appendix D of the KataGo paper:

In 5% of games, the game is branched after the first r turns where r is drawn from an exponential distribution with mean 0.025 ∗ b^2. Between 3 and 10 moves are chosen uniformly at random, each given a single neural net evaluation, and the best one is played. Komi is adjusted to be fair. The game is then played to completion as normal. This ensures that there is always a small percentage of games with highly unusual openings.

Some thought is needed on how to generalize this to games other than Go. The komi adjustment in particular has no clear analog in other games, so there may be no good way to generalize it.
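For discussion, here is a minimal, game-agnostic sketch of the branching step in Python. It is not taken from KataGo or from this repo, and every name in it (`should_branch`, `sample_branch_turn`, `pick_opening_move`, the `evaluate` callback) is hypothetical. It assumes that b is the board width, that r is truncated to an integer, and that the number of candidate moves is itself drawn uniformly from 3 to 10; the quoted passage does not pin down the last two points. The komi adjustment is left out because it has no obvious counterpart outside Go.

```python
import random


def should_branch(rng, prob=0.05):
    """Decide whether this game gets a random-opening branch (5% by default)."""
    return rng.random() < prob


def sample_branch_turn(board_width, rng):
    """Draw the branch turn r from an exponential with mean 0.025 * b^2.

    Assumption: b is the board width and r is truncated to an integer.
    """
    mean = 0.025 * board_width ** 2
    # random.expovariate takes the rate, i.e. 1 / mean.
    return int(rng.expovariate(1.0 / mean))


def pick_opening_move(legal_moves, evaluate, rng, min_candidates=3, max_candidates=10):
    """Sample a few legal moves uniformly and keep the best-scoring one.

    legal_moves must be a non-empty list. evaluate(move) is a hypothetical
    callback standing in for one neural net evaluation of the position
    reached by that move; higher is better for the player to move.
    """
    k = rng.randint(min_candidates, max_candidates)  # inclusive on both ends
    k = min(k, len(legal_moves))  # positions with few legal moves
    candidates = rng.sample(legal_moves, k)  # uniform, without replacement
    return max(candidates, key=evaluate)
```

In a self-play loop, the idea would be to call `should_branch` once per game, play `sample_branch_turn(...)` turns as normal, apply `pick_opening_move` at that point, and then continue the game as usual. For games without a board width, the mean of the exponential would need its own per-game parameter.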

shindavid commented 1 year ago

Note: the following comment from the KataGo paper likely applies to this idea:

Except for introducing a minimum necessary amount of entropy, the above settings very likely have only a limited effect on overall learning efficiency and strength. They were used primarily so that KataGo would have experience with alternate rules, komi values, handicap openings, and positions where both sides have played highly suboptimally in ways that would never normally occur in high-level play, making it more effective as a tool for human amateur game analysis.