Akababa / Chess-Zero

Chess reinforcement learning by AlphaZero methods.
MIT License

How to tell if value overfit #2

Closed by Akababa 6 years ago

Akababa commented 6 years ago
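  1. Self-play games keep ending in resignation.
  2. If you assume most "normal" chess games are roughly even until about halfway through, you get a lower bound of about 0.5 on the asymptotic MSE: with outcomes of ±1, the best prediction for an even position is 0, which costs a squared error of 1 on roughly half of the positions. (I think) this bound decreases with the players' average Elo and also with the Elo difference between them.
  3. AlphaZero (AZ) had 5,000 TPUs running self-play and only 64 running SGD.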
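
A rough sketch of the arithmetic behind point 2, in case it helps. The names `mse_floor` and `below_floor`, the `draw_prob` parameter, and the `margin` value are made up for illustration and are not part of this repo. It assumes outcomes are coded as +1 / 0 / -1, that "even" positions are a coin flip between decisive results (apart from draws), and that every other position is perfectly predictable, which is optimistic.

```python
def mse_floor(even_fraction: float, draw_prob: float = 0.0) -> float:
    """Irreducible value MSE under the simplifying assumptions stated above.

    even_fraction: share of training positions assumed to be dead even.
    draw_prob: assumed probability that an even position ends in a draw (z = 0).
    """
    if not 0.0 <= even_fraction <= 1.0:
        raise ValueError("even_fraction must be in [0, 1]")
    if not 0.0 <= draw_prob <= 1.0:
        raise ValueError("draw_prob must be in [0, 1]")
    # For an even position, z is +1 or -1 with equal probability, or 0 for a draw.
    # The best prediction is the mean (0), so the squared error equals Var(z).
    variance_even = 1.0 - draw_prob
    # Decided positions are assumed to contribute zero error.
    return even_fraction * variance_even


def below_floor(train_mse: float, even_fraction: float = 0.5,
                draw_prob: float = 0.0, margin: float = 0.05) -> bool:
    """True if the training MSE is suspiciously far below the assumed floor."""
    return train_mse < mse_floor(even_fraction, draw_prob) - margin


if __name__ == "__main__":
    print(mse_floor(0.5))        # half of all positions even, no draws
    print(mse_floor(0.5, 0.5))   # same, but half of even positions are drawn
    print(below_floor(0.2))      # training MSE of 0.2 against a floor of 0.5
```

Under these assumptions, a training value loss well under the floor would hint that the network is memorizing outcomes instead of evaluating positions. The floor itself is only as good as the guess for `even_fraction`.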
Akababa commented 6 years ago

https://github.com/Akababa/chess-alpha-zero/wiki/Signs-of-value-overfitting