How to tell if the value head is overfitting

Akababa / Chess-Zero

Chess reinforcement learning by AlphaZero methods.

MIT License


Closed by Akababa 6 years ago

Akababa commented 6 years ago

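Self-play games keep ending by resignation.
If you assume most "normal" chess games stay roughly even until about halfway through, then (ignoring draws) the result for those positions is close to a coin flip between +1 and -1, so the best prediction is 0 and its squared error is 1. With about half of all positions in that state, you get a lower bound of 0.5 on asymptotic MSE. (I think) this bound decreases as the average Elo of the players goes up and also as the Elo difference between the players grows.
AZ (AlphaZero) had 5000 TPUs running self-play and only 64 running SGD.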
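The 0.5 floor on MSE mentioned above can be checked with a few lines of arithmetic. This is only a sketch under stated assumptions, not code from the repo: `mse_floor`, `even_fraction`, `p_win` and `p_draw` are hypothetical names, results are encoded as +1 / 0 / -1, "even" positions are treated as carrying no information about the result, and every other position is treated as perfectly predictable (which is why this is a lower bound rather than an estimate).

```python
def mse_floor(even_fraction, p_win=0.5, p_draw=0.0):
    """Lower bound on value MSE under the assumptions described above.

    even_fraction: share of positions that say nothing about the result.
    p_win, p_draw: result probabilities seen from a fixed side
                   (hypothetical parameters, not taken from the repo).
    """
    p_loss = 1.0 - p_win - p_draw
    if not 0.0 <= even_fraction <= 1.0:
        raise ValueError("even_fraction must be in [0, 1]")
    if min(p_win, p_draw, p_loss) < 0.0:
        raise ValueError("probabilities must be non-negative and sum to 1")

    # Result z is +1 (win), 0 (draw) or -1 (loss).
    mean = p_win - p_loss            # E[z]
    second_moment = p_win + p_loss   # E[z^2]
    variance = second_moment - mean ** 2

    # On uninformative positions the best prediction is E[z], and its
    # expected squared error is Var(z). All other positions are assumed
    # to contribute zero error.
    return even_fraction * variance


if __name__ == "__main__":
    # Half the positions are coin flips between +1 and -1, no draws.
    print(mse_floor(0.5))  # 0.5
    # A lopsided pairing (one side wins 80% of the time) lowers the floor.
    print(mse_floor(0.5, p_win=0.8))
```

Under this model, a validation MSE that sits near the floor is not by itself a sign of underfitting, and a training MSE well below it would suggest the value head is memorizing results. The lopsided case also illustrates why the bound should shrink as the Elo difference grows, assuming a larger gap makes the result more predictable.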

Akababa commented 6 years ago