It's great that AlphaGo Zero learns purely through self-play: it needs no human gameplay data, which is unavailable for most real-world problems. 👍
AlphaGo Zero uses a single neural network for both move selection (policy) and position evaluation (value) via multi-task learning, together with a simplified Monte Carlo Tree Search that drops rollouts. 👍
AlphaGo Zero defeated the earlier AlphaGo after only 3 days of training on a single machine with 4 TPUs, whereas the first version of AlphaGo required 176 GPUs. 👍
Nonetheless, AlphaGo Zero is still a narrow AI system: its self-play approach hardly generalizes beyond Go to real-world problems where there is no clear winner or loser. 👎
Without human guidance, AlphaGo Zero does not learn Go strategies in a human-like order (easy to hard) and discovers moves never seen in human play. 🤔
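The single-network design noted above can be sketched in a few lines. This is an illustrative toy, not DeepMind's code: `dual_head_net` is a hypothetical stand-in for the residual network that returns both move priors (policy head) and a position value (value head), and `puct_select` shows the PUCT selection rule that the simplified MCTS uses in place of the original AlphaGo's rollouts.

```python
import math

def dual_head_net(state):
    """Stand-in for the single network: returns (move priors, value).
    Here just uniform priors over 3 hypothetical moves and a neutral value."""
    priors = {m: 1.0 / 3 for m in ("a", "b", "c")}
    return priors, 0.0

def puct_select(node, c_puct=1.5):
    """Pick the child maximizing Q + U, where U grows with the prior and
    the parent's visit count, and shrinks as the child is visited more."""
    total_n = sum(child["n"] for child in node["children"].values())
    def score(move):
        child = node["children"][move]
        q = child["w"] / child["n"] if child["n"] else 0.0
        u = c_puct * child["p"] * math.sqrt(total_n + 1) / (1 + child["n"])
        return q + u
    return max(node["children"], key=score)

# Build a root node whose children carry the network's priors.
priors, value = dual_head_net(state="root")
root = {"children": {m: {"p": p, "n": 0, "w": 0.0} for m, p in priors.items()}}
best = puct_select(root)
```

The key point the sketch makes: because the same forward pass yields both the priors that guide tree exploration and the value that replaces rollout evaluation, one network does the work that AlphaGo split across separate policy, value, and rollout components.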