First, great implementation! This really helped me understand how AlphaGo Zero works, and I’ve used it for other games as well.
There’s still one thing I couldn’t understand, though: why do you return -v instead of v in the Monte Carlo tree search?