An implementation of a Go (baduk) AI, following the methods published for AlphaGo Zero. The primary purpose is to gain hands-on experience with machine learning.
C_puct needs to be tuned empirically: it interacts with many other implementation details, and its value is not given in the AlphaGo Zero paper.
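For context, C_puct enters the tree search through the PUCT selection rule, where it scales the exploration bonus added to each child's mean value. A minimal sketch (the dict-based child representation here is an assumption for illustration, not this project's actual data structure):

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U under the PUCT rule.

    children: list of dicts with keys
        "P" (prior probability), "N" (visit count), "Q" (mean value).
    c_puct: exploration constant; larger values favor low-visit,
        high-prior moves over exploitation of Q.
    """
    sqrt_total = math.sqrt(sum(ch["N"] for ch in children))

    def score(ch):
        # U(s, a) = c_puct * P(s, a) * sqrt(sum_b N(s, b)) / (1 + N(s, a))
        u = c_puct * ch["P"] * sqrt_total / (1 + ch["N"])
        return ch["Q"] + u

    return max(range(len(children)), key=lambda i: score(children[i]))
```

With a large c_puct, an unvisited child with a decent prior beats a well-visited child with a better Q; with c_puct near zero, selection degenerates to picking the best Q, which is one reason the constant has to be tuned jointly with the rest of the pipeline.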
I'm not sure whether we're using the same momentum update as AlphaGo Zero. Different momentum formulations for gradient descent imply different effective step sizes, so we may need to rescale the learning rate accordingly: https://www.youtube.com/watch?v=k8fTYJPd3_I
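To make the learning-rate concern concrete, here is a sketch of two common momentum formulations (an assumption about which variants are in play; the AlphaGo Zero paper does not spell out its update equations). The "heavy ball" form accumulates raw gradients, while the EMA form averages them, so for the same nominal learning rate their steady-state step sizes differ by a factor of 1/(1 - mu):

```python
def heavy_ball_step(w, v, grad, lr, mu):
    """v <- mu * v + grad;  w <- w - lr * v
    (the formulation used by e.g. torch.optim.SGD)."""
    v = mu * v + grad
    return w - lr * v, v

def ema_step(w, v, grad, lr, mu):
    """v <- mu * v + (1 - mu) * grad;  w <- w - lr * v
    (velocity is an exponential moving average of gradients)."""
    v = mu * v + (1 - mu) * grad
    return w - lr * v, v
```

With a constant gradient g, heavy-ball velocity converges to g / (1 - mu) while EMA velocity converges to g, so matching the two requires multiplying the heavy-ball learning rate by (1 - mu) (a factor of 0.1 at mu = 0.9). That is the kind of rescaling this note is worried about.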
I saw somewhere that initializing any bias that feeds directly into a ReLU to a small positive value may be a good idea, since it keeps more units in the active region at the start of training. Currently using zero; not sure this is necessary.
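A sketch of what that initialization could look like (the helper name and the 0.01 value are assumptions for illustration; the note's current behavior is the zero branch):

```python
import numpy as np

def init_bias(num_units, relu_follows=False, eps=0.01):
    """Return a bias vector for a layer with num_units outputs.

    relu_follows: if True, use a small positive constant (eps, an
        assumed value) so units start in the ReLU's active region;
        otherwise initialize to zero, as the project currently does.
    """
    if relu_follows:
        return np.full(num_units, eps, dtype=np.float32)
    return np.zeros(num_units, dtype=np.float32)
```

Whether this matters in practice likely depends on the network depth and on batch normalization; with batch norm before each ReLU (as in the AlphaGo Zero architecture), the bias is shifted anyway, which may be why zero initialization works fine here.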