axiomizer / haida-go

An implementation of a go (baduk) AI, following methods published for AlphaGo Zero. The primary purpose is to gain hands-on experience with machine learning.

Tune hyperparameters #8

Open axiomizer opened 1 year ago

axiomizer commented 1 year ago

C_puct needs to be tuned since its best value depends on a lot of other implementation details, and it isn't given in the AlphaGo Zero paper.
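For reference, a minimal sketch of the PUCT selection rule from the AlphaGo Zero paper, just to show where C_puct enters (NumPy; the function and variable names are illustrative, not from this repo):

```python
import numpy as np

def puct_select(q, p, n, c_puct):
    """Pick a child index by the PUCT rule: argmax_a [ Q(s,a) + U(s,a) ],
    where U(s,a) = c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a)).

    q: mean action values Q(s,a), shape (num_actions,)
    p: prior probabilities P(s,a) from the policy head
    n: visit counts N(s,a)
    c_puct: the exploration constant we need to tune
    """
    total_visits = n.sum()
    u = c_puct * p * np.sqrt(total_visits) / (1.0 + n)
    return int(np.argmax(q + u))
```

Larger C_puct weights the prior/visit-count term more heavily (more exploration); smaller values lean on the observed Q values, which is why the right value depends on how Q is initialized and scaled in our tree.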

I'm not sure whether we're using the same momentum formulation as AlphaGo Zero. Depending on which momentum equations are used for gradient descent, we may need to rescale the learning rate (see the sketch below): https://www.youtube.com/watch?v=k8fTYJPd3_I
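Rough sketch of the two formulations I have in mind, assuming plain NumPy arrays for parameters, gradients, and velocity (illustrative only; the paper states momentum 0.9 but doesn't spell out which form it uses):

```python
import numpy as np

def sgd_momentum_accumulator(w, grad, v, lr=0.01, mu=0.9):
    """Accumulator form: v <- mu*v + grad; w <- w - lr*v."""
    v[:] = mu * v + grad
    w -= lr * v
    return w, v

def sgd_momentum_ema(w, grad, v, lr=0.01, mu=0.9):
    """EMA form: v <- mu*v + (1-mu)*grad; w <- w - lr*v.
    At mu=0.9 the steady-state step is roughly 10x smaller than the
    accumulator form for the same lr, so switching formulations means
    rescaling the learning rate by about 1/(1-mu)."""
    v[:] = mu * v + (1.0 - mu) * grad
    w -= lr * v
    return w, v
```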

axiomizer commented 1 year ago

I saw somewhere that initializing any bias that feeds directly into a ReLU with a small positive value might be a good idea. We're currently just using zero; I'm not sure whether this is necessary.
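Sketch of what that would look like (hypothetical helper, not our actual layer code; He-style weight init assumed):

```python
import numpy as np

def init_layer(fan_in, weight_shape, bias_shape, relu_bias=0.01):
    """He-normal weights plus a small positive bias (e.g. 0.01) for a layer
    feeding directly into a ReLU; the idea is to reduce the chance of
    'dead' units at the start of training. Whether this actually helps
    here is the open question in this comment."""
    weights = np.random.randn(*weight_shape) * np.sqrt(2.0 / fan_in)
    bias = np.full(bias_shape, relu_bias)  # currently we use zeros instead
    return weights, bias
```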