Our central NN architecture should be based on AlphaZero, but we can leave some things flexible.
For instance, we could have a param for the number of residual blocks; a param for which game to train/test on; a param for normalization (batch norm on/off, or maybe other norm types); and a param for which nonlinearity to use (ReLU to start with, which I think is what the paper uses).
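Here is a rough sketch of how those params could be grouped into one config object. Everything named here is hypothetical: `NetConfig`, the field names, the option lists, and the default values (the game name and block count are placeholders, not taken from the paper). It only covers the config and its validation, not the network itself.

```python
from dataclasses import dataclass

# Hypothetical option sets; extend as we decide what to support.
NORM_TYPES = ("batch", "layer", "none")
NONLINEARITIES = ("relu", "gelu", "tanh")


@dataclass(frozen=True)
class NetConfig:
    """Flexible knobs for an AlphaZero-style residual network (sketch)."""

    game: str = "tictactoe"      # placeholder game name
    num_res_blocks: int = 10     # arbitrary default, not the paper's value
    norm: str = "batch"          # "none" turns normalization off
    nonlinearity: str = "relu"   # start with ReLU

    def __post_init__(self):
        # Fail early on bad settings instead of partway through training.
        if self.num_res_blocks < 1:
            raise ValueError("num_res_blocks must be at least 1")
        if self.norm not in NORM_TYPES:
            raise ValueError(f"unknown norm type: {self.norm!r}")
        if self.nonlinearity not in NONLINEARITIES:
            raise ValueError(f"unknown nonlinearity: {self.nonlinearity!r}")


# Example: a smaller net with normalization switched off.
small = NetConfig(num_res_blocks=5, norm="none")
```

Making the config frozen means one run can't accidentally change its settings midway, and a single object like this is easy to log next to results.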