mokemokechicken / reversi-alpha-zero

Reversi reinforcement learning by AlphaGo Zero methods.
MIT License

Random flip and rotation when evaluate #16

Closed apollo-time closed 6 years ago

apollo-time commented 6 years ago

I see you apply a random flip and rotation in the Player's expand_and_evaluate function. I think this is needed when adding data for training, but isn't necessary when evaluating a position to select an action. What do you think?

mokemokechicken commented 6 years ago

Hi @apollo-time

It is written in DeepMind's paper

Expand and evaluate (Fig. 2b). The leaf node sL is added to a queue for neural network evaluation, (di(p), v) = fθ(di(sL)), where di is a dihedral reflection or rotation selected uniformly at random from i in [1..8].

and their new paper

The rules of Go are invariant to rotation and reflection. This fact was exploited in AlphaGo and AlphaGo Zero in two ways. First, training data was augmented by generating 8 symmetries for each position. Second, during MCTS, board positions were transformed using a randomly selected rotation or reflection before being evaluated by the neural network, so that the Monte-Carlo evaluation is averaged over different biases.
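Since Reversi shares Go's 8-fold board symmetry, the same trick applies here. As a minimal sketch (function names are hypothetical, not the repo's actual API): pick one of the 8 dihedral transforms at random before calling the network, and map the returned policy plane back through the inverse transform so move probabilities line up with the untransformed board.

```python
import numpy as np

def apply_dihedral(plane, i):
    """Apply dihedral symmetry i in [0..7] to a square board plane:
    4 rotations, optionally composed with a horizontal flip."""
    p = np.rot90(plane, k=i % 4)
    if i >= 4:
        p = np.fliplr(p)
    return p

def invert_dihedral(plane, i):
    """Undo apply_dihedral: reverse the flip first, then rotate back."""
    p = np.fliplr(plane) if i >= 4 else plane
    return np.rot90(p, k=-(i % 4))

def random_dihedral(plane):
    """Transform with a symmetry chosen uniformly at random, returning
    the index so the network's policy output can be mapped back."""
    i = np.random.randint(8)
    return apply_dihedral(plane, i), i
```

During expand_and_evaluate this would look like `transformed, i = random_dihedral(board)`, then `policy, value = model.predict(transformed)`, then `policy = invert_dihedral(policy, i)`; the value head needs no inverse since it is a scalar.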