junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
MIT License

Asymmetrical game board can't be trained: how do I train when the board's width and height differ? #134

Open zhannahz opened 9 months ago

zhannahz commented 9 months ago

Hi. I am a student working on a research project about decision and prediction during a game. Because of our experimental setup, we'd like to change the board to an asymmetrical one, e.g. width=9 and height=4.
I modified the part of train.py where you flip and rotate the board, so that it only flips horizontally and vertically.

# Flip horizontally
equi_state_h = np.array([np.fliplr(s) for s in state])
equi_mcts_prob_h = np.fliplr(np.flipud(mcts_prob.reshape(self.board_height, self.board_width)))
extend_data.append((equi_state_h, np.flipud(equi_mcts_prob_h).flatten(), winner))

# Flip vertically
equi_state_v = np.array([np.flipud(s) for s in state])
equi_mcts_prob_v = np.flipud(np.flipud(mcts_prob.reshape(self.board_height, self.board_width)))
extend_data.append((equi_state_v, np.flipud(equi_mcts_prob_v).flatten(), winner))

# Flip both horizontally and vertically
equi_state_hv = np.array([np.fliplr(s) for s in equi_state_v])
equi_mcts_prob_hv = np.fliplr(equi_mcts_prob_v)
extend_data.append((equi_state_hv, np.flipud(equi_mcts_prob_hv).flatten(), winner))

*I'm not sure this is the right way. You seem to use np.flipud(equi_mcts_prob).flatten() as a kind of hack to get the probabilities laid out properly for the board dimensions.

But even so, when I train, the model never reaches a very good win rate (with game batch num=3000).
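For reference, here is a minimal standalone sketch of the flip-only augmentation described above, written so it can be checked outside of train.py. The function name flip_augment and the toy data are hypothetical. The sketch assumes that each state plane has shape (board_height, board_width) and is stored upside down relative to the row-major move index, which is what the extra np.flipud calls appear to compensate for. 90° rotations are left out because np.rot90 turns a 4x9 array into a 9x4 one.

```python
import numpy as np


def flip_augment(state, mcts_prob, winner, board_height, board_width):
    """Return the original sample plus its three mirror images (no rotations).

    Assumption: `state` has shape (planes, board_height, board_width) and is
    stored vertically flipped relative to the flat, row-major move index.
    """
    # Undo the vertical flip so the probabilities line up with the state planes.
    prob_grid = np.flipud(mcts_prob.reshape(board_height, board_width))
    samples = []
    for flip_h, flip_v in [(False, False), (True, False), (False, True), (True, True)]:
        s, p = state, prob_grid
        if flip_h:  # mirror left/right
            s = s[:, :, ::-1]
            p = p[:, ::-1]
        if flip_v:  # mirror up/down
            s = s[:, ::-1, :]
            p = p[::-1, :]
        # Flip back before flattening to return to move-index order.
        samples.append((s.copy(), np.flipud(p).flatten(), winner))
    return samples


# Toy check on a 4x9 board: one stone, all probability mass on the same square.
H, W = 4, 9
prob = np.zeros(H * W)
prob[5] = 1.0  # move index 5 = row 0, column 5
state = np.flipud(prob.reshape(H, W))[np.newaxis]  # single plane, stored flipped
samples = flip_augment(state, prob, 1.0, H, W)
for s, p, _ in samples:
    # In every augmented sample the probability mass must sit on the stone.
    assert np.array_equal(s[0], np.flipud(p.reshape(H, W)))
```

If the assertion holds for every sample, the state planes and the probabilities stay aligned under the flips, so any remaining training problem would have to come from somewhere other than this step.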