How should the action network be designed for games where pieces can move?

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

MIT License

3.27k stars 964 forks

Closed fupip closed 6 years ago

fupip commented 6 years ago

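In Go and Gomoku, a stone cannot be moved once it is placed, so the action and evaluation parts share a portion of the network. For games like chess or checkers, where pieces move, how should the action part of the network be designed? Could you offer some pointers?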

BIGBALLON commented 6 years ago

For chess, see the third paper, "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" (2017) [arXiv:1712.01815].

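Both shogi and chess basically enumerate the action space by brute force (with a few tricks to trim the dimensionality a little). The later part of the paper explains in detail which ones were used.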
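To make the "brute-force" idea concrete: for chess, the paper represents the policy as an 8×8×73 stack of planes (4,672 entries). The 8×8 part picks the square a piece moves from, and the 73 planes cover 56 "queen-style" moves (8 directions × up to 7 squares), 8 knight moves, and 9 underpromotions (3 directions × knight, bishop, or rook). The sketch below is only an illustration of that scheme, not code from this repo or from the paper: the function names, the direction ordering, and the way squares are flattened are all assumptions made for the example.

```python
# Hypothetical sketch of an AlphaZero-style chess move encoding (8 x 8 x 73).
# Squares are (file, rank) pairs in 0..7, seen from the side to move.
# The ordering of directions and planes here is an arbitrary assumption.

QUEEN_DIRS = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]
KNIGHT_DELTAS = [(1, 2), (2, 1), (2, -1), (1, -2),
                 (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
UNDERPROMO_FILES = [-1, 0, 1]                  # capture left, push, capture right
UNDERPROMO_PIECES = ["knight", "bishop", "rook"]

NUM_PLANES = 8 * 7 + 8 + 3 * 3                 # 56 + 8 + 9 = 73
NUM_ACTIONS = 8 * 8 * NUM_PLANES               # 4672


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def move_plane(from_sq, to_sq, underpromotion=None):
    """Map a move to one of the 73 planes (the 'how does the piece move' part)."""
    df = to_sq[0] - from_sq[0]
    dr = to_sq[1] - from_sq[1]
    if underpromotion is not None:
        # Planes 64..72: pawn move to the last rank, promoting to a minor piece or rook.
        return (64 + 3 * UNDERPROMO_FILES.index(df)
                + UNDERPROMO_PIECES.index(underpromotion))
    if (df, dr) in KNIGHT_DELTAS:
        # Planes 56..63: the eight knight jumps.
        return 56 + KNIGHT_DELTAS.index((df, dr))
    straight = (df == 0) != (dr == 0)
    diagonal = df != 0 and abs(df) == abs(dr)
    if straight or diagonal:
        # Planes 0..55: direction * 7 + (distance - 1).
        distance = max(abs(df), abs(dr))
        direction = QUEEN_DIRS.index((sign(df), sign(dr)))
        return direction * 7 + (distance - 1)
    raise ValueError("move does not fit the 73-plane scheme")


def encode_move(from_sq, to_sq, underpromotion=None):
    """Flatten (from-square, plane) into a single index in [0, NUM_ACTIONS)."""
    square = from_sq[0] * 8 + from_sq[1]
    return square * NUM_PLANES + move_plane(from_sq, to_sq, underpromotion)


def decode_move(index):
    """Inverse of the flattening: return ((file, rank), plane)."""
    square, plane = divmod(index, NUM_PLANES)
    return divmod(square, 8), plane
```

With an encoding like this, the policy head simply outputs `NUM_ACTIONS` logits for every position. Most entries are illegal in any given position; the paper handles this by masking illegal moves to zero probability and renormalizing over the legal ones.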

fupip commented 6 years ago

OK, thanks. I'll take a look.