Hi, I wrote a repo modeled on your code, but after training it always loses to the pure Monte Carlo tree search AI

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

MIT License


Closed cloxnu closed 4 years ago

cloxnu commented 4 years ago

I am training with keras. Here is my repo: https://github.com/CLOXnu/Omega_Gomoku_AI

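I found that on the 6_6_4 board, the trained AI places three stones in a row but then does not play the fourth stone to win; it plays somewhere else instead. In the evaluation phase of training it always loses to the pure MCTS AI running 1000 searches.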

Could anyone take a look? Many thanks 🙏

cloxnu commented 4 years ago

Solved, solved! It turned out to be a problem with the board input to the neural network, haha~
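
The thread does not say what exactly was wrong with the board input. For anyone seeing the same symptom (three in a row, then no winning fourth move), one thing worth checking is whether the input planes are built from the side to move's point of view. Below is a minimal sketch of that kind of encoding, not the code from either repo: the function name `encode_board`, the `stones` dict format, and the three-plane layout are all assumptions made for illustration.

```python
import numpy as np


def encode_board(stones, current_player, width=6, height=6):
    """Encode a board as 3 planes from the current player's point of view.

    Hypothetical layout, for illustration only:
      plane 0: stones of the player to move
      plane 1: stones of the opponent
      plane 2: all ones when player 1 is to move, all zeros otherwise

    stones: dict mapping (row, col) -> player id (1 or 2).
    """
    planes = np.zeros((3, height, width), dtype=np.float32)
    for (row, col), player in stones.items():
        # "Mine" vs "theirs" is decided relative to the side to move,
        # not relative to a fixed color.
        plane = 0 if player == current_player else 1
        planes[plane, row, col] = 1.0
    if current_player == 1:
        planes[2, :, :] = 1.0
    return planes


# Player 1 has three in a row on row 2 and is to move.
stones = {(2, 1): 1, (2, 2): 1, (2, 3): 1, (0, 0): 2, (5, 5): 2}
state = encode_board(stones, current_player=1)
print(state[0, 2])  # row 2 of the current player's plane
```

If plane 0 and plane 1 are swapped, or are filled by fixed color rather than by side to move, the network sees the same position differently during self-play and during evaluation, which could plausibly produce this kind of behavior. The sketch uses a channels-first array; Keras defaults to channels-last, so a model built with the default setting would need something like `np.transpose(state, (1, 2, 0))` first.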