A question about self-play game data

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

MIT License

3.25k stars 965 forks

Closed: GeneZC closed this issue 6 years ago

GeneZC commented 6 years ago

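If the game being trained favors the first player (for example, Gomoku without forbidden-move rules), the training data will consist largely of games won by Black. What effect does this have on the training result?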

GeneZC commented 6 years ago

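Judging from the results so far, White plays noticeably weaker, and its value estimate for the current position is almost always negative and close to -1.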
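One possible explanation for the observation above, as a minimal sketch rather than the repository's actual code: in AlphaZero-style self-play, each position is usually labeled with the final outcome from the point of view of the player to move. If Black wins most games, positions where White is to move mostly receive a target of -1, so the value head learns to predict values near -1 for White. The player ids, the `value_targets` and `mean_target_by_player` helpers, and the toy dataset below are all hypothetical and only illustrate the arithmetic.

```python
from statistics import mean

BLACK, WHITE = 1, 2  # hypothetical player ids; Black moves first


def value_targets(num_moves, winner):
    """Label every position of one game from the side-to-move's perspective.

    Returns a list of (player_to_move, z) pairs, where z is +1 if that
    player eventually won, -1 if they lost, and 0 for a draw (winner=None).
    """
    targets = []
    for move_index in range(num_moves):
        player_to_move = BLACK if move_index % 2 == 0 else WHITE
        if winner is None:
            z = 0.0
        elif winner == player_to_move:
            z = 1.0
        else:
            z = -1.0
        targets.append((player_to_move, z))
    return targets


def mean_target_by_player(games):
    """Average the value targets separately for Black-to-move and White-to-move."""
    by_player = {BLACK: [], WHITE: []}
    for num_moves, winner in games:
        for player, z in value_targets(num_moves, winner):
            by_player[player].append(z)
    return {player: mean(zs) for player, zs in by_player.items()}


# Assumed toy dataset: Black wins 9 games (9 moves each), White wins 1 (10 moves).
games = [(9, BLACK)] * 9 + [(10, WHITE)]
means = mean_target_by_player(games)
print(means)
```

With this assumed 9-to-1 split, the average target is 0.8 for Black-to-move positions and about -0.76 for White-to-move positions, so a value network fit to such data would be pushed toward strongly negative estimates for White even before it learns anything position-specific.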

GeneZC commented 6 years ago

After a long period of further training, I found that the algorithm may be able to overcome the first-move advantage problem.