junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
MIT License

Is this formula written incorrectly? #56

Open tianv opened 6 years ago

tianv commented 6 years ago

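self._u = (c_puct * self._P * np.sqrt(self._parent._n_visits) / (1 + self._n_visits))

Should this be changed to the following?

self._u = (c_puct * self._P * np.sqrt(self._parent._n_visits / (1 + self._n_visits)))

That is, move the closing parenthesis that follows self._parent._n_visits to the end, so that the square root covers the whole fraction.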

junxiaosong commented 6 years ago

The formula is correct as written: the square root applies only to self._parent._n_visits. If you are unsure, see the AlphaGo Zero paper.
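
To make the difference concrete, here is a minimal sketch comparing the two expressions. The AlphaGo Zero paper defines the exploration term as U(s, a) = c_puct · P(s, a) · sqrt(Σ_b N(s, b)) / (1 + N(s, a)), which matches the first form. The function names `u_original` and `u_proposed` and the sample values (`c_puct = 5.0`, prior `0.2`, 100 parent visits) are assumptions chosen for illustration only; they are not taken from the repository.

```python
import numpy as np


def u_original(c_puct, prior, parent_visits, visits):
    """Exploration bonus with the square root over the parent visit count only."""
    return c_puct * prior * np.sqrt(parent_visits) / (1 + visits)


def u_proposed(c_puct, prior, parent_visits, visits):
    """Variant suggested in this issue: square root over the whole ratio."""
    return c_puct * prior * np.sqrt(parent_visits / (1 + visits))


# Illustrative values only (assumptions, not the repository's settings).
c_puct, prior, parent_visits = 5.0, 0.2, 100

for visits in (0, 3, 24, 99):
    original = u_original(c_puct, prior, parent_visits, visits)
    proposed = u_proposed(c_puct, prior, parent_visits, visits)
    print(f"visits={visits:3d}  original={original:.3f}  proposed={proposed:.3f}")
```

With the parent count fixed, the first form falls off like 1 / (1 + n) as a child is visited more, while the second only falls off like 1 / sqrt(1 + n), so it would keep a larger exploration bonus on moves that have already been visited many times.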