junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
MIT License

Keras support #18

Closed MingxuZhang closed 6 years ago

MingxuZhang commented 6 years ago

Dear Junxiao: Thank you very much for sharing your code on GitHub. I have learned a lot from your work.

I am a fan of deep reinforcement learning and am still learning it, and your work has helped me understand the mechanism of AlphaZero. Meanwhile, I am studying Keras, so I rewrote "policy_value_net.py" with Keras. I have tested my code and it passes under Keras 2.0.5 with tensorflow-gpu 1.2.1 as the backend.
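
Roughly speaking, the Keras version has the following shape (a simplified sketch for discussion only; the exact layer sizes and names in the pull request may differ):

```python
from keras.models import Model
from keras.layers import Input, Conv2D, Dense, Flatten
from keras.optimizers import Adam

def build_policy_value_net(board_width, board_height):
    # input: 4 feature planes describing the current board state
    inp = Input(shape=(4, board_height, board_width))

    # shared convolutional trunk
    x = Conv2D(32, (3, 3), padding="same", activation="relu",
               data_format="channels_first")(inp)
    x = Conv2D(64, (3, 3), padding="same", activation="relu",
               data_format="channels_first")(x)
    x = Conv2D(128, (3, 3), padding="same", activation="relu",
               data_format="channels_first")(x)

    # policy head: move probabilities over every board position
    p = Conv2D(4, (1, 1), activation="relu", data_format="channels_first")(x)
    p = Flatten()(p)
    policy = Dense(board_width * board_height, activation="softmax")(p)

    # value head: scalar evaluation of the position in [-1, 1]
    v = Conv2D(2, (1, 1), activation="relu", data_format="channels_first")(x)
    v = Flatten()(v)
    v = Dense(64, activation="relu")(v)
    value = Dense(1, activation="tanh")(v)

    model = Model(inputs=inp, outputs=[policy, value])
    model.compile(optimizer=Adam(),
                  loss=["categorical_crossentropy", "mean_squared_error"])
    return model
```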

I really hope that I can contribute to this project, and I sincerely hope you will accept this pull request. I'm looking forward to your reply.

Yours, Mingxu Zhang

junxiaosong commented 6 years ago

Thank you for your contribution to the project. I will definitely merge this pull request, but please allow me some time to review the changes.

junxiaosong commented 6 years ago

Hi Mingxu, I see that you set use_bias=False for all Conv2D layers. I know this is common when batch normalization is used, but in our case we do not use BN. Is there a reason to omit the bias terms here?
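
To illustrate what I mean (just a sketch, not code from the pull request): the convolution bias is redundant only when a BatchNormalization layer directly follows the convolution, because BN's learned shift (beta) plays the same role.

```python
from keras.layers import Conv2D, BatchNormalization, Activation

def conv_block_with_bn(x, filters):
    # bias can be dropped: BN re-centers the activations anyway
    x = Conv2D(filters, (3, 3), padding="same", use_bias=False)(x)
    x = BatchNormalization()(x)
    return Activation("relu")(x)

def conv_block_without_bn(x, filters):
    # no BN here, so the bias term should be kept (the Keras default)
    return Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
```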

MingxuZhang commented 6 years ago

Oh, it's my fault. All Conv2D layers should be corrected to "use_bias=True" (which is the default).
The reason I made the mistake is that I misunderstood the meaning of "untie_biases" when I checked the reference for "class lasagne.layers.Conv2DLayer(incoming, num_filters, filter_size, stride=(1, 1), pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=True, convolution=theano.tensor.nnet.conv2d, **kwargs)".
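
For reference (this is my reading of the two docs, not the exact diff): in Lasagne, untie_biases=False does not remove the bias; it means one bias per output channel, shared across spatial positions (untie_biases=True would learn a separate bias per position). The matching Keras setting is therefore just the default:

```python
from keras.layers import Conv2D

conv = Conv2D(64, (3, 3), padding="same", activation="relu",
              use_bias=True)  # use_bias=True is already the default in Keras
```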

junxiaosong commented 6 years ago

Hi Mingxu, I have merged your first 4 commits and made some minor changes.