donkirkby / zero-play

Teach a computer to play any game.
https://donkirkby.github.io/zero-play/
MIT License

Neural networks #36

Open · donkirkby opened this issue 3 years ago

donkirkby commented 3 years ago

Bring back support for a neural network heuristic.
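
For reference, a rough sketch of what a network-backed heuristic could look like: a tiny Keras model that maps a board array to a value in [-1, 1] plus a probability for each move. The `NeuralHeuristic` class, the `analyse()` signature, and the sizes below are placeholders, not zero-play's actual interface.

```python
# Rough sketch only: class name, analyse() signature, and sizes are placeholders,
# not zero-play's real API. The model scores a flattened board with a value in
# [-1, 1] and a softmax policy over the possible moves.
import numpy as np
import tensorflow as tf


def build_model(board_size: int = 9, move_count: int = 9) -> tf.keras.Model:
    board_input = tf.keras.Input(shape=(board_size,), name='board')
    hidden = tf.keras.layers.Dense(64, activation='relu')(board_input)
    value = tf.keras.layers.Dense(1, activation='tanh', name='value')(hidden)
    policy = tf.keras.layers.Dense(move_count, activation='softmax',
                                   name='policy')(hidden)
    return tf.keras.Model(board_input, [value, policy])


class NeuralHeuristic:
    def __init__(self, model: tf.keras.Model):
        self.model = model

    def analyse(self, board: np.ndarray):
        """ Return (value, policy) for a single board position. """
        batch = board.reshape(1, -1).astype(np.float32)
        value, policy = self.model(batch, training=False)
        return float(value[0, 0]), policy.numpy()[0]
```

The idea would be to drop something like this in wherever the playout-based heuristic is used now and let the search consume the value and policy.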

donkirkby commented 9 months ago

This article on regression looks helpful.
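
Whatever that article covers, the core of a value network is ordinary regression: fit board features to final game results with a mean-squared-error loss. A throwaway Keras sketch with fake data, just to pin down the shape of the problem:

```python
# Throwaway sketch, not taken from the article: the value head is plain regression.
# Targets are final game results in [-1, 1]; the data here is random placeholder data.
import numpy as np
import tensorflow as tf

boards = np.random.randint(-1, 2, size=(1000, 9)).astype(np.float32)    # fake positions
outcomes = np.random.uniform(-1, 1, size=(1000, 1)).astype(np.float32)  # fake results

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(9,)),
    tf.keras.layers.Dense(1, activation='tanh'),
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(boards, outcomes, epochs=5, batch_size=32, verbose=0)
```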

donkirkby commented 9 months ago

Now I'm having memory problems. See if this article on memory leaks is helpful.
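
Two usual suspects worth checking first (guesses, not a confirmed diagnosis of zero-play's leak): calling `model.predict()` once per position inside the search loop, and rebuilding models repeatedly without clearing the Keras session.

```python
# Guesses at common TensorFlow 2 memory-growth patterns, not a confirmed diagnosis.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(9,))])
board = np.zeros((1, 9), dtype=np.float32)

# Leak-prone in a tight loop: predict() spins up a full prediction loop each call.
# value = model.predict(board)

# Usually friendlier inside a search loop: call the model directly on a small batch.
value = model(board, training=False).numpy()

# When training one model generation after another in the same process,
# drop the old graphs before building the next model.
tf.keras.backend.clear_session()
```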

donkirkby commented 8 months ago

See if Maia Chess is a useful guide for how to set up a neural network for board games. It's a Python project that uses TensorFlow. Leela Chess Zero looks similar and might be easier to get running.

donkirkby commented 8 months ago

Leela Chess Zero turned out to have a very complicated model, just over 150 layers or components!

Model Summary

```
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 112, 8, 8)] 0
input/conv2d (Conv2D) (None, 64, 8, 8) 64512 input_1[0][0]
input/bn (BatchNormalization) (None, 64, 8, 8) 256 input/conv2d[0][0]
activation (Activation) (None, 64, 8, 8) 0 input/bn[0][0]
residual_1/1/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation[0][0]
residual_1/1/bn (BatchNormaliza (None, 64, 8, 8) 192 residual_1/1/conv2d[0][0]
activation_1 (Activation) (None, 64, 8, 8) 0 residual_1/1/bn[0][0]
residual_1/2/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_1[0][0]
residual_1/2/bn (BatchNormaliza (None, 64, 8, 8) 256 residual_1/2/conv2d[0][0]
global_average_pooling2d (Globa (None, 64) 0 residual_1/2/bn[0][0]
residual_1/se/se/dense1 (Dense) (None, 32) 2080 global_average_pooling2d[0][0]
activation_2 (Activation) (None, 32) 0 residual_1/se/se/dense1[0][0]
residual_1/se/se/dense2 (Dense) (None, 128) 4224 activation_2[0][0]
apply_squeeze_excitation (Apply (None, 64, 8, 8) 0 residual_1/2/bn[0][0] residual_1/se/se/dense2[0][0]
add (Add) (None, 64, 8, 8) 0 activation[0][0] apply_squeeze_excitation[0][0]
activation_3 (Activation) (None, 64, 8, 8) 0 add[0][0]
residual_2/1/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_3[0][0]
residual_2/1/bn (BatchNormaliza (None, 64, 8, 8) 192 residual_2/1/conv2d[0][0]
activation_4 (Activation) (None, 64, 8, 8) 0 residual_2/1/bn[0][0]
residual_2/2/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_4[0][0]
residual_2/2/bn (BatchNormaliza (None, 64, 8, 8) 256 residual_2/2/conv2d[0][0]
global_average_pooling2d_1 (Glo (None, 64) 0 residual_2/2/bn[0][0]
residual_2/se/se/dense1 (Dense) (None, 32) 2080 global_average_pooling2d_1[0][0]
activation_5 (Activation) (None, 32) 0 residual_2/se/se/dense1[0][0]
residual_2/se/se/dense2 (Dense) (None, 128) 4224 activation_5[0][0]
apply_squeeze_excitation_1 (App (None, 64, 8, 8) 0 residual_2/2/bn[0][0] residual_2/se/se/dense2[0][0]
add_1 (Add) (None, 64, 8, 8) 0 activation_3[0][0] apply_squeeze_excitation_1[0][0]
activation_6 (Activation) (None, 64, 8, 8) 0 add_1[0][0]
residual_3/1/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_6[0][0]
residual_3/1/bn (BatchNormaliza (None, 64, 8, 8) 192 residual_3/1/conv2d[0][0]
activation_7 (Activation) (None, 64, 8, 8) 0 residual_3/1/bn[0][0]
residual_3/2/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_7[0][0]
residual_3/2/bn (BatchNormaliza (None, 64, 8, 8) 256 residual_3/2/conv2d[0][0]
global_average_pooling2d_2 (Glo (None, 64) 0 residual_3/2/bn[0][0]
residual_3/se/se/dense1 (Dense) (None, 32) 2080 global_average_pooling2d_2[0][0]
activation_8 (Activation) (None, 32) 0 residual_3/se/se/dense1[0][0]
residual_3/se/se/dense2 (Dense) (None, 128) 4224 activation_8[0][0]
apply_squeeze_excitation_2 (App (None, 64, 8, 8) 0 residual_3/2/bn[0][0] residual_3/se/se/dense2[0][0]
add_2 (Add) (None, 64, 8, 8) 0 activation_6[0][0] apply_squeeze_excitation_2[0][0]
activation_9 (Activation) (None, 64, 8, 8) 0 add_2[0][0]
residual_4/1/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_9[0][0]
residual_4/1/bn (BatchNormaliza (None, 64, 8, 8) 192 residual_4/1/conv2d[0][0]
activation_10 (Activation) (None, 64, 8, 8) 0 residual_4/1/bn[0][0]
residual_4/2/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_10[0][0]
residual_4/2/bn (BatchNormaliza (None, 64, 8, 8) 256 residual_4/2/conv2d[0][0]
global_average_pooling2d_3 (Glo (None, 64) 0 residual_4/2/bn[0][0]
residual_4/se/se/dense1 (Dense) (None, 32) 2080 global_average_pooling2d_3[0][0]
activation_11 (Activation) (None, 32) 0 residual_4/se/se/dense1[0][0]
residual_4/se/se/dense2 (Dense) (None, 128) 4224 activation_11[0][0]
apply_squeeze_excitation_3 (App (None, 64, 8, 8) 0 residual_4/2/bn[0][0] residual_4/se/se/dense2[0][0]
add_3 (Add) (None, 64, 8, 8) 0 activation_9[0][0] apply_squeeze_excitation_3[0][0]
activation_12 (Activation) (None, 64, 8, 8) 0 add_3[0][0]
residual_5/1/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_12[0][0]
residual_5/1/bn (BatchNormaliza (None, 64, 8, 8) 192 residual_5/1/conv2d[0][0]
activation_13 (Activation) (None, 64, 8, 8) 0 residual_5/1/bn[0][0]
residual_5/2/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_13[0][0]
residual_5/2/bn (BatchNormaliza (None, 64, 8, 8) 256 residual_5/2/conv2d[0][0]
global_average_pooling2d_4 (Glo (None, 64) 0 residual_5/2/bn[0][0]
residual_5/se/se/dense1 (Dense) (None, 32) 2080 global_average_pooling2d_4[0][0]
activation_14 (Activation) (None, 32) 0 residual_5/se/se/dense1[0][0]
residual_5/se/se/dense2 (Dense) (None, 128) 4224 activation_14[0][0]
apply_squeeze_excitation_4 (App (None, 64, 8, 8) 0 residual_5/2/bn[0][0] residual_5/se/se/dense2[0][0]
add_4 (Add) (None, 64, 8, 8) 0 activation_12[0][0] apply_squeeze_excitation_4[0][0]
activation_15 (Activation) (None, 64, 8, 8) 0 add_4[0][0]
residual_6/1/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_15[0][0]
residual_6/1/bn (BatchNormaliza (None, 64, 8, 8) 192 residual_6/1/conv2d[0][0]
activation_16 (Activation) (None, 64, 8, 8) 0 residual_6/1/bn[0][0]
residual_6/2/conv2d (Conv2D) (None, 64, 8, 8) 36864 activation_16[0][0]
residual_6/2/bn (BatchNormaliza (None, 64, 8, 8) 256 residual_6/2/conv2d[0][0]
global_average_pooling2d_5 (Glo (None, 64) 0 residual_6/2/bn[0][0]
residual_6/se/se/dense1 (Dense) (None, 32) 2080 global_average_pooling2d_5[0][0]
activation_17 (Activation) (None, 32) 0 residual_6/se/se/dense1[0][0]
residual_6/se/se/dense2 (Dense) (None, 128) 4224 activation_17[0][0]
apply_squeeze_excitation_5 (App (None, 64, 8, 8) 0 residual_6/2/bn[0][0] residual_6/se/se/dense2[0][0]
add_5 (Add) (None, 64, 8, 8) 0 activation_15[0][0] apply_squeeze_excitation_5[0][0]
activation_18 (Activation) (None, 64, 8, 8) 0 add_5[0][0]
tf.compat.v1.transpose (TFOpLam (None, 8, 8, 64) 0 activation_18[0][0]
tf.reshape (TFOpLambda) (None, 64, 64) 0 tf.compat.v1.transpose[0][0]
policy/embedding (Dense) (None, 64, 64) 4160 tf.reshape[0][0]
policy/enc_layer_1/mha/wq (Dens (None, 64, 64) 4160 policy/embedding[0][0]
tf.compat.v1.shape (TFOpLambda) (3,) 0 policy/enc_layer_1/mha/wq[0][0]
tf.__operators__.getitem (Slici () 0 tf.compat.v1.shape[0][0]
policy/enc_layer_1/mha/wk (Dens (None, 64, 64) 4160 policy/embedding[0][0]
tf.reshape_2 (TFOpLambda) (None, 64, 4, 16) 0 policy/enc_layer_1/mha/wk[0][0] tf.__operators__.getitem[0][0]
tf.compat.v1.transpose_2 (TFOpL (None, 4, 64, 16) 0 tf.reshape_2[0][0]
tf.compat.v1.shape_1 (TFOpLambd (4,) 0 tf.compat.v1.transpose_2[0][0]
tf.reshape_1 (TFOpLambda) (None, 64, 4, 16) 0 policy/enc_layer_1/mha/wq[0][0] tf.__operators__.getitem[0][0]
tf.__operators__.getitem_1 (Sli () 0 tf.compat.v1.shape_1[0][0]
tf.compat.v1.transpose_1 (TFOpL (None, 4, 64, 16) 0 tf.reshape_1[0][0]
tf.cast (TFOpLambda) () 0 tf.__operators__.getitem_1[0][0]
tf.linalg.matmul (TFOpLambda) (None, 4, 64, 64) 0 tf.compat.v1.transpose_1[0][0] tf.compat.v1.transpose_2[0][0]
tf.math.sqrt (TFOpLambda) () 0 tf.cast[0][0]
policy/enc_layer_1/mha/wv (Dens (None, 64, 64) 4160 policy/embedding[0][0]
tf.math.truediv (TFOpLambda) (None, 4, 64, 64) 0 tf.linalg.matmul[0][0] tf.math.sqrt[0][0]
tf.reshape_3 (TFOpLambda) (None, 64, 4, 16) 0 policy/enc_layer_1/mha/wv[0][0] tf.__operators__.getitem[0][0]
tf.nn.softmax (TFOpLambda) (None, 4, 64, 64) 0 tf.math.truediv[0][0]
tf.compat.v1.transpose_3 (TFOpL (None, 4, 64, 16) 0 tf.reshape_3[0][0]
tf.linalg.matmul_1 (TFOpLambda) (None, 4, 64, 16) 0 tf.nn.softmax[0][0] tf.compat.v1.transpose_3[0][0]
tf.compat.v1.transpose_4 (TFOpL (None, 64, 4, 16) 0 tf.linalg.matmul_1[0][0]
tf.reshape_4 (TFOpLambda) (None, None, 64) 0 tf.compat.v1.transpose_4[0][0] tf.__operators__.getitem[0][0]
policy/enc_layer_1/mha/dense (D (None, None, 64) 4160 tf.reshape_4[0][0]
tf.math.multiply (TFOpLambda) (None, 64, 64) 0 policy/embedding[0][0]
policy/enc_layer_1/dropout1 (Dr (None, None, 64) 0 policy/enc_layer_1/mha/dense[0][0
tf.__operators__.add (TFOpLambd (None, 64, 64) 0 tf.math.multiply[0][0] policy/enc_layer_1/dropout1[0][0]
policy/enc_layer_1/ln1 (LayerNo (None, 64, 64) 128 tf.__operators__.add[0][0]
policy/enc_layer_1/ffn/dense1 ( (None, 64, 128) 8320 policy/enc_layer_1/ln1[0][0]
policy/enc_layer_1/ffn/dense2 ( (None, 64, 64) 8256 policy/enc_layer_1/ffn/dense1[0][
tf.math.multiply_1 (TFOpLambda) (None, 64, 64) 0 policy/enc_layer_1/ln1[0][0]
policy/enc_layer_1/dropout2 (Dr (None, 64, 64) 0 policy/enc_layer_1/ffn/dense2[0][
tf.__operators__.add_1 (TFOpLam (None, 64, 64) 0 tf.math.multiply_1[0][0] policy/enc_layer_1/dropout2[0][0]
policy/enc_layer_1/ln2 (LayerNo (None, 64, 64) 128 tf.__operators__.add_1[0][0]
policy/attention/wk (Dense) (None, 64, 64) 4160 policy/enc_layer_1/ln2[0][0]
tf.compat.v1.shape_2 (TFOpLambd (3,) 0 policy/attention/wk[0][0]
tf.__operators__.getitem_2 (Sli () 0 tf.compat.v1.shape_2[0][0]
tf.__operators__.getitem_3 (Sli (None, 8, 64) 0 policy/attention/wk[0][0]
tf.cast_1 (TFOpLambda) () 0 tf.__operators__.getitem_2[0][0]
policy/attention/ppo (Dense) (None, 8, 4) 256 tf.__operators__.getitem_3[0][0]
tf.math.sqrt_1 (TFOpLambda) () 0 tf.cast_1[0][0]
tf.compat.v1.transpose_5 (TFOpL (None, 4, 8) 0 policy/attention/ppo[0][0]
tf.math.multiply_2 (TFOpLambda) (None, 4, 8) 0 tf.compat.v1.transpose_5[0][0] tf.math.sqrt_1[0][0]
policy/attention/wq (Dense) (None, 64, 64) 4160 policy/enc_layer_1/ln2[0][0]
tf.__operators__.getitem_4 (Sli (None, 3, 8) 0 tf.math.multiply_2[0][0]
tf.__operators__.getitem_5 (Sli (None, 1, 8) 0 tf.math.multiply_2[0][0]
tf.linalg.matmul_2 (TFOpLambda) (None, 64, 64) 0 policy/attention/wq[0][0] policy/attention/wk[0][0]
tf.__operators__.add_2 (TFOpLam (None, 3, 8) 0 tf.__operators__.getitem_4[0][0] tf.__operators__.getitem_5[0][0]
tf.__operators__.getitem_6 (Sli (None, 8, 8) 0 tf.linalg.matmul_2[0][0]
tf.__operators__.getitem_7 (Sli (None, 1, 8) 0 tf.__operators__.add_2[0][0]
tf.__operators__.getitem_8 (Sli (None, 1, 8) 0 tf.__operators__.add_2[0][0]
tf.__operators__.getitem_9 (Sli (None, 1, 8) 0 tf.__operators__.add_2[0][0]
tf.__operators__.add_3 (TFOpLam (None, 8, 8) 0 tf.__operators__.getitem_6[0][0] tf.__operators__.getitem_7[0][0]
tf.__operators__.add_4 (TFOpLam (None, 8, 8) 0 tf.__operators__.getitem_6[0][0] tf.__operators__.getitem_8[0][0]
tf.__operators__.add_5 (TFOpLam (None, 8, 8) 0 tf.__operators__.getitem_6[0][0] tf.__operators__.getitem_9[0][0]
value/conv2d (Conv2D) (None, 32, 8, 8) 2048 activation_18[0][0]
moves_left/conv2d (Conv2D) (None, 8, 8, 8) 512 activation_18[0][0]
tf.expand_dims (TFOpLambda) (None, 8, 8, 1) 0 tf.__operators__.add_3[0][0]
tf.expand_dims_1 (TFOpLambda) (None, 8, 8, 1) 0 tf.__operators__.add_4[0][0]
tf.expand_dims_2 (TFOpLambda) (None, 8, 8, 1) 0 tf.__operators__.add_5[0][0]
value/bn (BatchNormalization) (None, 32, 8, 8) 96 value/conv2d[0][0]
moves_left/bn (BatchNormalizati (None, 8, 8, 8) 24 moves_left/conv2d[0][0]
tf.concat (TFOpLambda) (None, 8, 8, 3) 0 tf.expand_dims[0][0] tf.expand_dims_1[0][0] tf.expand_dims_2[0][0]
activation_19 (Activation) (None, 32, 8, 8) 0 value/bn[0][0]
activation_20 (Activation) (None, 8, 8, 8) 0 moves_left/bn[0][0]
tf.reshape_5 (TFOpLambda) (None, 8, 24) 0 tf.concat[0][0]
flatten (Flatten) (None, 2048) 0 activation_19[0][0]
flatten_1 (Flatten) (None, 512) 0 activation_20[0][0]
tf.math.truediv_2 (TFOpLambda) (None, 64, 64) 0 tf.linalg.matmul_2[0][0] tf.math.sqrt_1[0][0]
tf.math.truediv_1 (TFOpLambda) (None, 8, 24) 0 tf.reshape_5[0][0] tf.math.sqrt_1[0][0]
value/dense1 (Dense) (None, 128) 262272 flatten[0][0]
moves_left/dense1 (Dense) (None, 128) 65664 flatten_1[0][0]
apply_attention_policy_map (App (None, 1858) 0 tf.math.truediv_2[0][0] tf.math.truediv_1[0][0]
value/dense2 (Dense) (None, 3) 387 value/dense1[0][0]
moves_left/dense2 (Dense) (None, 1) 129 moves_left/dense1[0][0]
==================================================================================================
Total params: 924,988
Trainable params: 923,244
Non-trainable params: 1,744
__________________________________________________________________________________________________
```
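
For orientation, most of that dump is six copies of one repeating unit: a squeeze-and-excitation residual block, followed by separate policy, value, and moves-left heads. Here's a stripped-down Keras sketch of the pattern. The filter counts, input planes, and the plain dense policy head are illustrative only; Leela's real network is channels-first, stacks many more blocks, and uses an attention-based policy head.

```python
# Stripped-down sketch of the repeating pattern in the summary above: a
# squeeze-and-excitation residual block plus small policy and value heads.
# Sizes are illustrative, not Leela Chess Zero's real configuration.
import tensorflow as tf
from tensorflow.keras import layers


def se_residual_block(x, filters=64, se_channels=32):
    skip = x
    y = layers.Conv2D(filters, 3, padding='same', use_bias=False)(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same', use_bias=False)(y)
    y = layers.BatchNormalization()(y)
    # Squeeze-and-excitation: pool each channel to one number, then rescale channels.
    squeeze = layers.GlobalAveragePooling2D()(y)
    excite = layers.Dense(se_channels, activation='relu')(squeeze)
    excite = layers.Dense(filters, activation='sigmoid')(excite)
    y = layers.Multiply()([y, layers.Reshape((1, 1, filters))(excite)])
    return layers.Activation('relu')(layers.Add()([skip, y]))


board = tf.keras.Input(shape=(8, 8, 12))
x = layers.Conv2D(64, 3, padding='same', use_bias=False)(board)
x = layers.Activation('relu')(layers.BatchNormalization()(x))
for _ in range(2):  # the dump above stacks six of these blocks
    x = se_residual_block(x)
policy = layers.Dense(1858, name='policy')(layers.Flatten()(x))
value = layers.Dense(1, activation='tanh', name='value')(
    layers.Dense(128, activation='relu')(layers.Flatten()(x)))
model = tf.keras.Model(board, [policy, value])
model.summary()
```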

Maybe this TensorFlow chess tutorial will be easier to learn from.

donkirkby commented 8 months ago

That tutorial was missing the input data, but this one might be better.
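
On the input-data side, the plan would presumably be the usual AlphaZero-style recipe: record self-play positions with their MCTS visit counts and final results, then turn those into training arrays. A hypothetical sketch of that conversion (the record fields and shapes are placeholders):

```python
# Hypothetical sketch of building training arrays from self-play logs. Assumes each
# record holds a board array, the MCTS visit counts per move, and the final result
# from the point of view of the player to move. Field names are placeholders.
from dataclasses import dataclass

import numpy as np


@dataclass
class SelfPlayRecord:
    board: np.ndarray         # e.g. shape (8, 8, planes)
    visit_counts: np.ndarray  # one count per legal move index
    result: float             # +1 win, 0 draw, -1 loss for the player to move


def to_training_arrays(records):
    boards = np.stack([r.board for r in records]).astype(np.float32)
    # Normalise visit counts into a policy target that sums to 1.
    policies = np.stack([r.visit_counts / r.visit_counts.sum() for r in records])
    values = np.array([[r.result] for r in records], dtype=np.float32)
    return boards, policies.astype(np.float32), values
```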

donkirkby commented 7 months ago

Found a Reversi Alpha Zero project that should be a good example.