ihavnoid / leelaz-ninenine

A 9x9 fork of leela-zero, intended as a learning exercise that can be done with limited resources

What

This is a fork of "leela-zero" that reduces the problem size to something that can be trained by people or organizations with limited resources (e.g., a budget of less than $5,000 USD, or even a single desktop if you are patient enough). In essence, it is leela-zero adapted to 9x9 go, plus some other tweaks as necessary.

If you want the full 19x19 version, please go to http://github.com/gcp/leela-zero/ and look for a good weight file from http://zero.sjeng.org/.

So, why the fork?

As stated in the leela-zero README, training a network for 19x19 go would take thousands of years' worth of compute time, unless you are a mega-corporation with millions of servers or a dedicated chip design team. This is not ideal for people who want to learn and try out new ideas. The goal here is to provide a smaller problem that is reachable for an individual or a small university research group.

Compiling

Requirements

A C++14 compiler, plus Boost, OpenBLAS (or another BLAS), OpenCL, and zlib; the corresponding Ubuntu packages are installed in the example below.

Example of compiling and running - Ubuntu

# Test for OpenCL support & compatibility
sudo apt install clinfo && clinfo

# Clone github repo
git clone https://github.com/ihavnoid/leelaz-ninenine/
cd leelaz-ninenine/src
sudo apt install libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev
make
cd ..

# An untrained neural net with random parameters is provided.  It really does play random moves.
src/leelaz --weights example-data/initial_9x9.txt
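
Once built, the engine can also be scripted: like leela-zero, it talks GTP. The sketch below is a minimal Python driver; the --gtp flag is inherited from leela-zero (an assumption for this fork), and the reply handling follows standard GTP framing (replies start with '=' or '?' and end with a blank line).

import subprocess

engine = subprocess.Popen(
    ["src/leelaz", "--gtp", "--weights", "example-data/initial_9x9.txt"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

def gtp(command):
    # Send one GTP command and collect the reply, which is terminated
    # by a blank line.
    engine.stdin.write(command + "\n")
    engine.stdin.flush()
    reply = []
    while True:
        line = engine.stdout.readline()
        if line.strip() == "" and reply:
            break
        reply.append(line.rstrip())
    return "\n".join(reply)

print(gtp("boardsize 9"))
print(gtp("genmove b"))
gtp("quit")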

Weights format

The weights file is a text file with each line containing a row of coefficients. The layout of the network is as in the AlphaGo Zero paper (except for the 9x9 board), but any number of residual blocks is allowed, and any number of outputs (filters) per layer, as long as the latter is the same for all layers. The program autodetects these dimensions on startup. The first line contains a version number.
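
To make the autodetection concrete, here is a minimal sketch in Python. It assumes this fork keeps leela-zero's documented line layout: one line of convolution weights plus one line of channel biases per convolution, one line each of batchnorm means and variances, giving 1 version line, 4 lines for the input convolution, 14 lines for the policy and value heads, and 8 lines per residual block.

import sys

def detect_network(path):
    with open(path) as f:
        lines = f.read().splitlines()
    version = int(lines[0])
    # Everything between the input convolution and the two heads
    # belongs to the residual tower, 8 lines per block.
    tower_lines = len(lines) - 1 - 4 - 14
    if tower_lines % 8 != 0:
        raise ValueError("unexpected number of weight rows")
    blocks = tower_lines // 8
    # The channel biases of the input convolution (third line of the
    # file) contain one value per filter.
    filters = len(lines[2].split())
    return version, blocks, filters

print(detect_network(sys.argv[1]))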

There are 18 inputs to the first layer, instead of 17 as in the paper. The original AlphaGo Zero design has a slight imbalance in that it is easier for the black player to see the board edge (due to how padding works in neural networks). This has been fixed in Leela Zero. The inputs are:

1) Side to move stones at time T=0
2) Side to move stones at time T=-1  (0 if T=0)
...
8) Side to move stones at time T=-7  (0 if T<=6)
9) Other side stones at time T=0
10) Other side stones at time T=-1   (0 if T=0)
...
16) Other side stones at time T=-7   (0 if T<=6)
17) All 1 if black is to move, 0 otherwise
18) All 1 if white is to move, 0 otherwise

Each of these forms a 9 x 9 bit plane.
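
As an illustration, the following Python sketch assembles these 18 planes from a position history; the history representation (one pair of 9x9 0/1 arrays per past position, most recent first) is an assumption made for this example, not a format used by the engine.

import numpy as np

BOARD_SIZE = 9

def input_planes(history, to_move):
    # history: list of (black_stones, white_stones) pairs of 9x9 0/1
    # arrays, most recent position first, at most 8 entries.
    # to_move: 0 if black is to move, 1 if white.
    planes = np.zeros((18, BOARD_SIZE, BOARD_SIZE), dtype=np.float32)
    for t, (black, white) in enumerate(history[:8]):
        own, other = (black, white) if to_move == 0 else (white, black)
        planes[t] = own         # planes 1-8: side to move at T=0..-7
        planes[8 + t] = other   # planes 9-16: other side at T=0..-7
    planes[16] = 1.0 if to_move == 0 else 0.0   # plane 17: black to move
    planes[17] = 1.0 if to_move == 1 else 0.0   # plane 18: white to move
    return planes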

The training/tf directory contains a 10-residual-block version in TensorFlow format, in the tfprocess.py file.

Training

Getting the data

In addition to the methods described in the leela-zero README, leelaz-ninenine has an autotrain command, which automatically runs a given number of games and then stores the training data under the given name.

See minitrain.sh for an example of how this command is used.

Training data format

The training data consists of files with the following data, all in text format (this follows leela-zero's format, with sizes adapted to the 9x9 board):

16 lines of hexadecimal strings, each encoding one of the first 16 input planes from the previous section
1 line with a number indicating the side to move (0 = black, 1 = white), from which the last two input planes can be reconstructed
1 line with 82 (9x9 + 1) floating point numbers, the search probabilities (visit counts) at the end of the search
1 line with either 1 or -1, corresponding to the outcome of the game for the player to move

Running the training

The training/tf path contains the TensorFlow code, which reads the most recent 20% of the data matched by the given glob, runs 12000 batches, saves the state, and terminates on its own. The intention is to keep adding more data from self-play (the minitrain.sh script) and restart the training sequence every 12000 batches. See training/tf/trainpipe.sh for an example.
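
For illustration, the restart loop could look like the following Python sketch; the script names are the ones referenced in this README, but their arguments (if any) are omitted, so treat this as a hypothetical driver along the lines of trainpipe.sh rather than the repo's actual pipeline.

import subprocess

# Alternate self-play data generation with a 12000-batch training run,
# restarting indefinitely.  Each training run reads the latest data,
# saves its state, and terminates on its own.
while True:
    subprocess.run(["./minitrain.sh"], check=True)
    subprocess.run(["training/tf/trainpipe.sh"], check=True)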

Related links

License

The code is released under the GPLv3 or later, except for ThreadPool.h, cl2.hpp and the clblast_level3 subdir, which have specific licenses (compatible with GPLv3) mentioned in those files.