lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

Student Research Project Help #343

Open MarkTakken opened 4 years ago

MarkTakken commented 4 years ago

Hi. I’m a high school Go and AI enthusiast, and I am working on a KataGo-related project in the context of a three-year high school science research class.  I would greatly appreciate assistance with training KataGo and editing the code for the purposes described below.

The goal of the project in the near future is to train an agent to play Go, change the structure of the convolutions in the neural net (but not the weights!) to play Toroidal Go, and compare the strength of play to that of an agent trained specifically to play Toroidal Go.  (Toroidal Go is played by connecting each point on the edge of the board to the point on the opposite side.)  For the past several months I have been working off of the implementation Alpha-Zero-General (https://github.com/suragnair/alpha-zero-general) due to its simplicity.  However, I have found the training to be poor and much too slow.

Consequently, I now plan to try to use KataGo instead because of its fast and sophisticated training.  However, despite having read through the instructions for self-play training found in SelfplayTraining.md, I am rather at a loss for where and how to train KataGo.  (Probably I should use Google Cloud, but I do not yet know how to use it.)  In addition, although I have a vague idea of where to change the code for training on Toroidal Go, I do not currently see how to change the neural net appropriately.  Thus, I would be very grateful if someone could help me with the following:

  1. Where and how to train KataGo.
  2. Changing the neural net structure for Toroidal Go.  In particular, my idea is to change the convolutions so that they, instead of padding the inputs with a ring of 0s, use wrap-around padding.  I have an intuition that this should work, because the only difference between Toroidal Go and Go is the wrap-around board structure of the former.
  3. Changing the code for the game rules.  All I should have to change is the code dictating when two points are adjacent and how board symmetries are generated.
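For example, the difference between the two padding schemes can be sketched with NumPy (illustrative shapes only, not KataGo's actual data layout):

```python
import numpy as np

# A tiny 3x3 "board" of feature values (illustrative only).
board = np.arange(9).reshape(3, 3)

# Ordinary convolution padding: a ring of zeros around the board.
zero_pad = np.pad(board, 1, mode="constant", constant_values=0)

# Toroidal padding: the ring is copied from the opposite edges, so a
# 3x3 convolution at an edge point sees the wrapped-around values.
wrap_pad = np.pad(board, 1, mode="wrap")

# The top-left padding cell comes from the board's bottom-right corner:
assert wrap_pad[0, 0] == board[-1, -1]
assert zero_pad[0, 0] == 0
```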

Thanks.

lightvector commented 4 years ago

Hi Mark,

Glad you're interested! I'm pretty sure someone out there has attempted something like this and even asked me questions and for help before, but I'm coming up a bit blank when I try to search for it and I don't remember where it was - discord, or forums, or something like that - and I don't recall how far they got or whether they released their results. So I guess I'll just answer fresh here:

Before you do anything else, you should learn to compile KataGo on your own, and then you should attempt to run a full training loop (such as on a 9x9 board) with KataGo exactly as it is right now. Instructions to compile are at https://github.com/lightvector/KataGo#compiling-katago and as you mentioned, instructions for running the whole training loop are at https://github.com/lightvector/KataGo/blob/master/SelfplayTraining.md.

There's no need to be "at a loss for where and how" - just take it step by step. Acquire a machine with a suitable GPU (e.g. a 2080 Ti) or rent one online. Install CMake and the necessary dependencies. Try to build KataGo. Try running the basic commands like ./katago benchmark and ./katago gtp and see if they're working. Then, start running some of the example commands described in the training document like ./katago selfplay and see how they behave. Once you have some data, install TensorFlow 1.15 and see if you can run the train.py script. And so on.

If you don't know how to "use Google Cloud", then again, you can solve that just by being proactive and trying things, breaking it down step by step. First, search online for the place where you sign up. Click through the signup flow in your browser to make an account. Provide a credit card or whatever means you plan to pay with, and claim your $300 of free credit. Find and follow a tutorial for how to start a machine and ssh into it. Etc.

lightvector commented 4 years ago

Or of course, if you already have a home machine with a decent GPU, then you can just use that. An advantage of cloud machines is that if you screw something up terribly, you can toss it out and start over, but of course working on them is also much less convenient than a home machine, particularly if you're not as familiar with ssh and working in a terminal.

So I guess that's the answer to your #1 - just get busy and start trying things.

As for #2 and #3... upcoming reply.

lightvector commented 4 years ago

So, this is not going to be a trivial project, but if you're anticipating a timeframe for this project on the order of many months, of course that's enough time to accomplish quite a lot if you're focused and proactive. :)

Although, actually, the full set of changes should be doable within days to a couple of weeks if you know where to look, are generally experienced with coding, and aren't struggling with things like "how do I run scripts", "how do I install things", and "how do I set up a GPU on the cloud". Or, I suppose, "how is KataGo's code laid out and how does it fit together?". :)

Assuming you've gotten the basic training working with an unmodified KataGo, you can start modifying it. For changing the neural net structure, you'll want to focus on one of KataGo's backends. KataGo has 3 backends: OpenCL (broad GPU support), CUDA (NVIDIA-specific GPU), and Eigen (CPU). Eigen is much too slow to make this work, so I'd recommend OpenCL unless CUDA's cuDNN library has recently added built-in support for toroidal boundary conditions.

You'll have to get your hands dirty with the actual GPU code, since KataGo directly uses these lower-level GPU libraries, including having custom GPU kernels, rather than being layered on top of a higher-level library. For OpenCL, the code is in cpp/neuralnet/opencl*. Mainly, you'll want to make the convolution have a toroidal boundary condition, as you said. For example, take a look at https://github.com/lightvector/KataGo/blob/master/cpp/neuralnet/openclkernels.cpp#L259

This is one of the custom OpenCL kernels in KataGo - it performs the Winograd transform for a convolution. Convolution is implemented as a Winograd transform, then a matrix multiply, then a Winograd untransform (following this landmark paper https://arxiv.org/abs/1509.09308 that tons of ML libraries are based on - worth a look if you're curious, though you don't necessarily need to understand it). Near line 259 above is where the "zero padding" condition is implemented, as the Winograd transform loads a small block of the tensor from GPU memory into a local tile. Whenever the indices would be out of bounds, the if statement evaluates to false and the value variable stays zero; otherwise it loads the value. You'd change this code so that it instead loads from the wrapped-around index when x or y is out of bounds.
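As a rough Python sketch of what that change amounts to (function names are made up here - the real code is an OpenCL C kernel, where you'd also have to be careful that C's % is not a true modulo for negative values):

```python
import numpy as np

def load_tile(tensor, x0, y0, tile, board_size):
    # Sketch of the original behavior: load a tile starting at (x0, y0),
    # leaving out-of-bounds entries at zero (the "zero padding" condition).
    out = np.zeros((tile, tile), dtype=tensor.dtype)
    for dy in range(tile):
        for dx in range(tile):
            x, y = x0 + dx, y0 + dy
            if 0 <= x < board_size and 0 <= y < board_size:
                out[dy, dx] = tensor[y, x]
    return out

def load_tile_toroidal(tensor, x0, y0, tile, board_size):
    # Sketch of the toroidal change: wrap the index instead of zeroing.
    # (Python's % already handles negatives; in OpenCL C you'd need e.g.
    # (x + board_size) % board_size.)
    out = np.zeros((tile, tile), dtype=tensor.dtype)
    for dy in range(tile):
        for dx in range(tile):
            x = (x0 + dx) % board_size
            y = (y0 + dy) % board_size
            out[dy, dx] = tensor[y, x]
    return out
```

For a tile starting at (-1, -1), the first version leaves a zero border while the second pulls values from the opposite edges of the board.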

You should also make sure you use a fixed board size (e.g. 19x19) and make sure the neural net tensor dimensions are being sized to exactly that size. nnXLen and nnYLen (https://github.com/lightvector/KataGo/blob/master/cpp/neuralnet/nneval.h#L170) are the spatial dimensions of the tensors on the GPU, which depending on the usage of KataGo could be larger than the board size (https://github.com/lightvector/KataGo/blob/master/cpp/game/board.h#L262). KataGo supports mixing evals for boards of different sizes into the same tensor for the same neural net batch, so as to be able to play on multiple board sizes at once, and it uses the masking mechanism described in section 4.3 of https://arxiv.org/pdf/1902.10565v1.pdf to make each batch element behave as if zero-padded according to its own spatial dimensions even despite being embedded within a larger tensor.

Your modifications to the GPU kernel are probably not going to be so complex as to handle the case where you need to wrap at an earlier point than the actual tensor dimension (when the board size is smaller than the tensor size). So if you plan to test on a size smaller than 19x19, you would probably fix a single size permanently for the entire run, and it would be good to temporarily add some print statements or asserts to verify that when you use KataGo to generate selfplay data, the tensors are getting sized to exactly that board size rather than larger.
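A tiny illustration of why the sizes must match (numbers are illustrative):

```python
import numpy as np

NN_LEN = 19   # tensor spatial dimension on the GPU (illustrative)
BOARD = 9     # actual board size, smaller than the tensor

# A 9x9 board of 1s embedded in a 19x19 tensor (the mixed-size case).
tensor = np.zeros((NN_LEN, NN_LEN))
tensor[:BOARD, :BOARD] = 1.0

# Wrapping by the tensor dimension lands in the zero padding region,
# not on the opposite edge of the 9x9 board - so a modulo by the tensor
# size is only correct when BOARD == NN_LEN.
x = -1
assert tensor[0, x % NN_LEN] == 0.0   # wrong: reads the padding
assert tensor[0, x % BOARD] == 1.0    # what toroidal 9x9 Go needs
```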

lightvector commented 4 years ago

The board implementation is here: https://github.com/lightvector/KataGo/blob/master/cpp/game/board.cpp

You might notice a conspicuous absence of any bounds checks in the code. KataGo uses a little trick to avoid bounds checks in all of the board logic for tracing groups, capturing stones, etc. It simply sits the board within a buffer that is slightly larger than the board in each dimension, and uses a fourth value "WALL" to mark points that are out of bounds, instead of "EMPTY", "BLACK" or "WHITE". Then, almost all algorithms sort of just work automatically - e.g. if you're iterating around a point to count "EMPTY" spaces as a liberty count, well, "EMPTY" is not "WALL", so when you reach out of bounds it all just works without an explicit bounds check.

While this makes the code super nice for regular Go, it means you'll have to actually comb through and find the places where bounds checks would have been needed, since the "WALL" mechanism won't work for you - you need to actually have the bounds check so you can know when to wrap to the other side of the board.
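A minimal Python sketch of the two adjacency schemes (illustrative only - the real board in cpp/game/board.cpp is a flat 1-D array, and the names here are made up):

```python
EMPTY, BLACK, WHITE, WALL = 0, 1, 2, 3

def make_board(size):
    # KataGo-style: the board sits inside a buffer one point larger on
    # each side, with WALL marking the out-of-bounds border.
    b = [[WALL] * (size + 2) for _ in range(size + 2)]
    for y in range(1, size + 1):
        for x in range(1, size + 1):
            b[y][x] = EMPTY
    return b

def neighbors_walled(x, y):
    # No bounds checks needed: out-of-bounds reads just hit WALL.
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def neighbors_toroidal(x, y, size):
    # Toroidal Go: an explicit wrap replaces the WALL sentinel, so every
    # point has four real neighbors (coordinates here are 0-based,
    # without the WALL border).
    return [((x + 1) % size, y), ((x - 1) % size, y),
            (x, (y + 1) % size), (x, (y - 1) % size)]
```

On a walled board a corner's off-board neighbors read WALL; toroidally, the corner (0, 0) is adjacent to (size-1, 0) and (0, size-1).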

If in the process you find it hard to convert the ladder solver code or its helper functions, or things like that, feel free to disable those features. They're inputs to the neural net, but they're not used much besides that; you could comment them out and instead pass in all 0s for those channels if you don't want to convert them: https://github.com/lightvector/KataGo/blob/master/cpp/neuralnet/nninputs.cpp#L2154

The last place you'll have to change is in TensorFlow - the training side of things, outside of KataGo's C++ code, will also need a toroidal boundary condition. This should just be some simple tensor slicing to construct the wraparound padding manually, plus adjusting how you call TensorFlow's conv2d so that it no longer does its own padding for you: https://github.com/lightvector/KataGo/blob/master/python/model.py#L604
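The slicing trick can be sketched in NumPy (this is the idea you'd express with tf.concat before a padding='VALID' conv2d; the names and shapes here are illustrative, not KataGo's model.py code):

```python
import numpy as np

def toroidal_pad(x, k):
    # Build k rings of wraparound context by slicing and concatenating,
    # exactly the manual-padding construction described above.
    x = np.concatenate([x[-k:, :], x, x[:k, :]], axis=0)   # wrap rows
    x = np.concatenate([x[:, -k:], x, x[:, :k]], axis=1)   # wrap cols
    return x

def conv2d_valid(x, kern):
    # A plain "valid" (no padding) 2-D cross-correlation, for the demo.
    kh, kw = kern.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (x[i:i + kh, j:j + kw] * kern).sum()
    return out

# A 3x3 convolution with toroidal padding keeps the board size, and it
# commutes with circular shifts of the board (the toroidal symmetry).
board = np.random.rand(9, 9)
kern = np.random.rand(3, 3)
out = conv2d_valid(toroidal_pad(board, 1), kern)
assert out.shape == (9, 9)
rolled = conv2d_valid(toroidal_pad(np.roll(board, 2, axis=0), 1), kern)
assert np.allclose(rolled, np.roll(out, 2, axis=0))
```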

lightvector commented 4 years ago

Also, if you want a brief high-level guide to the C++ code, take a look at the bottom half of this page, to get oriented as to what code lives where:

https://github.com/lightvector/KataGo/tree/master/cpp

Hope that helps!

lightvector commented 4 years ago

Edit: fixed some typos, minor edits in the above posts, which probably won't show up if you're reading this in email, but will show up if you're reading this in a browser.