maxpumperla / ScalphaGoZero

An independent implementation of DeepMind's AlphaGoZero in Scala, using Deeplearning4J (DL4J)
Apache License 2.0

Get ScalphaGo working for 5x5 boards #23

Closed by barrybecker4 5 years ago

barrybecker4 commented 5 years ago

What changes were proposed in this pull request?

I apologize for the size of this PR. I felt I needed to reach a state where the model was at least learning correctly for 3x3 and 5x5 boards before submitting anything. I read the book while working on the code, and at times I was a bit confused because the algorithm it describes keeps evolving over successive chapters. The current implementation should be close to what the final section describes.

Here is a summary of some of the major changes introduced in this PR. Please read the commit comments for more information about the small bugs that were fixed along the way. The issue numbers below are from my fork.

Evaluating a board with MC playouts instead of just looking up the model score makes learning much slower, but I just wasn't getting reasonable results with the model score alone. MC playouts are how AlphaGo Zero works, though, so I think it's the way to go.
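For readers unfamiliar with the trade-off, here is a minimal sketch of playout-based evaluation. It is not the code from this PR. It assumes "MC playouts" means playing a position out to the end with random moves several times and averaging the results. The `Game` trait, the toy `Nim` game, and the names `playout` and `evaluate` are all hypothetical and exist only to keep the example self-contained.

```scala
import scala.util.Random

// Hypothetical minimal game interface; the project's real game-state API differs.
trait Game[S] {
  def successors(state: S): Seq[S] // states reachable in one legal move
  def isOver(state: S): Boolean
  def result(state: S): Double     // +1.0 if the first player won, -1.0 otherwise
}

// Toy stand-in for Go: take 1 to 3 stones per turn; whoever takes the last stone wins.
final case class Nim(stones: Int, firstPlayerToMove: Boolean)

object NimGame extends Game[Nim] {
  def successors(s: Nim): Seq[Nim] =
    (1 to math.min(3, s.stones)).map(k => Nim(s.stones - k, !s.firstPlayerToMove))

  def isOver(s: Nim): Boolean = s.stones == 0

  // At the end, the player who is NOT to move took the last stone and won.
  def result(s: Nim): Double = if (s.firstPlayerToMove) -1.0 else 1.0
}

object Playouts {
  // Play uniformly random moves until the game ends and return the final result.
  @annotation.tailrec
  def playout[S](game: Game[S], state: S, rng: Random): Double =
    if (game.isOver(state)) game.result(state)
    else {
      val next = game.successors(state)
      playout(game, next(rng.nextInt(next.size)), rng)
    }

  // Average the result of numPlayouts independent random playouts from one state.
  def evaluate[S](game: Game[S], state: S, numPlayouts: Int, rng: Random): Double = {
    require(numPlayouts > 0, "need at least one playout")
    Iterator.fill(numPlayouts)(playout(game, state, rng)).sum / numPlayouts
  }
}
```

A call such as `Playouts.evaluate(NimGame, Nim(10, firstPlayerToMove = true), 200, new Random(42))` returns a value between -1.0 and 1.0. Each evaluation costs `numPlayouts` full games rather than a single model lookup, which is consistent with the slower learning described above.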

I'll understand if you don't want to accept these changes since some things changed quite a bit. I am mainly doing this just for my own enjoyment. I really appreciate the effort you put into starting these projects and writing the book.

How was this patch tested?

I added a lot of unit tests, and they should all pass. Some of the tests use a model_size_5_layers_2_test.model file that has been trained on only a few thousand 5x5 games so far. You can specify this model when you run and see it play reasonably well; I'm sure it will improve as I continue to train it. I have trained models for 3x3 and 5x5 boards that play well against a human opponent. One of the main things I was looking for was that the model would learn to play the initial move in the center. It does now.
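The center-opening check can be expressed as a small helper. The sketch below is an illustration only, not one of the tests in this PR. It assumes the model's move probabilities are available as a flat, row-major array with one entry per board point, optionally followed by extra entries such as a pass move. The names `CenterMoveCheck`, `bestMoveIndex`, `toPoint`, and `opensInCenter` are hypothetical.

```scala
object CenterMoveCheck {
  // Index of the largest probability among the first `count` entries.
  def bestMoveIndex(policy: Array[Double], count: Int): Int =
    (0 until count).maxBy(i => policy(i))

  // Convert a flat, row-major index into a 0-based (row, col) pair.
  def toPoint(index: Int, boardSize: Int): (Int, Int) =
    (index / boardSize, index % boardSize)

  // True if the most probable board move is the single center point.
  def opensInCenter(policy: Array[Double], boardSize: Int): Boolean = {
    require(boardSize % 2 == 1, "only odd board sizes have a single center point")
    val numPoints = boardSize * boardSize
    require(policy.length >= numPoints, "policy must cover every board point")
    val center = boardSize / 2
    val expected = (center, center)
    // Entries past the board points (e.g. a pass move) are ignored.
    toPoint(bestMoveIndex(policy, numPoints), boardSize) == expected
  }
}
```

On a 5x5 board the center is flat index 12, so a policy whose largest entry is at index 12 passes the check and any other opening move fails it.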

maxpumperla commented 5 years ago

@barrybecker4 this is great. I did this project for my own enjoyment, too. This Scala repo is independent of the officially supported Python package that comes with the book. So I don't mind larger changes. In fact, I was first porting this project very literally from Python, until a few Scala snobs came along to show me the light. :D Thanks for putting in all the good work!

maxpumperla commented 5 years ago

@barrybecker4 p.s.: if you want to help out in OSS land, you can become a maintainer of this project. You very clearly know your way around it now and it's not a lot of work, as we don't get many issues etc.

What do you think?

barrybecker4 commented 5 years ago

It's fine with me if you want to add me as a maintainer. I have limited time on weekends to look at it, but I could help review and merge changes.