Yes, it's a very clever approach. I think it should be able to produce better results than this bot, given enough training time. I have not implemented such a thing - it sounds like a fun project!
all,
I watched a fascinating video about deepmind's alphazero implementation here:
https://www.youtube.com/watch?v=Wujy7OzvdJk
which finally answered something that had been bugging me for some time - how they actually got a neural net to play go/chess/shogi without any human input. I thought for sure it would follow previous nets and the system would get stuck on a local maximum fairly quickly.
the trick? basically use monte carlo search to generate the training data - i.e., the neural net plus monte carlo tree search would pick the moves, and the resulting games would then be fed back to that same neural network as training data.
then the next generation of the neural network would be inherently stronger, would use monte carlo search to learn further, and so on.
in essence, they were baking the results of the monte carlo trials into the neural net itself. Exceedingly clever.
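The data-generation half of that loop can be sketched in miniature for 2048. This is a toy illustration under my own assumptions, not DeepMind's code: real AlphaZero guides its tree search with the network's policy/value outputs rather than the raw random rollouts used here, and every name below (`slide_row_left`, `monte_carlo_move`, etc.) is made up for the example.

```python
import random

def slide_row_left(row):
    """Slide and merge one 2048 row to the left; returns (new_row, score_gained)."""
    tiles = [t for t in row if t != 0]
    out, score, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)       # merge equal neighbours once
            score += tiles[i] * 2
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out)), score

def move(board, direction):
    """Apply 'left'/'right'/'up'/'down' to a 4x4 board; returns (new_board, score)."""
    def rows(b):  # view the board so every move becomes a slide-left
        if direction == 'left':  return [list(r) for r in b]
        if direction == 'right': return [list(reversed(r)) for r in b]
        if direction == 'up':    return [list(c) for c in zip(*b)]
        return [list(reversed(c)) for c in zip(*b)]
    slid = [slide_row_left(r) for r in rows(board)]
    new_rows = [r for r, _ in slid]
    gained = sum(s for _, s in slid)
    if direction in ('right', 'down'):
        new_rows = [list(reversed(r)) for r in new_rows]
    if direction in ('up', 'down'):
        new_rows = [list(c) for c in zip(*new_rows)]  # undo the transpose
    return new_rows, gained

def spawn(board, rng):
    """Drop a random 2 (90%) or 4 (10%) on an empty cell, in place."""
    empties = [(i, j) for i in range(4) for j in range(4) if board[i][j] == 0]
    if empties:
        i, j = rng.choice(empties)
        board[i][j] = 4 if rng.random() < 0.1 else 2

def rollout(board, rng, depth=20):
    """Play random legal moves for a few plies; return total score gained."""
    board = [list(r) for r in board]
    total = 0
    for _ in range(depth):
        options = []
        for d in ('left', 'right', 'up', 'down'):
            nb, g = move(board, d)
            if nb != board:
                options.append((nb, g))
        if not options:
            break
        board, gained = rng.choice(options)
        total += gained
        spawn(board, rng)
    return total

def monte_carlo_move(board, rng, n_rollouts=10):
    """Pick the move whose random rollouts score best. Each (board, chosen move)
    pair is the kind of training example a network would then be fit to."""
    best, best_score = None, -1
    for d in ('left', 'right', 'up', 'down'):
        nb, g = move(board, d)
        if nb == board:
            continue  # illegal move: nothing slides
        score = sum(g + rollout(nb, rng) for _ in range(n_rollouts))
        if score > best_score:
            best, best_score = d, score
    return best
```

In the full scheme, you would loop `monte_carlo_move` over whole games, collect the (state, move) pairs, train the net on them, and then let the stronger net bias the next round of search.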
In any case, I was wondering if anyone had implemented a 2048 bot using this strategy, and if so, what the results were.