Basic MCTS implementation

markpattison / Reversi

8 stars 1 forks source link

Basic MCTS implementation #1

Closed forki closed 4 years ago

forki commented 4 years ago

This adds a basics MCTS player (see https://en.wikipedia.org/wiki/Monte_Carlo_tree_search). I let it ran against the minimax level 2 engine that was configured in the tests and MCTS won 20 out of 20 games. So that looks promising.

I needed to make couple of the internals available for the engine.
I change the test framework to expecto because that was way easier for me to run in watch mode. I can change that back if needed.

isaacabraham commented 4 years ago

I was wondering (maybe this is off-topic) that to increase perf (something that @markpattison mentioned) it would be possible to offload the compute to the back-end which would be running on netcore?

forki commented 4 years ago

Someone would need to pay for that ;)

Tbf chess engines like stockfish run in the browser and are super super fast.

forki commented 4 years ago

one of the reasons why it's slow on fable is that we do a lot of allocations. look at:

https://github.com/markpattison/Reversi/blob/master/src/Reversi/Board.fs#L161-L171

This is creating an array and tests for empty. We should actually just not allocate here at all. Just a flag and early return should do. Also inside of isPossibleMove there are lots and lots of allocations because it's trying to calc the flips on the fly. But here it's not needed at all.

forki commented 4 years ago

I will send a PR later

markpattison commented 4 years ago

@forki I thought you might be interested to know that the results are a bit more competitive now that I:

fixed a silly bug in the minimax evaluation function
adjusted the number of playouts used by the MCTS player so that it spends roughly as much time thinking as the Minimax one.

Minimax depth 2 vs MCTS with 15 playouts per turn

Minimax, Basic heuristic, depth 2 (avg. time 2.29): 3 MCTS, 15 playouts per turn (avg. time 3.04): 7 ties: 0

Minimax depth 3 vs MCTS with 200 playouts per turn

Minimax, Basic heuristic, depth 3 (avg. time 45.87): 4 MCTS, 200 playouts per turn (avg. time 47.51): 5 ties: 1

forki commented 4 years ago

Very good!

Now the constant playout rate needs to be adjusted. Maybe I'll find time to do that.