Closed edre closed 3 years ago
Now that my perft ran a little longer in the Base game, I have a divergence from your results: At depth 9, my engine gets 99,376,027,356 moves, and Mzinga gets 99,375,893,436 moves. When I get around to implementing UHP, I can try to randomly sample game states at depth 8, and see if our engines get different moves.
Here's my proposal of a perft test that exercises how various pieces interact with beetle gates. A brief game that constructs a beetle gate as quickly as possible.
GameString: Base+MLP;InProgress;White[5];wB1;bB1 -wB1;wQ wB1-;bQ /bB1;wB2 wQ-;bQ /wB1;wB2 wQ;bB1 bQ
Here's my initial proposal for perft values. If you could add a feature to Mzinga.Perft to accept arbitrary game strings we can compare. I manually verified that we agree about depth 1 at least.
depth | count | time | kn/s |
---|---|---|---|
0 | 1 | 1.05µs | 952 |
1 | 36 | 11.075µs | 3250 |
2 | 965 | 108.989µs | 8854 |
3 | 44268 | 3.980584ms | 11120 |
4 | 1747280 | 192.65638ms | 9069 |
5 | 93187480 | 11.69154456s | 7970 |
6 | 4489053697 | 628.700924372s | 7140 |
The perft timings should be just from literally calling playing every possible move as if someone was doing it.
In order to find out the perft results faster, versions of Mzinga up to and including 0.10.0 sped things up by starting multiple threads at the starting position, with each first possible move given it's own thread in parallel. This gave me the perft counts faster, but the timings are invalid. I removed that functionality with https://github.com/jonthysell/Mzinga/commit/e887b75847d50fa9226fe58feffdc9b75cb85681.
If your actual move generator does things in parallel internally, that would be a valid.
The perft move divergence is probably from the Spider move generation bug #102.
Also, you can test different positions using MzingaEngine, which includes a perft command. Just make sure you disable pondering if you're worried about the timings. ie:
options set PonderDuringIdle Disabled
newgame Base+MLP;InProgress;White[5];wB1;bB1 -wB1;wQ wB1-;bQ /bB1;wB2 wQ-;bQ /wB1;wB2 wQ;bB1 bQ
perft 6
Here's the Mzinga v0.10.1 results:
perft(0) = 1 in 2 ms. 0.5 KN/s
perft(1) = 36 in 0 ms. ∞ KN/s
perft(2) = 965 in 1 ms. 965.0 KN/s
perft(3) = 44,307 in 51 ms. 868.8 KN/s
perft(4) = 1,749,084 in 826 ms. 2,117.5 KN/s
perft(5) = 93,269,761 in 51,079 ms. 1,826.0 KN/s
perft(6) = 4,493,115,467 in 2,292,387 ms. 1,960.0 KN/s
So, one of us is wrong. ;)
I fixed how I was counting positions that terminate early (by not counting them) and now we agree.
depth count time kn/s
0 1 2.1µs 484.3
1 36 5.8µs 6195.1
2 965 47.0µs 20540.2
3 44307 1.7ms 26532.8
4 1749084 78.8ms 22205.8
5 93269761 4.1s 22622.1
6 4493115467 531.9s 8447.2
Thanks for your work bootstrapping a Hive AI community!
I have independently verified your perft counts with all extension combinations (except the one that takes multiple days to run) with my Hive engine. When I can come up with a name, I'll open source it ;)
[Are the performance numbers supposed to be single-threaded? Mzinga.Perft is taking 1.8 cores when I run it.]
Can we add some perft numbers from other board positions? There are a lot of edge cases that probably can't be exercised in the first several moves of the game. Especially involving some beetle gates, mosquito and pillbug interactions, etc. For performance evaluations, larger board states enumerating ant movements would be a good stress-test for most engines.