jonthysell / Mzinga

Open-source software to play the board game Hive.
MIT License
82 stars 9 forks source link

UHP: bestmove depth and iteration-based algorithms #123

Closed richardjs closed 1 year ago

richardjs commented 1 year ago

The UHP page says:

When using bestmove depth, MaxDepth is the number of plies (moves or turns) to look ahead, and should be a non-negative integer value.

Some algorithms, like MCTS, don't have the same concept of limiting by depth that (say) minimax has. They just perform iterations until done, with more iterations typically giving better playing strength. How should a UHP engine using that sort of algorithm handle the bestmove depth command, and how should UHP specify iteration limits to these algorithms?

The simple answer may be to use the depth value as the iteration count, so bestmove depth 50000 tells the algorithm to run 50k iterations. More generally, the depth argument could be defined as an engine-defined measure of strength, be it ply depth, iteration count, or something else.

Or you may prefer something else. What do you think?

jonthysell commented 1 year ago

Interesting. I suppose you implement it as you see fit, realistically the best (only?) way to judge playing strength between two players (AI or human) is with time. I wouldn't really use depth to compare engines (esp. with things like quiescence checks).

At some point, I'd want the UHP to be able to specify various time controls (see #43) so your engine could decide for itself how much time it wanted to search on each turn (potentially risking losing by timeout).

richardjs commented 1 year ago

OK! I've gone ahead and implemented bestmove depth 50000 meaning a 50K iteration limit (and I might tweak it to multiples of 1000, see richardjs/zoe#3). I'll pay attention to how the standard develops.

I agree they're not the ultimate metric for comparing engines, but comparisons based on depth/iterations are still useful for assessing improvements and relative strength (e.g. how often does depth 5 beat depth 3, or how does tweaking this parameter change playing strength) with less noise than a time limit. (But maybe it could be argued that's outside of scope for UHP, and should be left up to the individual engines.)

richardjs commented 1 year ago

Oh, and one other thought--max depth/iterations is also useful in human-to-engine comparisons. If I say "I can beat Mzinga with depth 5 but not depth 6", that's something reproducible by anyone, regardless of their processor strength. It gives a common point of reference--maybe you can beat it at depth 7, so you know you're a stronger player than I am, and have a idea of how to gauge my strength.

But if I say "I can beat Mzinga with 5 seconds but not 10 seconds", it's mostly meaningless to anybody without a very similar setup to mine. Even if I gave my system specs and you could accurately reproduce them (or extrapolate based on your experience), that's a whole lot more information to communicate then a single depth/iteration number.