SanchoGGP / ggp-base

The General Game Playing Base Package
8 stars 4 forks source link

Tree not being correctly synced on setting new root state in English Draughts #294

Closed SteveDraper closed 9 years ago

SteveDraper commented 9 years ago

In this game: http://www.ggp.org/view/tiltyard/matches/09391a39de403c437965a84b48a2f47e944f3972/ on turn 8 (and also some later turns) we didn;t seem to match up the tree correctly on setting the next root state. This log fragment shows that on setting a new root state after our opponent's move we threw away the entire tree apart from 1 node! :+1:

2015-06-17 08:51:55,078 INFO TreeNode Move noop scores 51.13 (selectionScore score 48.56, selection count 1005404, ref 8590007810, RAVE[0, 0.00]) 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move g 2 f 3 ) scores [55.69, 44.31], visits 551, ref : 17180670435, RAVE[352840, 47.16] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move e 2 f 3 ) scores [55.35, 44.65], visits 610, ref : 17180670436, RAVE[749896, 48.80] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move b 3 a 4 ) scores [52.99, 47.01], visits 2087, ref : 17180670439, RAVE[945356, 48.72] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move b 3 c 4 ) scores [55.76, 44.24], visits 490, ref : 17181454242, RAVE[535742, 47.01] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move d 3 c 4 ) scores [51.11, 48.89], visits 958314, ref : 17180670437, RAVE[209386, 48.21] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move d 3 e 4 ) scores [56.02, 43.82], visits 425, ref : 17181454249, RAVE[193662, 47.01] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move g 2 h 3 ) scores [51.34, 48.66], visits 42441, ref : 17181454308, RAVE[851079, 48.88] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move f 5 e 6 ) scores [56.45, 43.55], visits 402, ref : 17181454243, RAVE[331674, 48.43] 2015-06-17 08:51:55,078 DEBUG TreeNode Response ( move f 5 g 6 ) scores [61.05, 38.95], visits 131, ref : 17180670429, RAVE[250757, 47.61] 2015-06-17 08:51:55,078 INFO TreeNode Most likely path: noop, ( move d 3 c 4 ) | ( move d 5 e 4 ), ( move e 2 f 3 ) | ( move f 7 g 6 ), ( move g 2 h 3 ) | ( move d 7 c 6 ), ( move b 3 a 4 ) | ( move c 8 d 7 ), ( move h 1 g 2 ) | ( move c 6 b 5 ), ( move d 1 e 2 ) | ( move h 7 g 6 ), ( move g 6 f 5 ), ( move a 2 b 3 ) | ( move h 5 g 4 ), ( move f 1 e 2 ) | ( move e 8 d 7 ), ( move b 1 a 2 ) | ( move d 7 c 6 ), ( move d 5 e 6 ) | ( move g 8 h 7 ), ( move e 6 d 7 ) | ( move h 7 g 6 ), 2015-06-17 08:51:55,078 INFO MCTSTree Num nodes in use: 2970001 2015-06-17 08:51:55,078 INFO MCTSTree Num true rollouts added: 327649 2015-06-17 08:51:55,078 INFO MCTSTree Num terminal nodes revisited: 63 2015-06-17 08:51:55,078 INFO MCTSTree Num incomplete nodes: 0 2015-06-17 08:51:55,078 INFO MCTSTree Num completely explored branches: 607 2015-06-17 08:51:55,078 INFO MCTSTree Current observed rollout score range: [0, 100] 2015-06-17 08:51:55,078 DEBUG GameSearcher Factor best move: noop 2015-06-17 08:51:55,078 INFO Sancho Playing move: noop 2015-06-17 08:51:55,078 DEBUG Sancho Move took: 42250 2015-06-17 08:51:58,210 DEBUG GameSearcher Dynamic sample size 2015-06-17 08:51:58,210 DEBUG GameSearcher Useful work last time: 32% 2015-06-17 08:51:58,210 DEBUG GameSearcher Calculated sample size: 2 2015-06-17 08:51:58,210 DEBUG GameSearcher Suppress update: false 2015-06-17 08:51:58,210 DEBUG GameSearcher Now using sample size: 1 (forced by use of RAVE) 2015-06-17 08:51:58,210 DEBUG GameSearcher Useful work total: 45% 2015-06-17 08:51:58,719 DEBUG RequestFactory GGP-Match-Step: 7 2015-06-17 08:51:58,719 DEBUG RequestFactory GGP-Match-Host: Tiltyard 2015-06-17 08:51:58,719 DEBUG RequestFactory GGP-Match-Players: 29756f66ce52a046aad9b3bccf89632a1df45d43, 13b04635bc91d3d1d57b89e9c2deaca5b62c56ac 2015-06-17 08:51:58,719 DEBUG RequestFactory GGP-Match-ID: tiltyard.englishDraughtsv0.1434548592463 2015-06-17 08:51:58,719 DEBUG Watchdog Watchdog won't fire until: 1434549718719 2015-06-17 08:51:58,719 INFO Sancho Moves played for turn 7: [( move g 2 f 3 ), noop] 2015-06-17 08:51:58,719 INFO Sancho Non-null move last turn was: ( move g 2 f 3 ) 2015-06-17 08:51:58,719 INFO Sancho Starting turn 8 2015-06-17 08:51:58,719 DEBUG Sancho Calculating current state 2015-06-17 08:51:58,719 DEBUG Sancho Setting search root 2015-06-17 08:51:58,719 INFO GameSearcher MCTS iterations last turn = 335935 2015-06-17 08:51:58,719 DEBUG GameSearcher Start move search... 2015-06-17 08:51:58,719 INFO TreeNode Free all but rooted in state: ( ( true ( cell d 7 black pawn ) ), ( true ( cell c 8 black pawn ) ), ( true ( lastToMove white ) ), ( true ( cell d 1 white pawn ) ), ( true ( cell a 2 white pawn ) ), ( true ( step 0 ) ), ( true ( cell a 8 black pawn ) ), ( true ( cell e 8 black pawn ) ), ( true ( cell g 8 black pawn ) ), ( true ( cell d 3 white pawn ) ), ( true ( cell a 6 black pawn ) ), ( true ( cell b 1 white pawn ) ), ( true ( cell h 5 black pawn ) ), ( true ( cell f 1 white pawn ) ), ( true ( cell f 3 white pawn ) ), ( true ( cell d 5 black pawn ) ), ( true ( cell h 1 white pawn ) ), ( true ( cell c 2 white pawn ) ), ( true ( cell b 3 white pawn ) ), ( true ( cell b 7 black pawn ) ), ( true ( cell e 2 white pawn ) ), ( true ( cell h 7 black pawn ) ), ( true ( cell f 7 black pawn ) ), ( true ( cell f 5 white pawn ) ) ) 2015-06-17 08:52:01,349 INFO TreeNode Freed 99% of allocated nodes (2970002 of 2970003)

SteveDraper commented 9 years ago

This is actually not serious and (in some sense) not even a bug. What is happening is this:

1) Node table gets full (many times during a turn) 2) Least likely selection predominantly selects low activity sub-trees, which inevitably;y also tend to be smaller. After a while it empties them out right back to the immediate child of the root 3) Opponent plays what we considered to be an unlikely move, which has been entirely trimmed away due to the above, so there is no useful tree left to connect to

There are some concerns here, but this symptom is not in and of itself a bug. The concerns are: i) Given we saw this against a strong opponent it probably wasn't really so very unlikely, so its probable that our assessment was a bit off ii) The trimming policy isn't very effective because it naturally tends to trim from small sub-trees, and hence is likely to empty them. However, we cannot tell reliably how large a sub-tree might be (after trimming has begun) because of transpositions, and because when we do trim we cannot update all ancestor paths to reflect one less descendant (so we actually don't bother updating any). Even if we could update the ancestors reliably we'd need a separate count field, because we still need to maintain the actual visit count (as opposed to the remaining-descendants count, which differ once trimming has started) to generate correct UCT values.

The current code tries to mitigate (ii) by adding deliberate noise (so it becomes probabilistic which sub-tree will be pruned from, with higher probability to low-likelihood for future selection branches), but this only delays the symptom (the smaller branches will still be the predominant targets). The code also used the (now discredited) caching of least-likely and 2nd-least-likely (analogous to what we used to do on selection) which makes choices stickier than they ought to be.

For now I have just removed the discredited caching mechanism, but eventually we're going to have to come up with a better trimming policy