Improve our Unguided Queens performance

arr28 commented 9 years ago

For the legals-guided queens puzzle, 31x31 is approximately the limit of our reach. However, for the unguided version, it's just 6x6. But they're the same puzzle! We should do better at the unguided version.

Off the top of my head, I reckon the most promising approach is identifying the (many) pairs of proposition latches which constitute goal latches. That would effectively convert the unguided into a legals-guided version, but performing a little poorer because of the need to expand states to discover the latched failure. We could do even better by going down the "convert state-heurstic to action-heuristic" route (#365) and actually prune the legals which would be known to lead to a zero score.

arr28 commented 9 years ago

Well, this is going to be interesting. I set out to do latch-pair analysis for the queens problem. However, due to the rule formulation, we don't even spot that cell x y queen (hereafter Q), is a positive latch.

Identifying that Q is a latch is going to be really tricky.

Q is fed by an or of the relevant does place (which is how it becomes true in the first place) with an and. The and's inputs are Q and a NOT of does place x y * - i.e. the cell retains its value unless placed on. legal place x y queen is fed by cell x y blank which itself is fed by an or on INIT (all cells are initially blank) and an and that the cell is already blank and hasn't just been played on.

So to see that Q is a latch will involve reverse propagation to deduce that does place x y queen must have been played at some point in the past and, forward propagating from there, that cell x y blank can no longer be true. After that it's "just" careful propagation through the network (incl. legal/does links) to demonstrate that Q is a latch.

I was already writing a TristatePropNet to help with the latch-pair analysis. It looks like that'll be necessary even for the simple positive latches in this case.

arr28 commented 9 years ago

Latest go at tristate back-propagation still didn't work. I think, when trying to prove a positive latch, we need to assume...

The proposition was FALSE in the preceding step (level 0)
The proposition is TRUE in the current step (level 1)
Derive that the proposition is TRUE in the subsequent step (level 2)

We can only find the queen latch by noticing that "place x y" must have been played and therefore the cell is no longer blank. If we don't assume the proposition was false in the preceding step, we don't see that the "place x y" must have been played and so we miss out on all the deductions.

arr28 commented 9 years ago

Still issues.

Invalid assumptions

Firstly, the above assumptions aren't quite right.

By assuming the proposition was FALSE at step 0 and then TRUE at step 1, even if we find that it's TRUE in step 2, that isn't proof of a latch. It just shows that after it changes state, it remains true for at least one more turn. The turn marker in Majorities follows this pattern (but isn't a latch).

Also, sometimes this is an invalid set of assumptions, leading to invalid conclusions. For example, in Queens, assuming that a blank proposition was FALSE and then becomes TRUE is an invalid assumption - which, as it happens, leads to the erroneous conclusion that blank is a positive latch.

Failing to identify +ve Q latch

The next problem is that it's still failing to find the +ve queen latch. It's getting very close now...

22:27:39.671 - Checking whether ( true ( cell 2 3 queen ) ) is a +ve latch
22:27:39.671 - ( does robot ( place 2 3 ) )has become TRUE in turn 0 by backward prop
22:27:39.671 - ( true ( cell 2 3 blank ) )has become FALSE in turn 1 by forward prop
22:27:39.672 - ( legal robot ( place 2 3 ) )has become FALSE in turn 1 by forward prop
22:27:39.672 - ( does robot ( place 2 3 ) )has become FALSE in turn 2 by forward prop
22:27:39.672 - ( true ( cell 2 3 blank ) )has become FALSE in turn 2 by forward prop
22:27:39.672 - ( legal robot ( place 2 3 ) )has become FALSE in turn 2 by forward prop

Firstly, it sees that the cell must have been played in the previous turn. Then it can tell that the cell is no longer blank now. As a result, it can tell that it's no longer legal to play on that cell (in either the current or next states - nor can the corresponding does prop be true in the next turn).

However, it still fails to see the latch. I need to investigate why. (One possibly additional assumption that could be made is that when a DOES is known to be TRUE, all the others are known to be FALSE. I don't think that's necessarily going to help here though.)

Asserting on -ve Q latch

Whilst testing out the -ve Q latch hypothesis, we fail an assert. That's probably a result of the invalid assumption (because we assume that Q was true and is now false - which can't ever happen.) Once the code is debugged, perhaps need to convert the asserts into IllegalAssumptionException which just aborts the attempt to find a latch.

arr28 commented 9 years ago

I wonder if the bug is to do with when a does prop. becomes true. Is it at the end of the old state or the start of the new state? This always does my head in.

SteveDraper commented 9 years ago

It's not a meaningful question. We only ask about the new state after BOTH the old state input and correct DOES state (one set previous reset) is established. Intermediate states of the propnet are not queried.

arr28 commented 9 years ago

Thanks. I've just been over it all carefully, with diagrams and everything. I think I've discovered that the TristatePropNet inference engine is correct. However, it still isn't rich enough to discover the latch :cry:.

The missing piece is that it needs to identify that, at the end of turn 0 (when it has identified that does place Q must have been played), legal place Q is no longer true. I think that would then be sufficient. However, legal place Q is set directly from the corresponding blank and (in turn 0) it's state is unknown.

Even if I extend the network back for another turn and add the inference that if a does prop is true in one turn then the corresponding legal must have been true previously - I still don't think that'll do the trick.

SteveDraper commented 8 years ago

I think I have fixed the Tristate latch detection (at least it now gives correct results in both the simple TicTacToe and the unguided queens puzzles). The two main changes are: 1) Legal -> FALSE => does == FALSE in the SAME turn (NOT the following turn) [think about it!] 2) The set up for the latch check of assuming a transition of the prop between turn 0 and turn 1 can result in a contradiction (e.g. - assume a +ve latch transitions negatively) - this case is now trapped and handled correctly

Example log from unguided queens:

13:29:04.776 - Checking whether ( true ( cell 3 5 queen ) ) is a +ve latch 13:29:04.776 - ( does robot ( place 3 5 ) ) has become TRUE in turn 0 by backward inference 13:29:04.776 - ( true ( cell 3 5 blank ) ) has become FALSE in turn 1 by forward inference 13:29:04.776 - ( legal robot ( place 3 5 ) ) has become FALSE in turn 1 by forward inference 13:29:04.776 - ( does robot ( place 3 5 ) ) has become FALSE in turn 1 by forward inference 13:29:04.777 - Yes - it's a positive latch 13:29:04.777 - ( true ( placing 1 ) ) has become FALSE in turn 1 by forward inference 13:29:04.777 - ( true ( placing 2 ) ) has become FALSE in turn 2 by forward inference 13:29:04.777 - ( true ( placing 1 ) ) has become FALSE in turn 2 by forward inference 13:29:04.777 - Checking whether ( true ( cell 3 5 queen ) ) is a -ve latch 13:29:04.777 - ( does robot ( place 3 5 ) ) has become FALSE in turn 0 by backward inference 13:29:04.777 - true->false transition of ( true ( cell 3 5 queen ) ) generates a contradiction, so discounting as potential -ve latch

Having done this the unguided queens STILL see no GOAL latches however, but I'm hoping that now the base problem of single latch detection is solved (I hope!), Andrew can complete the work he had intended to base upon it...

arr28 commented 8 years ago

Queen goal latch detection now working. Not integrated with the rest of Sancho though.

arr28 commented 8 years ago

Integration done. I'm trying to figure out whether I might have caused a serious performance regression along the way (or whether it's just a side-effect of the pair-latches causing greedy rollouts in the queens puzzle, which in turn causes it to be solved v. quickly - i.e. no problem). Hampered by #377.

arr28 commented 8 years ago

This basically works now. We reliably solve the (8x8) unguided queens puzzle. #380 covers a possible follow-on enhancement.

Closing.

SanchoGGP / ggp-base