pasky / pachi

A fairly strong Go/Baduk/Weiqi playing program
http://pachi.or.cz/
GNU General Public License v2.0

Seki handling (selfatari detection bug?) #29

Closed by pasky 6 years ago

pasky commented 8 years ago

Pachi has trouble with seki because of buggy selfatari detection, and possibly because of the nakade code too.

Seki would be solved by just setting selfatarirate=100 or =99 in Moggy and debugging the playouts in lost positions. There ought to be just some bug there that prevents things from going correctly.

pasky commented 8 years ago

cf. also the comment in playout/moggy.c: "Since some unclear point, selfatari 95 -> 60 gives a +~50Elo boost against GNUGo. This might be indicative of some bug, FIXME bisect?"

So that's another bug. But even selfatari=95 is too low to ensure proper seki handling (with selfatari <~99, seki will always eventually be broken in late playouts).
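
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes each chance to break the seki is an independent draw in which the self-atari is rejected with probability selfatarirate/100, so the seki survives k chances with probability (rate/100)^k. `seki_survival()` is a hypothetical helper written for this illustration, not a function in Pachi, and real playouts are certainly not this independent.

```c
#include <assert.h>

/* Hypothetical helper, not part of Pachi: probability that a seki
 * survives `chances` independent opportunities to break it, when each
 * self-atari is rejected with probability rate_percent / 100. */
static double
seki_survival(int rate_percent, int chances)
{
    assert(rate_percent >= 0 && rate_percent <= 100);
    assert(chances >= 0);

    double keep = rate_percent / 100.0;
    double survival = 1.0;
    for (int i = 0; i < chances; i++)
        survival *= keep;
    return survival;
}
```

Under these assumptions, `seki_survival(95, 50)` is about 0.08 and `seki_survival(99, 50)` is about 0.6, so only a rate of 100 keeps the seki intact for certain.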

lemonsqueeze commented 8 years ago

I'm really puzzled why increasing selfatarirate hurts. All my attempts have failed so far, even with a much better is_bad_selfatari() (from a tsumego point of view, tested on 3k positions from life&death problems).

The other day I came across this disturbing Fuego page: https://sourceforge.net/p/fuego/wiki/SelfAtariExperiments/

Pretty convincing, but I'm still not entirely happy with this. It's so counterintuitive: there ought to be at least some classes of selfatari which are always harmful in playouts, like self-atariing a big group that can connect out...

Now that I think about it, having a 100% rule that prevents self-ataris of big groups would fix seki, right?

pasky commented 8 years ago

A hack that would invalidate Fuego's hypothesis: when a player is forbidden a self-atari, have the opponent atari the group concerned on the next move with some probability. This way, we would avoid the problem of encouraging the losing player to put off large captures.
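
One way to picture this hack, as a sketch only: the playout remembers which group just had a self-atari vetoed, and on the opponent's next move that group is attacked with a tunable probability. All names here (`struct veto_state`, `veto_selfatari()`, `opponent_reply()`, `attack_percent`) are hypothetical and do not come from Pachi. The random roll is passed in by the caller so the logic stays deterministic and easy to test.

```c
#include <assert.h>

#define NO_GROUP (-1)

/* Hypothetical per-playout state: the group whose self-atari
 * was vetoed on the previous move, or NO_GROUP. */
struct veto_state {
    int vetoed_group;
};

enum reply_kind { REPLY_NORMAL, REPLY_ATARI_VETOED_GROUP };

/* Record that the policy refused to let `group` put itself in atari. */
static void
veto_selfatari(struct veto_state *s, int group)
{
    assert(group >= 0);
    s->vetoed_group = group;
}

/* Opponent's turn. `roll` is a uniform random number in [0, 100)
 * supplied by the caller. With probability attack_percent / 100 the
 * opponent ataris the vetoed group; otherwise the normal policy runs.
 * The veto is consumed either way, so it affects only the next move. */
static enum reply_kind
opponent_reply(struct veto_state *s, int attack_percent, int roll)
{
    assert(roll >= 0 && roll < 100);

    int group = s->vetoed_group;
    s->vetoed_group = NO_GROUP;

    if (group != NO_GROUP && roll < attack_percent)
        return REPLY_ATARI_VETOED_GROUP;
    return REPLY_NORMAL;
}
```

The point of the design is that the vetoed player no longer gets a free delay: the capture it was prevented from offering is, some of the time, forced by the other side instead.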

lemonsqueeze commented 8 years ago

I say it's worth trying =)

lemonsqueeze commented 8 years ago

Hi, I have an experiment that looks pretty good:

The idea was to try using a pretty basic is_bad_selfatari() in the playouts with a 100% rate. It doesn't bother with the 1- or 2-stone cases and just tries to prevent big blunders with large groups. So it's like a trimmed-down version of the current code, with an extra check for possible countercaptures which I think is missing right now. It's incredibly inefficient (in terms of spotting bad selfataris) but correctness is really good: 98% on my sar_tsumego.t vs 62% for the current code.

I use this with a 100% rate in moggy_permit() and also check the moves returned by moggy_seqchoose(): it turns out some of them don't pass (!). Maybe this could be the bug you were looking for? With these two changes seki is played out correctly and it looks only slightly weaker: 52% vs 54% winrate against gnugo on 15x15:

S GAMES    WINRATE   S.D.    PAIRING
. 2063       0.519   0.011   15--7.5-1-gnugo10-pachi_rbsar
/ 5255       0.539   0.007   15--7.5-1-gnugo10-pachi_origin
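
As a rough check on how far apart these two results are, here is a worked estimate. It assumes each game is an independent trial and the two runs are independent samples; the S.D. column above matches the binomial standard error under that assumption.

```latex
\begin{align*}
\mathrm{SE}(p, n) &= \sqrt{\frac{p\,(1-p)}{n}} \\
\mathrm{SE}_{\text{rbsar}} &= \sqrt{\frac{0.519 \cdot 0.481}{2063}} \approx 0.011 \\
\mathrm{SE}_{\text{origin}} &= \sqrt{\frac{0.539 \cdot 0.461}{5255}} \approx 0.007 \\
\mathrm{SE}_{\text{diff}} &= \sqrt{\mathrm{SE}_{\text{rbsar}}^2 + \mathrm{SE}_{\text{origin}}^2} \approx 0.013 \\
z &= \frac{0.539 - 0.519}{\mathrm{SE}_{\text{diff}}} \approx 1.5
\end{align*}
```

So the 2-point gap is about 1.5 standard errors: suggestive, but still within what sampling noise alone could produce.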

It can probably be improved still, just beginning to play with it :)

lemonsqueeze commented 8 years ago

So that's for the selfatari issue. For 3-stone sekis there's still the problem of not filling it early on while there are still outside liberties, otherwise the group is dead. It doesn't look too hard to single that case out though. I started on a test which gets most of the way there: with this extra check in moggy_permit(), sekis as in issue #39 are working now:

   Move:  145  Komi: 0.0  Handicap: 4  Captures B: 0 W: 0
     A B C D E F G H J K L M N O P Q R S T        A B C D E F G H J K L M N O P Q R S T
   +---------------------------------------+    +---------------------------------------+
19 | . . . . . . . . . . . . . . . . . . . | 19 | X X X X X X x o O O O O O O o O O O O |
18 | . . . . . . X O O O O . O O X . O . . | 18 | X X X X X X X O O O O O O O , o O O O |
17 | . . . . . . X X X X X O O X . X O . . | 17 | X X X X X X X X X X X O O , o , O O O |
16 | . . . X . . . . . . O X O X . X . O . | 16 | X X X X X X X X X X X X O , o , o O O |
15 | . . X . . . . . . X . X X O X . . . . | 15 | X X X X X X X X X X X X X O , o o o o |
14 | . . . . . . . . . X . X O O O . O . . | 14 | X X X X X X X X X X , X O O O o O , , |
13 | . . X . . . X . . . O O X O O . X O . | 13 | x X X X X x X x x , O O O O O , X , X |
12 | . . X O X . O . X . . . X X O X X X X | 12 | , , X X X , O , X , o O O O O X X X X |
11 | . . O X X . . . . . . O . O O O X . O | 11 | , , O X X , o , , , o O O O O O X X , |
10 | . . O O . . O . . . O X . X . O X X O | 10 | , o O O , , O o , o O , o , , O X X , |
 9 | . . . . . . . . . . O X . . . O X . O |  9 | , o O O o o O O o O O , o , , O X X , |
 8 | . . O . . . O O O O X O O . X X X X X |  8 | , , O O O O O O O O x O O , X X X X X |
 7 | . X O . . . . X X X X . . O O O O O . |  7 | X X O O O o , x x x x , O O O O O O , |
 6 | . X O . O . . . O)O O O O X . . X . . |  6 | X X O O O , , , , O O O O x , o , o , |
 5 | . X O . . X . . X X X . X . . X . . . |  5 | X X O , , X x x X X X , X x , , o O o |
 4 | . X O X . . . . . . . . . O O X O O . |  4 | X X O X x X X X X X X x x , , , O O O |
 3 | . . X . . X . . . . . X X O X O X . . |  3 | X X X X X X X X X X X X X , X O O O O |
 2 | . . . . . . . . . . . . . X X O O . . |  2 | X X X X X X X X X X X X X X X O O O O |
 1 | . . . . . . . . . . . . . . . . . . . |  1 | X X X X X X X X X X X X x x , o O O O |
   +---------------------------------------+    +---------------------------------------+

It's in my seki branch, needs optimizing though.

lemonsqueeze commented 8 years ago

Ok, I made some progress. As in Fuego's findings, just having a 100% selfatari rate breaks Moggy's balance, and that's the only thing that matters here: without correction it hurts a lot.

The problem was: how do you look into a policy's balance? I thought I could use fast 15x15 games for testing but it doesn't work here. I guess the policy isn't sampled enough, so the output is too random. I'm getting much better feedback by playing a few self-play games at full strength, looking at the positions where winrates start doing strange things and trying to improve these.

So far I've got a few corrections which help quite a bit. The hack to invalidate Fuego's hypothesis seems to work pretty well (!): trying 2lib attacks where selfatari was prevented. Putting the 3lib_suicide check back also helps a lot. Needs more testing but it's beginning to look good: the latest version just won 2 games against origin!

lemonsqueeze commented 8 years ago

Hi,

I'm happy to say I have a seki branch with about the same winrate as before. It took a lot more work than I thought, with lots of trial and error, but it's finally there! The new code is a bit more expensive; however, I found a nice tactics infrastructure optimization which makes it about as fast as before. It's not a small patch though: it pushed me to revisit parts of the tactics code I didn't know about, and it still needs quite a bit of cleanup. Should be ready soon hopefully.

pasky commented 8 years ago

Awesome, looking forward to it!

lemonsqueeze commented 8 years ago

I've been hit by the hidden bug from hell again, though I might have nailed it this time: I make some rather benign changes while cleaning up my branch, and winrates suddenly drop inexplicably. I try play-testing again, single-threaded this time, and the winrate is back to normal...

I played with ThreadSanitizer some time ago, and what came out is a branch full of spinlocks and no warnings from TSan. Most of it is probably overkill, but except for the extra memory used it doesn't hurt either. Anyway, if I rebase this on the faulty code it works again multithreaded. It remains to be seen whether this happens because the compiler has shifted things around or because it's a real fix, but at least it's a possibility.

lemonsqueeze commented 8 years ago

I'll fight with tsan another day. In the meantime here's my seki branch. Still some cleanup to do but should be manageable now. I'll create a PR tomorrow for discussing it. Cheers

lemonsqueeze commented 6 years ago

Fixed by PR #44