lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

Mirror Go Weakness #163

Open friedrichren opened 4 years ago

friedrichren commented 4 years ago

KataGo does not seem to have implemented any heuristics to counter mirror go in games without komi where the first move is on tengen. Obviously this is of extremely low priority, but that aside, is @lightvector interested in ever implementing this?

lightvector commented 4 years ago

I might be interested. Have Leela Zero or other open source bots found effective ways of doing this?

isty2e commented 4 years ago

I am not sure if KataGo is vulnerable to the mirror go strategy at all. Has it been tested?

friedrichren commented 4 years ago

> I am not sure if KataGo is vulnerable to the mirror go strategy at all. Has it been tested?

Yes. It is vulnerable, at least in games without komi where the first move is on tengen. I haven't done any proper testing, so I only have 3 games, but KataGo failed in 2 of them on an RTX 2060 with 10s per move. That's pretty high even for noise.

friedrichren commented 4 years ago

> I might be interested. Have Leela Zero or other open source bots found effective ways of doing this?

Nothing as far as I know.

Countering black mirror go is of course much less important than countering white mirror go, but I haven't done any proper testing, so I don't know whether the problem KataGo has with black mirror go also shows up in white mirror go. I do have 3 games of white mirror go against KataGo. In one of them it started a fight heading towards tengen early on, and in all of them the fights in the center ended with tengen occupied by a black stone. I don't trust my judgement on these games, but the fights seemed close enough to make me believe that KataGo handles white mirror go better, although it still has some flaws, because the winrate fluctuates even when more than half of the board is full. This needs more testing of course.

Back to black mirror go. Leela Zero or any other bot with fixed komi will of course have no incentive to beat black mirror go, since black mirror go only affects the outcome when komi is 0, at least most of the time. If black almost always diverges early in the game, then whether there is a strategy against mirror go matters less: even in the most extreme case of self-play training, the bot can only be trained up to the average point where black abandons mirroring.

With no intervention in self-play training, KataGo will also fail at mirror go, because the probability that black keeps playing mirroring moves converges to 0 as the move number increases. So even though we can see KataGo predicting mirroring moves in its analysis, meaning there is a fair number of self-play games involving a fair number of mirroring moves, the frequency of such predictions decreases quickly. Since almost all unknown positions are equally likely to be explored by noise, it may take a lot more training for KataGo to handle mirror go properly. Here is a possible way to improve KataGo's performance in both black mirror go and white mirror go:

  1. For a fraction P of training games, one side mirrors until move T, where T is chosen randomly within a range, to train KataGo to always predict that the opponent will play the mirroring move.
  2. The mirroring games only appear under appropriate komi values.
  3. Both P and T should be of a reasonable scale so that they do not affect KataGo's normal output.
  4. Possibly add a counter that records how long one side has been mirroring, and maybe promote mirroring moves in search according to this counter.
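
Steps 1 to 3 above could be sketched roughly as follows. This is only an illustration, not KataGo's self-play code: the names `sample_mirror_plan`, `forced_mirror_move`, `komi_ok`, and the parameters `p_mirror`, `t_min`, `t_max` are all hypothetical, and legality is reduced to "the mirrored point is empty" (a real engine would also check ko and suicide).

```python
import random

BOARD_SIZE = 19


def mirror_point(x, y, size=BOARD_SIZE):
    """Return the point opposite (x, y) through the center of the board (tengen)."""
    return size - 1 - x, size - 1 - y


def sample_mirror_plan(rng, p_mirror, t_min, t_max, komi, komi_ok):
    """Decide once per game whether one side mirrors, and until which move T.

    Returns None for a normal game, otherwise the last move number T at which
    the mirroring side is still forced to mirror.
    """
    # Step 2: mirroring games only appear under appropriate komi values.
    # Step 1: only a fraction P of the remaining games are mirroring games.
    if not komi_ok(komi) or rng.random() >= p_mirror:
        return None
    # Step 1: T is chosen randomly within a range.
    return rng.randint(t_min, t_max)


def forced_mirror_move(move_number, mirror_until, opponent_move, occupied,
                       size=BOARD_SIZE):
    """Return the mirroring reply, or None if the side should play normally."""
    if mirror_until is None or move_number > mirror_until or opponent_move is None:
        return None
    reply = mirror_point(*opponent_move, size)
    # Tengen mirrors onto itself, and an occupied point cannot be played.
    if reply == opponent_move or reply in occupied:
        return None
    return reply


# Usage: a small P keeps most games normal (step 3).
rng = random.Random(0)
plan = sample_mirror_plan(rng, 0.02, 20, 120, 0.0, lambda k: abs(k) < 1.0)
reply = forced_mirror_move(5, 40, (3, 3), occupied=set())
```

Whenever `forced_mirror_move` returns None, the game would continue with the normal self-play move selection, so a mirroring game degrades gracefully into an ordinary one.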

I have no idea if this will work, how much it affects training efficiency, whether it pays off, or whether it will have profound unpredictable effects such as changing KataGo's style to prefer playing mirroring moves. But it seems to be a more "elegant solution" than just forcing the opponent to abandon mirror go, if we consider solving the mirror go problem necessary.

Thank you for your work on this amazing project, anyway. KataGo is already perfect most of the time. If I hadn't seen somebody on the Internet posting a video of himself "defeating" KataGo with mirror go, I probably wouldn't have thought that it might be a nuisance worth solving. 🤦‍♂️

friedrichren commented 4 years ago

Zen7 has code that makes the bot play around the tengen when it detects mirror go (after a few moves). This is pretty close to KataGo's strategy to avoid Mi's dagger. Perhaps we can break mirror go early this way?
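
Detecting mirror go "after a few moves" could look something like the sketch below. It is not Zen7's or KataGo's code: `mirror_streak`, `looks_like_mirror_go`, and the `threshold` of 4 are assumptions made for illustration. Moves are `(x, y)` tuples in playing order, and `None` stands for a pass.

```python
def mirror_streak(moves, size=19):
    """Count how many of the most recent replies mirrored the move before them.

    The walk goes backwards over (move, reply) pairs ending at the last move
    played, so it measures how long the side that just moved has been mirroring.
    """
    streak = 0
    i = len(moves) - 1
    while i >= 1:
        prev, reply = moves[i - 1], moves[i]
        if prev is None or reply is None:
            break  # a pass ends the streak
        if reply != (size - 1 - prev[0], size - 1 - prev[1]):
            break  # the reply was not the point opposite through tengen
        streak += 1
        i -= 2  # step back to the same side's previous reply
    return streak


def looks_like_mirror_go(moves, threshold=4, size=19):
    """Flag mirror go once the streak reaches a (hypothetical) threshold."""
    return mirror_streak(moves, size) >= threshold


# Black opens on tengen, then mirrors each white move.
game = [(9, 9), (3, 3), (15, 15), (16, 3), (2, 15)]
streak = mirror_streak(game)
```

The same streak value could also serve as the counter suggested earlier in the thread, for example to decide when to start steering play towards tengen.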

pdeblanc commented 4 years ago

How about augmenting the opponent policy model with a probability of (n + 1) / (n + 2) of mirroring in symmetric positions where the opponent has played n moves?
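
One way to read this suggestion, as a sketch: mix the network's opponent policy with a point mass on the mirroring move, weighted by (n + 1) / (n + 2). That weight has the same form as Laplace's rule of succession after n mirrored moves out of n. The function names and the dict-based policy are assumptions for illustration and do not correspond to KataGo's internals; the caller is assumed to have checked that the position is symmetric and that `mirror_move` is legal.

```python
def mirror_probability(n):
    """Weight given to the mirroring move after the opponent has played n moves."""
    return (n + 1) / (n + 2)


def augment_opponent_policy(policy, mirror_move, n):
    """Blend the opponent policy with certainty of mirroring.

    `policy` maps moves to probabilities summing to 1. The result is
    (1 - q) * policy plus q on `mirror_move`, so it also sums to 1.
    """
    q = mirror_probability(n)
    mixed = {move: (1.0 - q) * p for move, p in policy.items()}
    mixed[mirror_move] = mixed.get(mirror_move, 0.0) + q
    return mixed


# After n = 2 opponent moves, q = 3/4.
raw = {(15, 15): 0.2, (3, 15): 0.8}
mixed = augment_opponent_policy(raw, (15, 15), 2)
```

With n = 0 the mirroring move gets weight 1/2, and the weight approaches 1 as the mirroring continues, so the search increasingly expects the opponent to keep mirroring.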

lightvector commented 4 years ago

I can report first-hand that modifying the policy in the kind of way you suggest does work, although it is a very limited and partial piece of the puzzle. :)

However, this thread is pretty old, and it looks like I forgot about it when implementing some mild mirror handling. Yep: between the last post in this thread and now, KataGo has implemented some experimental logic for this. It is not foolproof and will still have a lot of problems, especially if you have a significant reverse komi, or even worse, if you have a free-placement handicap where black places multiple stones around tengen!

Such extreme cases are not going to be handleable without fundamentally changing the approach and possibly doing things during training too, but less extreme cases should be a little better with this logic (even if not perfectly consistent). You can try it if you like. The logic is disabled by default, but if you add antiMirror=True in the gtp config, you can enable it.