lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

Per-intersection value configuration #332

Open pdeblanc opened 3 years ago

pdeblanc commented 3 years ago

I think this is a pretty big ask, but I'd like to be able to configure, e.g. in the game rules, a different point value for each intersection on the board. When the final score is evaluated, each intersection would score its specified point value for whichever player owns it.
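To make the request concrete, here is a minimal sketch of the scoring rule described above. It assumes final ownership is already known for every intersection. The function name `weighted_area_score` and the +1/-1/0 ownership encoding are illustrative assumptions, not anything KataGo provides.

```python
from typing import List


def weighted_area_score(owner: List[List[int]], value: List[List[float]]) -> float:
    """Return Black's total minus White's total under a per-intersection value map.

    owner[y][x] is +1 if Black owns the point, -1 if White owns it, 0 if neutral.
    value[y][x] is the configured point value of that intersection.
    (Both encodings are hypothetical, chosen only for this sketch.)
    """
    if len(owner) != len(value):
        raise ValueError("owner and value must have the same number of rows")
    total = 0.0
    for owner_row, value_row in zip(owner, value):
        if len(owner_row) != len(value_row):
            raise ValueError("owner and value rows must have the same length")
        for o, v in zip(owner_row, value_row):
            # Each intersection contributes its value to whoever owns it.
            total += o * v
    return total


# Tiny 3x3 example: the bottom row is worth 5 points per intersection.
owner = [
    [1, 1, -1],
    [1, 0, -1],
    [1, -1, -1],
]
value = [
    [1, 1, 1],
    [1, 1, 1],
    [5, 5, 5],
]
print(weighted_area_score(owner, value))
```

With every value set to 1 this reduces to ordinary area scoring, before komi.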

This could be useful for analysis, e.g. to ask whether it's possible to save a particular group, or to find the highest-scoring endgame sequence in a particular area.

lightvector commented 3 years ago

Yes, the evaluation of the position in KataGo is done via neural net. The current neural net has no experience with this. So doing this would require training a neural net with a new input indicating how many points each area of the board is worth, possibly training from scratch.

A tricky part as well is that it is best if the situations and shapes a neural net encounters in training are at least loosely "representative" of the situations and shapes it will encounter in actual usage. This means that you would also have to develop automated code to initialize the point-scoring maps in self-play in ways that would be "representative" of how people would use them. For example, randomizing the value of every intersection independently would probably not work so well, because in actual use people would use maps that look very clustered (e.g. all of this entire region but none of that entire region) and non-random.
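As a rough illustration of the difference between the two kinds of maps, the sketch below generates an independently randomized map and a clustered one. The use of a few random rectangles as "regions", and the function names, are assumptions made for this example only. This is not existing KataGo self-play code, and a real initializer would likely need more varied shapes.

```python
import random
from typing import List


def independent_map(size: int, rng: random.Random) -> List[List[int]]:
    """Every intersection gets value 0 or 1 independently (likely unrepresentative)."""
    return [[rng.choice([0, 1]) for _ in range(size)] for _ in range(size)]


def clustered_map(size: int, rng: random.Random, num_regions: int = 2) -> List[List[int]]:
    """Value 1 inside a few random rectangles, 0 elsewhere.

    Rectangles are a hypothetical stand-in for "this entire region counts,
    that entire region does not".
    """
    grid = [[0] * size for _ in range(size)]
    for _ in range(num_regions):
        x0 = rng.randrange(size)
        x1 = rng.randrange(x0, size)  # inclusive right edge, x1 >= x0
        y0 = rng.randrange(size)
        y1 = rng.randrange(y0, size)  # inclusive bottom edge, y1 >= y0
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                grid[y][x] = 1
    return grid


rng = random.Random(0)
for row in clustered_map(9, rng):
    print("".join("#" if cell else "." for cell in row))
```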

So basically, this is not going to happen unless someone wants to implement it and spend the time to re-train almost from scratch.

You might instead be interested in the allowMoves and avoidMoves options, documented in these places:
https://github.com/lightvector/KataGo/blob/master/docs/GTP_Extensions.md
https://github.com/lightvector/KataGo/blob/master/docs/Analysis_Engine.md
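For tool developers, a query that restricts the search to a local area might be built roughly as follows. This is a sketch only. The field names (`id`, `moves`, `rules`, `komi`, `boardXSize`, `boardYSize`, `analyzeTurns`, `allowMoves` with `player`, `moves`, `untilDepth`) follow my reading of the Analysis_Engine.md document linked above and should be checked against it. The helper `build_local_query`, the example moves, and the chosen rules and komi are hypothetical.

```python
import json
from typing import Dict, List


def build_local_query(
    query_id: str,
    moves: List[List[str]],
    allowed: List[str],
    player: str,
    until_depth: int = 1,
    board_size: int = 19,
) -> Dict:
    """Build one analysis query that only lets `player` consider `allowed` moves.

    Assumption: the analysis engine reads one JSON object per line on stdin.
    """
    return {
        "id": query_id,
        "moves": moves,                  # e.g. [["B", "Q16"], ["W", "D4"]]
        "rules": "tromp-taylor",         # example value; pick the rules you need
        "komi": 7.5,                     # example value
        "boardXSize": board_size,
        "boardYSize": board_size,
        "analyzeTurns": [len(moves)],    # analyze the position after all moves
        "allowMoves": [
            {"player": player, "moves": allowed, "untilDepth": until_depth}
        ],
    }


query = build_local_query(
    "local-endgame-1",
    moves=[["B", "Q16"], ["W", "D4"]],
    allowed=["R2", "S2", "R1", "S1"],    # hypothetical local area
    player="B",
)
line = json.dumps(query)  # a single line, ready to write to the engine's stdin
print(line)
```

Restricting only the first ply (`untilDepth` of 1) keeps the follow-ups unconstrained, so the search can still account for replies elsewhere on the board.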

A big caution however for one of your aims: "highest scoring endgame sequence in a particular area" has a massive potential to be misleading and produce bad results. For example:

[Image: board diagram of the local endgame sequence discussed below, including the move W3]

Technically W3 here is the "highest scoring endgame sequence" in this area, if nothing else matters but points in this region alone. You can probably see the issue here: simply finding the maximum local score achievable by alternating play does not take into account sente and gote, or sente and gote followups, which are massively important in the endgame. If you are at all experienced, they are probably baked into your intuition by now, even in tons of situations where you don't realize it, like this simple second-line hane. In almost all realistic game situations, W3 is a big mistake compared to simply connecting, which any beginner is very quickly taught.

Anyways, that said, do allowMoves and avoidMoves give you some ability to do what you are thinking of? If you're an app or analysis tool developer, please feel free to try them out!

pdeblanc commented 3 years ago

OK, I'm aware of the gote issue, and that this would require expensive training. I'll try experimenting with allowMoves for now.