lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

difference in winrate and scorelead between MoveInfos and RootInfo #922

Open cofdam opened 7 months ago

cofdam commented 7 months ago

Hello, thank you for developing this excellent bot program.

I'm conducting a personal research project using KataGo and have encountered a problem for which I seek advice.

I have around 100,000 SGF files of Go matches from the 1940s. My objective is to employ KataGo’s analysis engine to determine the discrepancy in win rate and score lead between the moves recorded in these SGF files and KataGo's "best" move recommendations for each position.

Initially, I thought I could calculate this by subtracting the win rate in "rootInfo" from the win rate of the first recommended move (the "moveInfos" entry with "order" == 0). However, I've noticed that even when the move played matches KataGo's best recommendation, the two win rates don't match.

For instance: [image]

In this example, the "best" prefix denotes variables from KataGo's first recommendation, while "root" pertains to the actual moves played. Here, despite the "Move" by white being "C15" – exactly what KataGo recommended as "bestmove" – "rootscoreLead" and "rootwinrate" differ from "bestscoreLead" and "bestwinrate".

Is this discrepancy by design, or might it indicate an error in my approach? If this is by design, could you provide any advice on a workaround?

For context, my analysis was conducted on Google Colab using a V100 GPU, with maxVisits set to 1000, and all SGF files were processed under identical settings. I utilized KataWrap for batch processing.

Thank you in advance.

lightvector commented 7 months ago

Hi cofdam!

This looks like intended behavior. Generally, in MCTS, the winrate or score or any other similar statistic about a node is based on a weighted average over all the moves available at that node, rather than solely that of the believed best move. For example, all else equal, a position in which every reasonable move leads to a 94-95% winrate may be more likely to be truly winning than one in which only one move has a 95% winrate, especially if the search is not very deep and the apparent 95% winrate of that one move might be a standalone outlier.
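As a rough illustration of this averaging, here is a toy sketch. It assumes plain visit-count weighting and made-up numbers; the function name `node_winrate` is hypothetical, and KataGo's real weighting is more involved than this, so treat it as a simplified model rather than the actual implementation.

```python
# Toy model: a node's winrate as a visit-weighted average of its children.
# ASSUMPTION: weights are raw visit counts; KataGo's real weighting differs.

def node_winrate(children):
    """children: list of (visits, winrate) pairs, all from the same perspective."""
    total_visits = sum(visits for visits, _ in children)
    return sum(visits * winrate for visits, winrate in children) / total_visits

# Made-up example: one heavily searched move and two lesser alternatives.
children = [(800, 0.95), (150, 0.90), (50, 0.80)]

root_winrate = node_winrate(children)
# The "best" move here is simply the one with the most visits.
best_winrate = max(children, key=lambda c: c[0])[1]

print(f"root winrate: {root_winrate:.3f}")
print(f"best winrate: {best_winrate:.3f}")
```

Even in this simplified model the node's winrate and the best move's winrate differ slightly, because the less-visited alternatives pull the average away from the best move's value.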

The weighting increasingly concentrates on the believed best move as the node is searched more, and of course the root node has the most search of all, which is why you observe only a tiny difference. When applied throughout the whole search tree, especially at deep nodes without many visits, this overall works better in practice at producing strong play and analysis than weighting only the best move, although in places like the root node and when there are a lot of visits, the difference may not matter much.
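To see why the gap shrinks with more search, consider a toy model with only two candidate moves, where `best_share` is the fraction of the weight on the believed best move. The function name, the two-move setup, and the winrates are all assumptions chosen for illustration, not anything taken from KataGo itself.

```python
# Toy model: gap between a node's averaged winrate and its best move's winrate
# as the weighting concentrates on the best move.
# ASSUMPTION: two candidate moves with fixed, made-up winrates.

def root_minus_best(best_share, best_winrate=0.95, other_winrate=0.90):
    """Averaged node winrate minus the best move's winrate."""
    root = best_share * best_winrate + (1.0 - best_share) * other_winrate
    return root - best_winrate

shares = (0.5, 0.8, 0.99)
gaps = [root_minus_best(share) for share in shares]

for share, gap in zip(shares, gaps):
    print(f"share on best move = {share:.2f}, root - best = {gap:+.4f}")
```

In this model the gap is proportional to the weight left on the other move, so it goes to zero as the search concentrates on the best move.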

There isn't anything to "work around" here; rather, it is up to you to decide which winrate you wish to use for your purpose. The root winrate is the winrate computed for the root node in the same weighted-average manner as all other nodes. The best move winrate is solely the winrate of the best move, without averaging in the other moves at the root. However, the best move winrate is itself still a weighted average of the winrates that could follow after the various opponent replies, rather than just the single best opponent reply, and each of those is in turn a weighted average over the possible replies to that reply, rather than only the best one, and so on.
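If the goal is to measure how much a played move loses relative to the top recommendation, one option is to compare entries within "moveInfos" for the same position, so that both numbers are computed in the same way. Below is a minimal sketch under some assumptions: the response is a parsed analysis-engine result with "rootInfo" and "moveInfos" fields, the played move happens to appear in "moveInfos", and the helper name `compare_to_best` and the sample values are made up for illustration. Winrates are reported from a configurable perspective, so the sign may need flipping depending on your settings and whose turn it is.

```python
# Sketch: compare a played move against the top recommendation using the
# per-move statistics, instead of mixing rootInfo with a moveInfos entry.
# ASSUMPTION: `response` is one parsed JSON result from the analysis engine.

def compare_to_best(response, played_move):
    """Return loss statistics for played_move, or None if it was not searched."""
    move_infos = sorted(response["moveInfos"], key=lambda m: m["order"])
    best = move_infos[0]
    played = next((m for m in move_infos if m["move"] == played_move), None)
    if played is None:
        # The played move received no visits, so it has no entry to compare.
        return None
    root = response["rootInfo"]
    return {
        "best_move": best["move"],
        # Signs depend on the reporting perspective; flip if needed.
        "winrate_loss": best["winrate"] - played["winrate"],
        "score_loss": best["scoreLead"] - played["scoreLead"],
        # Usually small but nonzero, even when played_move is the best move.
        "root_minus_best_winrate": root["winrate"] - best["winrate"],
    }

# Made-up values for illustration only.
response = {
    "rootInfo": {"winrate": 0.412, "scoreLead": -1.8},
    "moveInfos": [
        {"move": "C15", "order": 0, "winrate": 0.418, "scoreLead": -1.6},
        {"move": "D16", "order": 1, "winrate": 0.395, "scoreLead": -2.1},
    ],
}

result = compare_to_best(response, "C15")
print(result)
```

With this approach a played move that matches the top recommendation gets a loss of exactly zero. Moves missing from "moveInfos" need separate handling, for example by analyzing the position after the move was played.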