lightvector / KataGo

GTP engine and self-play learning in Go
https://katagotraining.org/

[need KataGo GTP Original Document] final_score different from printsgf_RE #596

Open HackYardo opened 2 years ago

HackYardo commented 2 years ago
showboard
= MoveNum: 30 HASH: BB7F884B54E46AAB0B3CA7BD3AA29DDE
   A B C D E F G
 7 . . . . . . .
 6 O O O X O . .
 5 X X X O O O1.
 4 . X . X X O .
 3 . . X O X O .
 2 . . X . X O O
 1 . . . . X X O
Next player: Black
Rules: {"friendlyPassOk":true,"hasButton":true,"ko":"POSITIONAL","komi":7.0,"scoring":"AREA","suicide":false,"tax":"NONE","whiteHandicapBonus":"N"}
B stones captured: 0
W stones captured: 1

printsgf
= (;FF[4]GM[1]SZ[7]PB[]PW[]HA[0]KM[7]RU[koPOSITIONALscoreAREAtaxNONEsui0button1whbNfpok1]RE[W+5.5];B[dd];W[cd];B[ce];W[de];B[cf];W[dc];B[ed];W[ec];B[cc];W[fd];B[ee];W[cb];B[bc];W[fe];B[ef];W[bb];B[bd];W[ff];B[fg];W[ab];B[ac];W[gf];B[db];W[eb];B[eg];W[gg];B[];W[fc];B[];W[]C[result=W+5.5])

final_score
= W+7.5

7.5 != 5.5

lightvector commented 2 years ago

Thanks for reporting this! What is going on is that "final_score" uses a smart evaluation that queries the neural net to determine dead stones and predict the final territories, whereas the code that generates the SGF for "printsgf" doesn't use the neural net right now, only strict Tromp-Taylor-like scoring. Under strict Tromp-Taylor scoring, dead stones need to be captured, or else they prevent the territory from being scored, which gives the 5.5 result above.
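To illustrate why uncaptured dead stones block territory, here is a minimal sketch of strict Tromp-Taylor-style area counting on the position from the showboard output above. It is not KataGo's code; `area_score` and `BOARD` are hypothetical names, and the board is transcribed by hand with the move marker removed. An empty region counts for a color only if it borders stones of that color alone.

```python
from collections import deque

# Position from the showboard output, top row (7) first. X = Black, O = White.
BOARD = [
    ".......",
    "OOOXO..",
    "XXXOOO.",
    ".X.XXO.",
    "..XOXO.",
    "..X.XOO",
    "....XXO",
]

def area_score(rows):
    """Return (black, white) area: stones plus empty regions bordered by one color only."""
    height, width = len(rows), len(rows[0])
    black = sum(row.count("X") for row in rows)
    white = sum(row.count("O") for row in rows)
    seen = set()
    for y in range(height):
        for x in range(width):
            if rows[y][x] != "." or (y, x) in seen:
                continue
            # Flood-fill one empty region and record which colors it touches.
            size, borders = 0, set()
            queue = deque([(y, x)])
            seen.add((y, x))
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue
                    cell = rows[ny][nx]
                    if cell == ".":
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                    else:
                        borders.add(cell)
            if borders == {"X"}:
                black += size
            elif borders == {"O"}:
                white += size
            # A region touching both colors (or none) is neutral and scores nothing.
    return black, white

print(area_score(BOARD))
```

In this position only the single empty point at C4 is bordered by one color, so every other empty region stays unscored as long as the dead stones remain on the board.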

I'll think about how to fix this. It's a bit tricky, because at the point of trying to generate an SGF, the code no longer has access to the neural net or to KataGo's search algorithm to try to determine dead stones. In the meantime, your options are either:

or

KataGo also has a GTP extending document, but how does KataGo execute GTP original commands?

KataGo should execute them the way that all other properly-written Go programs should execute GTP commands. Are you simply asking for a link to the original GTP protocol spec, or are you asking something else? https://www.lysator.liu.se/~gunnar/gtp/

EZonGH commented 2 years ago

The "rules" line and the sgf KM field are showing komi of 7.0 points and 7 points respectively. Why are the two competing scores both half-point scores instead of round integers?

HackYardo commented 2 years ago

@lightvector I don't think KataGo itself needs to write the smart score into the SGF file; GUIs can do that work. If there is nothing else, I will close this issue.

@EZonGH See Button on this page for the 0.5.

. . . . . . .
O O O X O . .
X X X O O O .
. X X X X O .
. . X O X O .
. . X . X O O
. . . . X X O
B = 14
W = 13
S = 13-14+7-0.5(because B passed first)
RE = W+5.5
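
The arithmetic above can be written as a small helper. This is a sketch only: `result_string` and its parameters are hypothetical names, not KataGo code, and it assumes the button is worth 0.5 points to whichever side passes first, as in the calculation above.

```python
def result_string(black_area, white_area, komi, has_button=False, first_passer=None):
    """Format an SGF-style result from area counts, komi and an optional button."""
    margin = white_area + komi - black_area  # positive means White is ahead
    if has_button and first_passer == "B":
        margin -= 0.5  # Black took the button
    elif has_button and first_passer == "W":
        margin += 0.5  # White took the button
    if margin == 0:
        return "0"  # SGF notation for a draw
    winner = "W" if margin > 0 else "B"
    return f"{winner}+{abs(margin):.1f}"

# B = 14, W = 13, komi 7, Black passed first
print(result_string(14, 13, 7.0, has_button=True, first_passer="B"))  # W+5.5
```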

monsterkodi commented 2 years ago

Hello, I stumbled upon this problem today.

I was expecting gnugo and katago to return the same final_score after I let them play against each other. But they don't.

gnugo has a command called estimate_score. I think that is a better term for any AI generated score.

It is also strange that the resulting scores are so far off between gnugo and katago at the final stage of a game. I would have thought that the score estimates get closer and closer to the final result the more the game progresses.

monsterkodi commented 2 years ago

This is an example 7x7 game:

showboard
MoveNum: 48 HASH: 6789C0112DF7F7B36E6B6F869651E0FB
   A B C D E F G
 7 X O O O O . O
 6 X X X X O O O
 5 . . X X X O O
 4 . X X X X X O
 3 X . X O O O O1
 2 . X X X O . .
 1 . O X O O O .
Next player: Black
Rules: {"friendlyPassOk":false,"hasButton":false,"ko":"POSITIONAL","komi":0.0,"scoring":"AREA","suicide":true,"tax":"NONE","whiteHandicapBonus":"0"}
B stones captured: 2
W stones captured: 3
final_score 
W+1.0

gnugo does report a B+5 score for the same game, which I think is the better behaviour. Note that friendlyPassOk=false isn't really solving the problem if the other client doesn't or can't do it.

lightvector commented 2 years ago

@monsterkodi - your problem is not the same as the one in this issue, and so far the behavior you've reported looks correct; only your configuration needs to change. You need to set friendlyPassOk to true, not false.

Basically, friendlyPassOk = false means you are saying that the rules require capturing stones like B1. Any stones on the board that are not pass-dead (B1 is not pass-dead) when the game reaches the end are mandated by the rules to be treated as alive for scoring. If those are the rules, then W+1 is the correct score.

If you want B+5, then you need to set friendlyPassOk = true, which says it's okay to pass and end the game without capturing stones like B1, and still treat them as dead.
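
One way to check which setting is in effect is to read the "Rules:" line that showboard prints, since it is plain JSON. The helper below is a minimal sketch under that assumption; `parse_rules` and `SAMPLE` are hypothetical names, and the sample text is shortened from the showboard output quoted earlier in the thread.

```python
import json

# Shortened showboard output; only the "Rules:" line matters here.
SAMPLE = """MoveNum: 48 HASH: 6789C0112DF7F7B36E6B6F869651E0FB
Next player: Black
Rules: {"friendlyPassOk":false,"hasButton":false,"ko":"POSITIONAL","komi":0.0,"scoring":"AREA","suicide":true,"tax":"NONE","whiteHandicapBonus":"0"}
B stones captured: 2
"""

def parse_rules(showboard_text):
    """Return the rules dict from the 'Rules:' line of showboard output."""
    for line in showboard_text.splitlines():
        if line.startswith("Rules:"):
            return json.loads(line[len("Rules:"):])
    raise ValueError("no 'Rules:' line found")

rules = parse_rules(SAMPLE)
if not rules["friendlyPassOk"]:
    print("Uncaptured stones that are not pass-dead will be scored as alive.")
```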

monsterkodi commented 2 years ago

Thanks @lightvector for the quick reply and the explanation! You are right, I hadn't configured it until now. I am sorry for that. I could get it to score like gnugo with scoringRule = TERRITORY. Thanks for your help and the nice Go program!

HackYardo commented 2 years ago

I tried v1.11, and final_score seems stable when the model is a big one. However, printsgf is unstable, see below:

showboard
= MoveNum: 15 HASH: 7B6063B15B8C57F0F79B64086B61D500
   A B C D E
 6 . X . . .
 5 . O X O .
 4 . O1X O .
 3 . X O . .
 2 . X O . .
 1 . . X . .
Next player: White
Rules: {"friendlyPassOk":true,"hasButton":true,"ko":"POSITIONAL","komi":7.0,"scoring":"AREA","suicide":false,"tax":"ALL","whiteHandicapBonus":"N-1"}
B stones captured: 0
W stones captured: 0

final_score
= W+32.5

printsgf
= (;FF[4]GM[1]SZ[5:6]PB[]PW[]HA[0]KM[7]RU[koPOSITIONALscoreAREAtaxALLsui0button1whbN-1fpok1];B[cc];W[cd];B[bd];W[dc];B[cb];W[ce];B[be];W[db];B[cf];W[bb];B[ba];W[bc];B[];W[])

play b pass # the game is over now
=

printsgf # TL;DR 34.5
= (;FF[4]GM[1]SZ[5:6]PB[]PW[]HA[0]KM[7]RU[koPOSITIONALscoreAREAtaxALLsui0button1whbN-1fpok1]RE[W+34.5];B[cc];W[cd];B[bd];W[dc];B[cb];W[ce];B[be];W[db];B[cf];W[bb];B[ba];W[bc];B[];W[];B[]C[result=W+34.5])

final_score
= W+32.5

printsgf # TL;DR 32.5
= (;FF[4]GM[1]SZ[5:6]PB[]PW[]HA[0]KM[7]RU[koPOSITIONALscoreAREAtaxALLsui0button1whbN-1fpok1]RE[W+32.5];B[cc];W[cd];B[bd];W[dc];B[cb];W[ce];B[be];W[db];B[cf];W[bb];B[ba];W[bc];B[];W[];B[]C[result=W+32.5])

As you can see, the printsgf result is not stable: it jumps between 32.5 and 34.5 if you repeat it many times.
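
A client that wants to detect this kind of mismatch could pull the RE property out of the printsgf output and compare it with the final_score reply. The following is a rough sketch, not a full SGF parser; `sgf_result` and `results_agree` are hypothetical helper names, and the SGF strings are shortened from the transcripts above.

```python
import re

def sgf_result(sgf):
    """Return the value of the RE[...] property, or None if the SGF has none."""
    # The lookbehind avoids matching a longer property name that ends in "RE".
    match = re.search(r"(?<![A-Z])RE\[([^\]]*)\]", sgf)
    return match.group(1) if match else None

def results_agree(sgf, final_score_reply):
    """Compare the SGF result with a GTP reply such as '= W+32.5'."""
    score = final_score_reply.lstrip("=").strip()
    return sgf_result(sgf) == score

sgf_a = "(;FF[4]GM[1]SZ[5:6]KM[7]RU[koPOSITIONALscoreAREAtaxALLsui0button1whbN-1fpok1]RE[W+34.5];B[cc];W[cd])"
sgf_b = "(;FF[4]GM[1]SZ[5:6]KM[7]RU[koPOSITIONALscoreAREAtaxALLsui0button1whbN-1fpok1]RE[W+32.5];B[cc];W[cd])"

print(results_agree(sgf_a, "= W+32.5"))  # False
print(results_agree(sgf_b, "= W+32.5"))  # True
```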