LeelaChessZero / lc0

The rewritten engine, originally for tensorflow. Now all other backends have been ported here.
GNU General Public License v3.0

Possible improvement to centipawn output? #728

yuzisee closed this issue 5 years ago

yuzisee commented 5 years ago

In TCEC Superfinal 14, Leela's centipawn output seemed a little wonky to most observers (mostly that it's too large).

The formula proposed in https://www.chessprogramming.org/Pawn_Advantage,_Win_Percentage,_and_Elo could reduce the absolute magnitude of the centipawn values and might make them easier to read against the ranges that traditional engines produce and viewers expect.

Right now Leela uses `uci_info.score = 290.680623072 * tan(1.548090806 * edge.GetQ(0));` from https://github.com/LeelaChessZero/lc0/blob/7e9190a711d570367193cf068bbfcb9e32d3f868/src/mcts/search.cc#L127, which matches https://github.com/LeelaChessZero/lc0/wiki/Technical-Explanation-of-Leela-Chess-Zero, so the suggestion would be to try something like `uci_info.score = -400.0 * log10(1 / (edge.GetQ(0) * 0.5 + 0.5) - 1);`
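For side-by-side comparison, here is a minimal sketch of the two conversions as free functions. The names `OldCentipawns` and `NewCentipawns` are hypothetical and do not appear in lc0; `q` stands in for the value returned by `edge.GetQ(0)` and is assumed to lie in [-1, 1].

```cpp
#include <cmath>

// Conversion quoted above from search.cc: Q in [-1, 1] -> centipawns.
double OldCentipawns(double q) {
  return 290.680623072 * std::tan(1.548090806 * q);
}

// Proposed conversion: map Q to a win probability w = Q * 0.5 + 0.5,
// then apply the logistic formula from the chessprogramming.org page.
double NewCentipawns(double q) {
  const double w = q * 0.5 + 0.5;
  return -400.0 * std::log10(1.0 / w - 1.0);
}
```

Both functions return 0 at `q = 0`. At `q = 0.5` (a 75% win rate) the old formula gives roughly 284 cp and the proposed one roughly 191 cp.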

oscardssmith commented 5 years ago

Attached is a visualization of the two formulas: the y axis is cp, the x axis is win percentage. Blue is the new proposal.

[Image: screenshot_2019-02-11 desmos graph]

bunkbail commented 5 years ago

Blue in that graph looks great!

bunkbail commented 5 years ago

Larry Kaufman of Komodo mentioned a similar thing about lc0's current implementation of the centipawn conversion on Talkchess:

Nothing wrong with converting win probs to scores in pawn units, we do the same in Komodo MCTS. But we don't output nonsense scores like this. Looks like no one bothered to check whether the conversion formula made any sense.

http://talkchess.com/forum3/viewtopic.php?f=2&t=69896

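Prcuvu commented 5 years ago

| (Q+1)/2 | score (old), pawns | score (new), pawns |
|---------|--------------------|--------------------|
| 0.50    | 0.00               | 0.00               |
| 0.55    | 0.45               | 0.35               |
| 0.60    | 0.93               | 0.70               |
| 0.65    | 1.46               | 1.08               |
| 0.70    | 2.07               | 1.47               |
| 0.75    | 2.84               | 1.91               |
| 0.80    | 3.89               | 2.41               |
| 0.85    | 5.49               | 3.01               |
| 0.90    | 8.42               | 3.82               |
| 0.95    | 16.20              | 5.12               |
| 1.00    | 128.00             | +inf               |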

It looks a little bit linear. For TCEC, it takes a 99.68% win rate to reach a winning score of 10.00. Additionally, +inf is not good to display as a score and should be capped at a threshold.
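The 99.68% figure can be checked by inverting the proposed formula. In the sketch below, $w = (Q+1)/2$ is the win rate and $s$ is the score in centipawns; both symbols are introduced here for illustration only.

```latex
s = -400 \log_{10}\!\left(\frac{1}{w} - 1\right)
\quad\Longleftrightarrow\quad
w = \frac{1}{1 + 10^{-s/400}}

s = 1000 \;(\text{i.e. } 10.00 \text{ pawns}):\qquad
w = \frac{1}{1 + 10^{-2.5}} \approx \frac{1}{1.003162} \approx 0.9968
```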

alreadydone commented 5 years ago

Q=1.00 won't happen unless a win is found (by certainty propagation), barring rounding error, right? In that case M+xx can be displayed instead of +inf. However, a cp value larger than, say, 1000 could happen, and maybe we should still cap it at 128.00, as the old formula does.
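One way the cap could look is sketched below. This is not the merged change: `CappedCentipawns` and `kMaxCentipawns` are hypothetical names, the limit of 12800 cp (128.00 pawns) is taken from the old formula's value at Q = 1, and `q` is assumed to lie in [-1, 1]. It needs C++17 for `std::clamp`.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical cap in centipawns: 128.00 pawns, the old formula's value at Q = 1.
constexpr double kMaxCentipawns = 12800.0;

// Proposed conversion with the result clamped to
// [-kMaxCentipawns, kMaxCentipawns]. Assumes q is within [-1, 1].
double CappedCentipawns(double q) {
  const double w = q * 0.5 + 0.5;
  // At w = 1 the log argument is 0 and cp is +inf; at w = 0 it is -inf.
  const double cp = -400.0 * std::log10(1.0 / w - 1.0);
  return std::clamp(cp, -kMaxCentipawns, kMaxCentipawns);
}
```

The clamp turns the infinities at Q = ±1 into ±12800 cp. A value of `q` pushed slightly outside [-1, 1] by rounding error would produce NaN, so a real implementation would probably clamp `q` first.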

oscardssmith commented 5 years ago

Now that this was merged, close?