yaneurao / YaneuraOu

YaneuraOu is the World's Strongest Shogi engine (AI player), WCSC29 1st winner, educational and USI compliant engine.
GNU General Public License v3.0

Wrong evaluation #264

Closed WandererXII closed 1 year ago

WandererXII commented 1 year ago

Hello,

This was reported to me: https://github.com/WandererXII/lishogi/issues/661. I narrowed it down to the following. Compare these two sets of commands: although they lead to the same position, the evaluations differ.

A position sfen l5snl/3k5/1pnsg1b2/p1ppp1g1p/5pp2/PPP1P4/2SP5/K1GB5/LNG4+r1 w 2Prsnl3p 78 moves 4e4f 6h4f P*4e results in:

info depth 1 seldepth 1 score cp -4966 nodes 187 nps 187000 time 1 pv 4f6h N*8d 5f5e 7c6e 5e5d 6c5d P*5e 5d5e P*5i 6b6c P*4d 3c4d 8f8e 6e7g+ 6h7g L*9g 8i9g 9d9e 7g9e 9a9e 8e8d 2i5i
info depth 2 seldepth 2 score cp 31111 nodes 525 nps 525000 time 1 pv 4f6h N*8d 5f5e 7c6e 5e5d 6c5d P*5e 5d5e P*5i 6b6c P*4d 3c4d 8f8e 6e7g+ 6h7g L*9g 8i9g 9d9e 7g9e 9a9e 8e8d 2i5i
info depth 3 seldepth 2 score cp 31111 nodes 824 nps 824000 time 1 pv 4f6h N*8d 5f5e 7c6e 5e5d 6c5d P*5e 5d5e P*5i 6b6c P*4d 3c4d 8f8e 6e7g+ 6h7g L*9g 8i9g 9d9e 7g9e 9a9e 8e8d 2i5i
...
info depth 57 seldepth 2 score cp 31111 nodes 195599 nps 2126076 time 92 pv 4f6h 3c7g+ 8i7g 2i7i 7h7i
bestmove 4f6h ponder 3c7g+

B While this: position sfen l5snl/3k5/1pnsg1b2/p1ppp1g1p/6p2/PPP1Pp3/2SP5/K1GB5/LNG4+r1 b 2Prsnl3p 79 moves 6h4f P*4e results in:

info depth 1 seldepth 1 score cp -5384 nodes 180 nps 180000 time 1 pv P*4i 4e4f P*2b 3c7g+ 7h7g
info depth 2 seldepth 2 score cp -5086 nodes 2578 nps 644500 time 4 pv 4f5g 2i2c 5g7e 7d7e
...
info depth 12 seldepth 15 score cp -5062 nodes 50116 nps 849423 time 59 pv 4f6h N*8d 5f5e 7c6e P*4i 2i4i 5e5d 6e7g+ 5d5c+ 6b5c 6h7g 3c7g+ 7h7g 4i7i B*7a G*6b
info depth 13 seldepth 16 score cp -4892 nodes 78703 nps 894352 time 88 pv 4f6h N*8d 5f5e 7c6e 5e5d 6c5d P*5e 5d5e P*5i 6b6c 5i5h 6e7g+
bestmove 4f6h ponder N*8d

Here are images to better see it:

[image: A, pos_wrong]

[image: B, pos1_correct]

Reproducible with https://github.com/mizar/YaneuraOu.wasm (k-p). Locally on Linux, I tested version 7.63 compiled from source (https://github.com/WandererXII/shoginet/blob/main/build-yaneuraou.sh). As the NNUE eval file I used suisho (https://drive.google.com/drive/folders/1FuGMsVHfKvjTcTLdVm3K4Q3sxemh5S-B, https://drive.google.com/file/d/1ESoJ30bE1pblUSkjNznT8B6W04GJeL3Y/view and 水匠3(201231)).

Can I ask for advice? I'm not very familiar with inner workings of engines, so is it a bug, wrong eval file, or am I doing something incorrectly?

Thanks

yaneurao commented 1 year ago

This is the correct evaluation value.

Please see the following article : 31111

WandererXII commented 1 year ago

Thank you for your answer, I should have noticed that.

Is there some way to force it to output the 'proper' evaluation, so I can avoid this (the spike is the 31111 evaluation): [image]

yaneurao commented 1 year ago

Since this is a superior position, I believe this is the correct and proper evaluation value.

However, if you do not want to output this value, you can either not output this when the internal evaluation value is VALUE_SUPERIOR, or you can output the previous evaluation value.

for example, in usi.cpp(204):

    ss  << "info"
        << " depth "    << d
        << " seldepth " << rootMoves[i].selDepth
#if defined(USE_PIECE_VALUE)
        << " score "    << USI::value(v)
#endif
        ;

->

    ss  << "info"
        << " depth "    << d
        << " seldepth " << rootMoves[i].selDepth;
#if defined(USE_PIECE_VALUE)
    // Skip the score field when the search hit a superior position.
    if (v != VALUE_SUPERIOR)
        ss  << " score "    << USI::value(v);
#endif
WandererXII commented 1 year ago

Thank you. I'm not familiar enough with the subject to understand why 'superior position' needs to be treated as a special case, but from the outside it seems odd: a positive cp here should indicate that gote is winning, which is not the case.

I seem to be able to get around it by using only 'sfen' with no 'moves'.

Instead of doing this:

position sfen lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B5R1/LNSGKGSNL b - 1 moves
position sfen lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B5R1/LNSGKGSNL b - 1 moves 3i4h
position sfen lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B5R1/LNSGKGSNL b - 1 moves 3i4h 3c3d
position sfen lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B5R1/LNSGKGSNL b - 1 moves 3i4h 3c3d 9g9f
...

I would do this:

position sfen lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B5R1/LNSGKGSNL b - 1
position sfen lnsgkgsnl/1r5b1/ppppppppp/9/9/9/PPPPPPPPP/1B3S1R1/LNSGKG1NL w - 2
position sfen lnsgkgsnl/1r5b1/pppppp1pp/6p2/9/9/PPPPPPPPP/1B3S1R1/LNSGKG1NL b - 3
position sfen lnsgkgsnl/1r5b1/pppppp1pp/6p2/9/P8/1PPPPPPPP/1B3S1R1/LNSGKG1NL w - 4

Is it safe to assume the performance isn't affected?

Thank you again for your time and your work on YaneuraOu.

yaneurao commented 1 year ago

About superior position

Historically speaking, it was the Shogi AI named Bonanza that first introduced the idea of superior position.

For example, suppose your opponent drops a pawn in front of your gold and you retreat the gold; your opponent then advances the pawn, and you capture it with the gold.

At this point, the pieces on the board are the same as in the original position, and only the number of pawns in your hand has increased. This kind of situation is called a superior position.
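
The definition above amounts to comparing the pieces in hand of two positions whose boards and side to move are identical. The following is a minimal sketch of that comparison, not YaneuraOu's actual implementation: the `HandCounts` type, the `RepetitionSketch` enum, and both function names are hypothetical, and the engine's real `Hand` type is assumed to be packed differently.

```cpp
#include <array>
#include <cstddef>

// Hypothetical hand representation: counts of each droppable piece type,
// in the order pawn, lance, knight, silver, gold, bishop, rook.
// This is an illustration only, not the engine's real Hand type.
using HandCounts = std::array<int, 7>;

// True if hand `a` holds at least as many of every piece type as hand `b`.
bool hand_equal_or_superior_sketch(const HandCounts& a, const HandCounts& b) {
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] < b[i])
            return false;
    return true;
}

enum class RepetitionSketch { None, Same, Superior, Inferior };

// Compare the current position with an earlier one that has the same board
// and the same side to move. `now` and `before` are the side-to-move's hands.
RepetitionSketch classify_sketch(const HandCounts& now, const HandCounts& before) {
    if (now == before)
        return RepetitionSketch::Same;      // ordinary repetition
    if (hand_equal_or_superior_sketch(now, before))
        return RepetitionSketch::Superior;  // same board, strictly more in hand
    if (hand_equal_or_superior_sketch(before, now))
        return RepetitionSketch::Inferior;  // same board, strictly less in hand
    return RepetitionSketch::None;          // hands are not comparable
}
```

In the gold-and-pawn example, the board is unchanged and the hand has gained one pawn, so this sketch would classify the position as `Superior`.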

Superior position during the search

If a superior position is encountered during the search, the search can be terminated there, because we can say that the side reaching it has definitely gained.

In fact, there may be other moves that are more beneficial, but at least we can say that the opponent's pawn drop described above is definitely a bad move, and if this move was generated during the search, it should be eliminated.

Why do we need these special values?

So the search cannot return the usual evaluation value in this case, because the search is terminated there.

The search engine needs to inform the GUI of this, but since such a thing is not specified in the USI protocol, a special value is needed here.

That value is 31111 (internally 28000), which means that the search has encountered a superior position. Long-time users of Shogi AI know that this value comes from encountering a superior position.
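
For a GUI that cannot rebuild the engine, one option is to recognize the special value on the receiving side. The sketch below parses a USI `info` line and reports a centipawn score only when it is not the sentinel. The function and constant names are hypothetical, and treating the negated value as the matching "inferior" sentinel is an assumption that this thread does not confirm.

```cpp
#include <sstream>
#include <string>

// Assumption: 31111 is the cp value printed for a superior position, as
// described in this thread. The constant name is hypothetical.
constexpr int kSuperiorScoreCp = 31111;

// Extract "score cp N" from a USI info line into `cp_out`.
// Returns false if the line has no cp score, or if the score is the
// superior-position sentinel (or its negation, which is an assumption).
bool usable_cp_score(const std::string& info_line, int& cp_out) {
    std::istringstream in(info_line);
    std::string token;
    while (in >> token) {
        if (token != "score")
            continue;
        int cp = 0;
        // Only "score cp <int>" is handled; "score mate ..." is rejected.
        if (!(in >> token) || token != "cp" || !(in >> cp))
            return false;
        if (cp == kSuperiorScoreCp || cp == -kSuperiorScoreCp)
            return false;
        cp_out = cp;
        return true;
    }
    return false;
}
```

A caller could keep plotting the previous score whenever this returns false, which mirrors the "output the previous evaluation value" suggestion made earlier in the thread.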

Is it safe to assume the performance isn't affected?

There is no performance problem. However, under the rules of Shogi, repetition by consecutive checks is a foul loss when the same position is reached for the fourth time. If the game is currently at the third occurrence of a position and the engine has not been given the move history that led there, it may return the move (a foul move) that leads to the fourth occurrence.

I seem to be able to get around it by using only 'sfen' with no 'moves'.

So it is the wrong solution.
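
A minimal sketch of the history-preserving approach: the GUI keeps one starting SFEN and appends each played move, so every `position` command carries the full move list the engine needs for repetition detection. The function name `position_commands` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build one "position" command per ply, always from the same starting SFEN
// plus all moves played so far, so the engine sees the whole history.
std::vector<std::string> position_commands(const std::string& start_sfen,
                                           const std::vector<std::string>& moves) {
    std::vector<std::string> commands;
    std::string command = "position sfen " + start_sfen;
    commands.push_back(command);  // the position before any move
    for (std::size_t i = 0; i < moves.size(); ++i) {
        command += (i == 0) ? " moves " : " ";
        command += moves[i];
        commands.push_back(command);
    }
    return commands;
}
```

Calling it with the starting position from the example above and the moves `3i4h`, `3c3d` yields three commands, the last ending in `moves 3i4h 3c3d`.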

yaneurao commented 1 year ago

If you do not want the value of the superior position to be output, I recommend that you delete the following lines and rebuild.

position.cpp(2053):

else {
    // Is this a superior or an inferior position? (The case where the side to move is the opponent is not considered for now.)
    if (hand_is_equal_or_superior(st->hand, stp->hand))
        return REPETITION_SUPERIOR;
    if (hand_is_equal_or_superior(stp->hand, st->hand))
        return REPETITION_INFERIOR;
}
peanatsu commented 1 year ago

consider the following position:

sfen 6lll/6pnk/8p/7P1/7S1/9/9/7NP/6NGK b SP2r2b3g2snl13p 1

lishogi link: https://lishogi.org/analysis/standard/6lll/6pnk/8p/7P1/7S1/9/9/7NP/6NGK_b_SP2r2b3g2snl13p_1

sente to play:

[image: lishogi]

sente has S'2c (silver drop to 2c) which is mate in this position. if, however, sente is stupid and promotes his pawn, gote takes and sente drops his pawn again: P2c+ Kx2c P'2d

in that position gote's king is in check and has precisely 2 legal moves: K1b which is a terrible blunder (because it will get mated with S'2c) or K3c, which gives gote a winning position.

YaneuraOu gives as best move the terrible blunder K1b: [image: lishogi]

this doesn't just happen in lishogi, of course. you get the same result in shogidokoro, for example, and it's independent of the evaluation file used. If you want to try it yourself:

position sfen 6lll/6pnk/8p/7P1/7S1/9/9/7NP/6NGK b SP2r2b3g2snl13p 1 moves 2d2c+ 1b2c P*2d

this is a simplified example, but you can imagine YaneuraOu going onto a bad square with its king because it missed some deep tactic. as this example shows, if it is then given the opportunity to fix its earlier mistake (because the opponent didn't see the win and repeated instead), YaneuraOu will choose to blunder again.

so what's wrong? I believe the problem is here:

yaneurao wrote:

If a superior position is encountered during the search, the search can be terminated there. [...]

there may be other moves that are more beneficial, but at least we can say that [...] if this move was generated during the search, it should be eliminated.

note the part "if this move was generated during the search". The move P'2d that created the opportunity for the superior-position-repetition was played on the board by sente (and of course P2c+ instead of S'2c was a terrible blunder). it was not generated during YaneuraOu's search; it was played on the board.

Btw, you mention that this technique was first introduced with bonanza. I gave bonanza this position and bonanza doesn't blunder like YaneuraOu, but instead moves the king to a safe square.

peanatsu commented 1 year ago

Here is another example:

sfen kl7/1l7/Pn7/S8/9/9/9/7PP/6NLK b SP2r2b4g2s2nl14p 1

lishogi link: https://lishogi.org/analysis/standard/kl7/1l7/Pn7/S8/9/9/9/7PP/6NLK_b_SP2r2b4g2s2nl14p_1

sente to play: [image: lishogi]

In that position sente has 2 ways to checkmate: S'9b, or sacrificing and redropping a pawn first: P9b+ Kx9b P'9c K9a S'9b

The second checkmate involves a superior-position-repetition (which, importantly, is not against the rules of shogi), and therefore YaneuraOu fails to find mate once you sacrifice your pawn with P9b+

position sfen kl7/1l7/Pn7/S8/9/9/9/7PP/6NLK b SP2r2b4g2s2nl14p 1 moves 9c9b+ 9a9b
go nodes 1000000

returns a score of -5000 (completely lost), even though sente has a simple mate in 3 moves. and of course the best move given by YaneuraOu is some random losing move, not the correct mating move P'9c

WandererXII commented 1 year ago

Thank you for your detailed explanation, yaneurao.

Removing both of these else clauses fixes my problem and also fixes the issues peanatsu found, as far as I tested.

https://github.com/yaneurao/YaneuraOu/blob/3996216fc99d68ec3cab1e0b0cbeaa1a3a08e0ed/source/position.cpp#L2053

https://github.com/yaneurao/YaneuraOu/blob/3996216fc99d68ec3cab1e0b0cbeaa1a3a08e0ed/source/position.cpp#L2107

I guess the bug that peanatsu found needs fixing regardless, but I would also like to suggest an option to disable this behavior completely, since it makes it simpler for GUIs communicating with the engines. Thanks for your time!

WandererXII commented 1 year ago

Also wanted to ask about this behavior, which is not necessarily related to this, but the solution might be: [image]

position sfen 3gk2nl/6gb1/2ns2spp/2ppppp2/7PL/3PPPP1P/2+l1GSB2/3+n2KR1/1+r5N1 w Pgsl4p 96 moves 8i8h 3h2g 8h8i

Returns:

info depth 1 seldepth 1 score cp -5530 nodes 245 nps 245000 time 1 pv P*8c 8i8c
info depth 2 seldepth 2 score cp -1 nodes 754 nps 377000 time 2 pv 2g3h
...
info depth 54 seldepth 2 score cp -1 nodes 134414 nps 1461021 time 92 pv 2g3h

The position is still not repeated with 8h8i, but I suppose the engine is trying to avoid the repeating move and marks it as such. This is fine for playing, but not for analyzing.

Stockfish has 'UCI_AnalyseMode':

I was thinking it would be a nice solution to have 'USI_AnalyseMode' in YaneuraOu, which would disable the superior position 31111 behavior and detect repetition only later, when it's actually repetition.

yaneurao commented 1 year ago

Thank you for the additional supplemental information and verification.

I promise to do something about this issue when I next improve the search section (planned for around October of this year).