mcostalba / scoutfish

Chess Query Engine
GNU General Public License v3.0
156 stars 23 forks source link

Inconsistent ply returnd by scout #33

Closed gbtami closed 7 years ago

gbtami commented 7 years ago

I tested scoutfish to filter games after 1.e4 using two different scoutfish rules. I find that "sub-fen" rule gives correct plies:

tamas@tami:~/pychess/scoutfish$ src/scoutfish scout 'pgn/famous_games.scout { "sub-fen": "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR" }'

{
    "moves": 28873,
    "match count": 229,
    "moves/second": 4812166,
    "processing time (ms)": 6,
    "matches":
    [
        { "ofs": 666, "ply": [1] }, 
        { "ofs": 2008, "ply": [1] }, 
        { "ofs": 3313, "ply": [1] }, 

but "white-move" rule gives incorrect plies (ply - 1):

tamas@tami:~/pychess/scoutfish$ src/scoutfish scout 'pgn/famous_games.scout { "white-move": "e4" }'

{
    "moves": 28873,
    "match count": 390,
    "moves/second": 7218250,
    "processing time (ms)": 4,
    "matches":
    [
        { "ofs": 0, "ply": [10] }, 
        { "ofs": 666, "ply": [0] }, 
        { "ofs": 2008, "ply": [0] }, 
        { "ofs": 3313, "ply": [0] }, 
mcostalba commented 7 years ago

This is the expected behavior. From the readme:

the ply number: this is the number of (half) moves before reaching the first position in the game that satisfies the given condition.

gbtami commented 7 years ago

If it's easier to emit ply before matching a condition I'm fine with it, but this is not true for "sub-fen" rule now.

mcostalba commented 7 years ago

@gbtami why not?

After e4, ply is 1 as it should be.

Ply is 0 for starting position:

{'sub-fen': 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR'}

gbtami commented 7 years ago

In PyChess when I filter game list with a scout query I want to show the first matching position on the game preview board. What ply should I use for queries containing complex rules mixed with not "sub-fen" and "sub-fen" rules ?

Btw. what is the idea behind giving back the prev ply (not the matching) for other rules?

mcostalba commented 7 years ago

@gbtami I suggest you to be consistent with this definition of ply match. It is the only one that handles in an uniform way all the rules and that properly considers first position of the game (that not always is starting position, for instance can be a given FEN).

gbtami commented 7 years ago

@mcostalba I think you don't understand what I want to say. I know my english is not perfect :)

For a user perspective 1.e4 (as "white-move") and rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR (as "sub-fen") are both exactly ply 1. When I search a database for something I like to see positions exactly matching my query not the positions before matching it. Now "sub-fen" gives back the ply when the match is happening and it's correct. I just like other rules work this way too if it's possible.

gbtami commented 7 years ago

After a second look the sum of current behavior is the following. Regarding what ply scouting gives back search conditions can fall into 3 categories.

  1. ply=0 returned: "result"
  2. matched ply returned: "sub-fen", "material", "imbalance", "stm"
  3. matched ply - 1 returned: "moved", "captured", "white-move", "black-move"

To be consistent I suggest to return the latest ply in case 1. and the matched ply in case 3. I don't think anyone using a GUI sometimes want to see the matching position for example in "material" searches but the positions before matching for example in "captured" searches.

gbtami commented 7 years ago

Created fix in pull request #37

mcostalba commented 7 years ago

@gbtami sorry to be pedantic, but current definition is:

the ply number: this is the number of (half) moves before reaching the first position in the game that satisfies the given condition.

This is a single, uniform and consistent definition. If you find something that does not conform to the above please report to me, otherwise this is the definition. Regarding 'move' rules, the returned ply is the position before the move is done, not after and this is the correct way. For instance if I search for { "white-move": "e4" } I want to get a ply corresponding to a position where the side to move is white, not black. So I want to get ply = 0 (the starting position), not ply = 1, because when ply = 1 is black to move.

gbtami commented 7 years ago

In my logic 1.e4 and rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR represents the same state of the game. If I search for them I like to get the same (ply = 1) on both cases.

Plies returned by "sub-fen", "material", "imbalance", "stm" does not conform to the current definition now. If you change "(half) moves before reaching the first" to "(half) moves when reaching the first" the proposed patch fixes this.

Let me explain it from a GUI perspective. In a GUI there is a game list part with header data with columns like white, black, event, site, date, result, etc. then a preview board and an opening tree list with possible moves of position shown in the board. Say we are in the starting position. Game list is not filtered. Now when a user clicks on e4 in opening tree list the board shows the white pawn moved to e4 and the game list is filtered showing games starting with 1.e4 only. Same case if the user first not clicking in opening tree but clicking on 1.e4 in the game text below the board. And finally I like the same if he will create a {"white-move": "e4"} in the filter dialog.

mcostalba commented 7 years ago

@gbtami ok, let's reopen this for now.

mcostalba commented 7 years ago

@gbtami I think it is enough to add 1 to retuned ply. This should work for you in all cases.

gbtami commented 7 years ago

This is exactly what the proposed patch is doing in move related rules. Doing the same (decide whether the returned ply is move related or not) in Python GUI code causing much more unnecessary complexity.

mcostalba commented 7 years ago

@gbtami no, I mean to add 1 always: to make ply count starting from 1

gbtami commented 7 years ago

This will not solve the issue. If we add 1 always, plies returned by move related rules will be ok but all other (sub-fen/material/imbalance/etc.) will be wrong (pointing to the next position after the rule satisfied.

mcostalba commented 7 years ago

I don'think so. Can you do a specific example please?

gbtami commented 7 years ago

You can try it interactively using PyChess git version. Two screen shots how it's working now o-o-o_as_sub-fen o-o-o_as_white-move

If I add +1 to ply always the second ("white-move") will show the correct position when O-O-O is happning, but firs will show the position on next move happened after O-O-O.

mcostalba commented 7 years ago

Yes, this is exactly what I'd expect!!

The second one is ok, the first one is ok too because you are not looking for a O-O-O, but for a specific sub-fen and the tool points you exactly there: at the specific sub-fen you were looking for.

So IMO it is correct and predicatble now.

gbtami commented 7 years ago

I agree that the first is OK. But in the second one I expect the position where O-O-O is happened. Not the previous one! Similar but maybe better example is "moved" moved If you are searching the very first position where a King moved do you expect to see the board where a Bishop moved as above? If your answer is yes, then we can close this issue as seems we simply think different and I will never convince you.

To summarize my arguments as a chess player when I search for "something" in a .pgn I want to see positions where that "something" is happening, not positions before it will happen. In above "moved" example I want to know what was the very first move of a king. Now I have to do an extra click on "Next" button to see what I'm looking for.

mcostalba commented 7 years ago

Ok, I understand your point. OTH this per-rule ply definition is really a can of worms (for instance I think your code is broken with streaks). It would be much easier and consistent to simply rename the rules:

captured -> next-capture
moved    -> next-move

This avoid to introduce a really complex code that will break in subtle ways because it is deeply wrong: we assign two different meanings to the same variable, in this case plies. This is bad practice and will fail in the long term.

Eventually you can take care of this in the GUI, where you don't have to deal with searching and scouting a DB, but to print the results. Also for a GUI I'd prefer ply definition sent to GUI to remain consistent and simple. To further highlight this point I am tempted to leave the ply to start from 0, so that it is clear that we are returning a computer-oriented ply, and it is up to the GUI to handle it in the way that better adapts to the user context.

gbtami commented 7 years ago

I think all chess player and programmer uses the word "ply" in the same way as https://chessprogramming.wikispaces.com/Ply describes it. "The word Ply denotes a half-move, that is a move of one side only. When we speak of a "6 ply search", we mean three full moves - something like 1. e2e4 e7e5 2. g1f3 b8c6 3. b1c3 g8f6." In this meaning e4 is the first ply move , e5 is the second ply move, etc. and in starting position where no (half) move was taken yet ply is 0. The PGN standard uses "ply" in same meaning and every chess program saves tag pair [PlyCount "6"] for this game snippet. I think this was never the question in this issue.

"we assign two different meanings to the same variable, in this case plies. This is bad practice and will fail in the long term." No. I'm not suggesting anything like this at all!

The discussion is going only about what semantic of some particular rule we like to see. Rules like "sub-fen", "material", "imbalance", "stm" seems have no problems. Take this 4 ply long Scandinavian .pgn [Event "?"] [Site "?"] [Date "?"] [Round "?"] [White "?"] [Black "?"] [Result ""] [PlyCount "4"] 1.e4 d5 2.exd5 Qxd5

We agree that if I'm looking for accepted Scandinavian games, query {"sub-fen": "rnbqkbnr/ppp1pppp/8/3P4/8/8/PPPP1PPP/RNBQKBNR"} have to return ply=3.

But the semantic of {"white-move": "exd5"} and {"captured": "P"} are debatable. You say you like to see "white-move": "exd5" in the meaning "white next move from this position(ply=2) was exd5". I say I like to see "white-move": "exd5" in the meaning "white move led to this position(ply=3) was exd5".

So no "assign two different meanings to the same variable". We just like to see different semantic and the correspondent ply to move related rules. My arguments to prefer my version are a) in above game the third ply move was exd5 b) when I'm looking for exd5 I like to see white pawn on d5.

If you want to preserve rules with your semantic I'm Ok with it, but I suggest to rename these rules to something like "white-next-move", "black-next-move", "next-moved", "next-captured" and the plies returned by them can be exactly the same as they are now. At the same time I suggest to add my above explained semantic to rules "white-move", "black-move", "captured", "moved" giving back ply values corresponding to ply values as we count them in .pgn

gbtami commented 7 years ago

Being curious I'v searched the net how this works in Chessbase. Look at 4:50 in https://www.youtube.com/watch?v=JzNQA33eXg0

mcostalba commented 7 years ago

Fixed.