mcostalba / chess_db

GNU General Public License v3.0
22 stars 5 forks source link

Pawn Structure and Material Search #13

Closed sshivaji closed 7 years ago

sshivaji commented 7 years ago

It is quite common for tournament players to look for games with the same pawn structure. An example of this is IQP positions (Isolated Queen Pawn). The positions can come from many openings (e.g. Panov Botvinnik attack in the Caro Kann, Nimzo Indian, Queens Gambit Accepted, Queens Gambit Declined, c3 Sicilian, French Defense exchange variation just to name a few).

Thus position search (which is now well supported) wont generally work. An example input is white pawns on d4, a2, b3, f2, g2, h2. Is it possible to find all matching games with a polyglot key search?

Another variant is searching for a material imbalance. I will outline a few common cases:

  1. Search for 2 bishops vs bishop and knight imbalance.
  2. Search for endgames of Rook and 2 pawns vs the lone rook.
  3. Search for positions where one side is a pawn up.

Would be really cool to support both conditions (pawn structure and material imbalance).

I am happy to help write code if an efficient algorithm for polyglot probing can be decided. I wonder how feasible this is.

mcostalba commented 7 years ago

We can save material and pawns keys, along positions keys, this will make db bigger but I don't think it matters, db size is no more an issue today, fast and powerful features is more important.

On Saturday, December 3, 2016, Shivkumar Shivaji notifications@github.com wrote:

It is quite common for tournament players to look for games with the same pawn structure. An example of this is IQP positions (Isolated Queen Pawn). The positions can come from many openings (e.g. Panov Botvinnik attack in the Caro Kann, Nimzo Indian, Queens Gambit Accepted, Queens Gambit Declined, c3 Sicilian, French Defense exchange variation just to name a few).

Thus position search (which is now well supported) wont generally work. An example input is white pawns on d4, a2, b3, f2, g2, h2. Is it possible to find all matching games with a polyglot key search?

Another variant is searching for a material imbalance. I will outline a few common cases:

  1. Search for 2 bishops vs bishop and knight imbalance.
  2. Search for endgames of Rook and 2 pawns vs the lone rook.
  3. Search for positions where one side is a pawn up.

Would be really cool to support both conditions (pawn structure and material imbalance).

I am happy to help write code if an efficient algorithm for polyglot probing can be decided. I wonder how feasible this is.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/mcostalba/chess_db/issues/13, or mute the thread https://github.com/notifications/unsubscribe-auth/ABDGAf7g3weErfOSFWnbHwFjFQbdSKdzks5rEUiUgaJpZM4LDQwm .

sshivaji commented 7 years ago

Ic, so your proposal is to build a polyglot position hash with just pawn structure information and material imbalance? I agree that size is a non-issue these days.

Out of curiosity, how much harder is it to support CQL - https://timkr.home.xs4all.nl/chess2/cql.htm. The spec is at http://www.gadycosteff.com/chess_query_language.pdf

Supporting that somehow will essentially support most queries imaginable.

mcostalba commented 7 years ago

@sshivaji if the aim is to support a general and open format chess query then we need something completely different.

CQL is dead stuff from more than 10 years ago, but there is something good in it. Instead of implementing a dead interface maybe makes sense to get the general idea of a query language for chess and to build a system able to support it.

In this view our simple polyglot book is a completely wrong solution. We need something completely different. One thing that comes to my mind is a DB with all the parsed moves of all the games: a long sequence of games, each one represented by a list of moves in Stockfish's Move type (that is a 16bit value).

Then we can design a search tool composed by a selected number of threads, where each thread scouts a subset of the full DB looking for matches. You can speed up search increasing threads number in a multi-threads fashion.

Each threads re-plays the moves of each game and after making a move checks for the conditions to match. In case of a match the thread stores the game offset and the move number with first occurrence of the match, then proceeds with next one.

At the end all the games offsets are printed-out as result of the search.

I can easily modify the parser to output a DB of parsed moves, but I'd prefer to start from scratch with a new project that I will call scoutfish and I will start as a clone of Stockfish, because we need full engine power here, multi-thread included. Then I will import the needed parser code from this project.

If I can suggest you, I'd like you to search for some possible query language suitable for our case. We don't need the exact conditions or chess specific formulas now, just to figure out the main structure and framework of the query interface, e.g. JASON based, SQL based, rule list, condition list, etc.. It should be something already existing and proved that we can easily customize for our case.

mcostalba commented 7 years ago

Here we are: https://github.com/mcostalba/scoutfish

gbtami commented 7 years ago

As I know Rybka Aquarium and Jose are the two supporting sql databases, others like Chessbase, Scid and all its descendants use different native database formats. In PyChess I'v started to add SQL database support https://github.com/pychess/pychess/tree/master/lib/pychess/Database Game table stores all parsed (2 byte) moves in one movelist LargeBinary field. My feeling is that sql way can never be as fast as native formats (just good enough), but much straightforward to use it from Python because native database softwares has no Python bindings at all.

mcostalba commented 7 years ago

I have open an issue https://github.com/mcostalba/scoutfish/issues/1 to discuss about a possible query scheme.