mcostalba / chess_db

GNU General Public License v3.0

JSON header output now working with parser book <pgn file> full <head… #15

Closed sshivaji closed 7 years ago

sshivaji commented 7 years ago

…er_filename.json>. Will update docs later

This adds header output for PGN files, which is very useful for client programs. A client can now combine a position search with header criteria such as player is "Carlsen" and elo > 2700. It is up to the client program to consume the JSON output, for example by loading it into a database.
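As a rough illustration of what such a client could do, here is a minimal Python sketch that loads header records into an in-memory SQLite table and runs the "Carlsen and elo > 2700" query. The JSON layout (a list of objects with `offset`, `White`, `Black`, `WhiteElo`, `BlackElo` keys), the sample values, and the helper names are all assumptions made for the example; the actual output of the parser may be shaped differently.

```python
import json
import sqlite3

# Invented sample data. The key names and the list-of-objects layout are
# assumptions; the JSON actually written by the parser may differ.
SAMPLE_JSON = """
[
  {"offset": 0,    "White": "Carlsen, Magnus", "Black": "Player A",
   "WhiteElo": "2850", "BlackElo": "2700"},
  {"offset": 1520, "White": "Player B", "Black": "Carlsen, Magnus",
   "WhiteElo": "2310", "BlackElo": "2650"},
  {"offset": 3011, "White": "Player C", "Black": "Player D",
   "WhiteElo": "?", "BlackElo": "2741"},
  {"offset": 4400, "White": "Player E", "Black": "Carlsen, Magnus",
   "WhiteElo": "2720", "BlackElo": "2800"}
]
"""


def to_int(value):
    """Return an int Elo, or None for missing or non-numeric values like '?'."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def load_headers(conn, records):
    """Create the headers table and insert one row per game."""
    conn.execute(
        "CREATE TABLE headers ("
        "game_offset INTEGER PRIMARY KEY, white TEXT, black TEXT, "
        "white_elo INTEGER, black_elo INTEGER)"
    )
    conn.executemany(
        "INSERT INTO headers VALUES (?, ?, ?, ?, ?)",
        [
            (r["offset"], r.get("White"), r.get("Black"),
             to_int(r.get("WhiteElo")), to_int(r.get("BlackElo")))
            for r in records
        ],
    )


def games_by_player(conn, name_prefix, min_elo):
    """Offsets of games where the named player is rated above min_elo."""
    pattern = name_prefix + "%"
    rows = conn.execute(
        "SELECT game_offset FROM headers "
        "WHERE (white LIKE ? AND white_elo > ?) "
        "   OR (black LIKE ? AND black_elo > ?) "
        "ORDER BY game_offset",
        (pattern, min_elo, pattern, min_elo),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
load_headers(conn, json.loads(SAMPLE_JSON))
carlsen_games = games_by_player(conn, "Carlsen", 2700)
print(carlsen_games)
```

Rows with an unknown Elo are stored as NULL, so they never satisfy the rating comparison.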

sshivaji commented 7 years ago

FYI, in the past I queried the position database first and then ran a query against the headers to refine the search.
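That two-step flow could look roughly like the following Python sketch. It assumes the position search yields a set of game offsets and that the headers are available as a dictionary keyed by the same offsets; both structures, the sample values, and the function names are hypothetical and only illustrate the idea.

```python
# Hypothetical result of a position search: offsets of matching games.
position_hits = {0, 1520, 4400}

# Hypothetical header records keyed by game offset (invented values).
headers_by_offset = {
    0:    {"White": "Carlsen, Magnus", "WhiteElo": 2850,
           "Black": "Player A", "BlackElo": 2700},
    1520: {"White": "Player B", "WhiteElo": 2310,
           "Black": "Carlsen, Magnus", "BlackElo": 2650},
    3011: {"White": "Carlsen, Magnus", "WhiteElo": 2800,
           "Black": "Player C", "BlackElo": 2741},
}


def carlsen_above_2700(header):
    """True if Carlsen played the game with a rating above 2700."""
    for name_key, elo_key in (("White", "WhiteElo"), ("Black", "BlackElo")):
        name = header.get(name_key, "")
        elo = header.get(elo_key)
        if name.startswith("Carlsen") and elo is not None and elo > 2700:
            return True
    return False


def refine(hits, headers, predicate):
    """Keep only position hits whose header record satisfies the predicate."""
    return [
        offset for offset in sorted(hits)
        if offset in headers and predicate(headers[offset])
    ]


refined = refine(position_hits, headers_by_offset, carlsen_above_2700)
print(refined)
```

Games without a header record (offset 4400 here) are dropped rather than guessed at, and games that match the header filter but not the position (offset 3011) never enter the result.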

mcostalba commented 7 years ago

Thanks, but I prefer to keep the parser simple and not filter on tags. I think we have to choose between moves and header info. This parser is really not meant to deal with such filtering.

sshivaji commented 7 years ago

However, the parser will NOT filter on tags. The parser will only output the tags for other programs to consume. Do you think that makes sense? That was my intention (not to allow the parser to filter at all).

mcostalba commented 7 years ago

I think we should focus on moves, patterns, etc.

There are already very good solutions for text filtering, and we cannot add much value on top of them, so it is better to avoid this and keep our code simple and focused.

sshivaji commented 7 years ago

I think your solution is the fastest way to generate the header output, which is why I feel it's a good addition. On the 2.2M-game database, this took just 20 seconds (on top of generating the game indexes)!

Other solutions, including my own, handle text filtering well, but generating the source data takes much longer. In addition, most other PGN parsers break on bad data, whereas your parser is now quite robust!

I agree that the find function should NOT do text header filtering in this project. My only request is to keep the code that generates the header data, because that data is quite difficult to generate with other solutions.