openai / weak-to-strong

Preprocessed Chess Puzzle Data #10

Open y12uc231 opened 8 months ago

y12uc231 commented 8 months ago

Hi,

I was trying to reproduce the results for the chess puzzle dataset, and it seems the original dataset was preprocessed to convert FEN positions into move sequences. But multiple move sequences can reach the same board position. Could you share the preprocessing script or the preprocessed data used in the experiments?

Thanks, Satya

SecDante commented 8 months ago

Thank

y12uc231 commented 8 months ago

Hi all,

Bumping this in case I am missing anything or any other info is needed from my end. I'd appreciate any help with this.

-Satya

WuTheFWasThat commented 8 months ago

I believe the move-sequence data exists somewhere; @pavel-izmailov would know the details.

pavel-izmailov commented 8 months ago

Hey @y12uc231, the original data from Lichess is indeed in FEN notation, but each puzzle is also extracted from a real game. You can find a database of puzzles as a CSV here. Each entry should contain the ID of the game from which the position was extracted. You can then use the Lichess API to fetch the game by its ID and convert it to a move sequence.
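
For anyone trying this, here is a minimal sketch of the steps described above. It is not the preprocessing script used for the experiments. It assumes the puzzle CSV stores the game reference as a URL such as `https://lichess.org/787zsVup/black#48`, where the first path segment starts with an 8-character game ID and the fragment is a ply number, and that `https://lichess.org/game/export/{id}` returns the game as PGN. All function names are hypothetical. Whether the ply in the URL lines up exactly with the puzzle FEN should be checked by replaying the moves (for example with a chess library) and comparing the resulting position.

```python
import re
import urllib.request
from urllib.parse import urlparse


def parse_game_url(game_url):
    """Split a puzzle's game URL into (game_id, ply).

    Assumes the first path segment starts with the 8-character game ID
    and the URL fragment, if present, is the ply number.
    """
    parsed = urlparse(game_url)
    first_segment = parsed.path.strip("/").split("/")[0]
    game_id = first_segment[:8]
    ply = int(parsed.fragment) if parsed.fragment.isdigit() else None
    return game_id, ply


def fetch_pgn(game_id):
    """Download one game as PGN text (network call, assumed endpoint)."""
    url = f"https://lichess.org/game/export/{game_id}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def pgn_to_san_moves(pgn):
    """Reduce PGN text to a flat list of SAN moves.

    Simplified parser: drops tag lines, {comments}, non-nested
    (variations), $NAGs, move numbers, annotations and the result token.
    """
    text = re.sub(r"^\[.*\]\s*$", "", pgn, flags=re.MULTILINE)  # tag pairs
    text = re.sub(r"\{[^}]*\}", " ", text)   # comments such as clock times
    text = re.sub(r"\([^)]*\)", " ", text)   # variations (non-nested only)
    text = re.sub(r"\$\d+", " ", text)       # numeric annotation glyphs
    text = re.sub(r"\d+\.+", " ", text)      # move numbers: "1." and "1..."
    results = {"1-0", "0-1", "1/2-1/2", "*"}
    return [tok.rstrip("?!") for tok in text.split() if tok not in results]


def moves_for_puzzle(game_url):
    """Fetch the source game and return its moves up to the puzzle ply."""
    game_id, ply = parse_game_url(game_url)
    moves = pgn_to_san_moves(fetch_pgn(game_id))
    # Assumption: the first `ply` half-moves lead to the puzzle position.
    return moves if ply is None else moves[:ply]
```

Because the moves come from the actual game, this gives one specific move sequence per puzzle, which sidesteps the ambiguity of many sequences reaching the same FEN.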