Load (human) chess play data

sotetsuk / pgx

♟️ Vectorized RL game environments in JAX

http://sotets.uk/pgx/

Apache License 2.0

Load (human) chess play data #1165

Open NZ99 opened 6 months ago

NZ99 commented 6 months ago

Hi, first of all thank you very much for your work on Pgx!

I'm opening this issue because I hope to use Pgx's chess environment for a project on scaling laws in RL. As part of this project, I would like to load large-scale supervised learning data (obtained from various sources, including Lichess, CCRL and a few others) for supervised training of an AlphaZero-like model, which will later continue training through AlphaZero-like self-play as you implemented in one of the examples.

I was wondering whether you have thoughts/advice/help when it comes to making that data (all in the form of PGN files) available for use with Pgx. Within the chess environment there are _from_fen and _to_fen, but according to this issue these are neither supposed to be exposed to the user nor well-tested. Further, the issue says that there are no plans for such a feature at the moment. For Go as far as I can tell there was interest in having such an option.

How would you recommend a user make an existing PGN file usable for supervised learning in a format compatible with Pgx? If there are no plans for such a feature, I would really appreciate any help, advice, general tips or resources. Thanks!

NZ99 commented 5 months ago

Something like OpenSpiel ParseSANMove would be really helpful.
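To illustrate what parsing a SAN move involves, here is a minimal stdlib-only sketch of the syntactic half: splitting a SAN token such as `Nbd7` or `e8=Q+` into its parts. The names `SAN_RE` and `parse_san` are hypothetical and are not part of Pgx or OpenSpiel. Resolving the parsed fields to a concrete move still requires legal-move generation for the current position, which this sketch does not attempt.

```python
import re
from typing import Optional

# Hypothetical helper, not part of Pgx or OpenSpiel.
# Matches either castling or a regular move with optional disambiguation,
# capture marker, promotion, and check/mate suffix.
SAN_RE = re.compile(
    r"(?:(?P<castle>O-O(?:-O)?)"
    r"|(?P<piece>[KQRBN])?(?P<from_file>[a-h])?(?P<from_rank>[1-8])?"
    r"(?P<capture>x)?(?P<to>[a-h][1-8])(?:=(?P<promotion>[QRBN]))?)"
    r"(?P<suffix>[+#])?"
)


def parse_san(san: str) -> Optional[dict]:
    """Split one SAN token into its fields, or return None if malformed."""
    match = SAN_RE.fullmatch(san.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # No piece letter means a pawn move; castling is recorded as a king move.
    fields["piece"] = fields["piece"] or ("K" if fields["castle"] else "P")
    fields["capture"] = fields["capture"] is not None
    return fields


if __name__ == "__main__":
    for token in ["e4", "Nbd7", "exd5", "e8=Q+", "O-O-O"]:
        print(token, parse_san(token))
```

The fields only describe what the token says. For example, `Nbd7` says "a knight from the b-file goes to d7", and the caller must still find the one legal knight move that fits.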

sotetsuk commented 5 months ago

Hi, sorry for the late response and thank you for the request!

Given your request, we added from_fen/to_fen to the public (but experimental) API. For now, you can use them like this:

from pgx.experimental.chess import from_fen, to_fen

state = from_fen("k7/8/8/8/2N5/8/P7/7K w - - 0 1")
print(to_fen(state))

Note that they are still subject to (large) changes. Please let me know if you find any bug or problem. Currently, we only support the FEN format. Is that enough for your use case? We suppose you can convert other formats to FEN with some other libraries (but we are not sure).

You can use them by installing pgx from the latest main branch (2.1.0-rc0), over either SSH or HTTPS:

pip install git+ssh://git@github.com/sotetsuk/pgx.git

pip install git+https://github.com/sotetsuk/pgx.git
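For readers unfamiliar with FEN, the string passed to from_fen in the example above has six space-separated fields: piece placement (rank 8 first), side to move, castling rights, en passant target square, halfmove clock, and fullmove number. The following stdlib-only sketch unpacks those fields. It does not use Pgx, and the `parse_fen` name is hypothetical.

```python
from typing import Dict, List


def parse_fen(fen: str) -> Dict[str, object]:
    """Split a FEN string into its six fields and expand the board rows."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, side, castling, en_passant, halfmove, fullmove = fields

    rows: List[str] = []
    for rank in placement.split("/"):
        # A digit n stands for n consecutive empty squares, written here as '.'.
        row = "".join("." * int(ch) if ch.isdigit() else ch for ch in rank)
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        rows.append(row)
    if len(rows) != 8:
        raise ValueError(f"expected 8 ranks, got {len(rows)}")

    return {
        "board": rows,  # rows[0] is rank 8, rows[7] is rank 1
        "side_to_move": side,
        "castling": castling,
        "en_passant": en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


if __name__ == "__main__":
    info = parse_fen("k7/8/8/8/2N5/8/P7/7K w - - 0 1")
    print("\n".join(info["board"]))
    print(info["side_to_move"], info["castling"], info["en_passant"])
```

Uppercase letters are White's pieces and lowercase letters are Black's, so the example position has a black king on a8 and a white knight, pawn and king on c4, a2 and h1.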

NZ99 commented 5 months ago

Thank you, this is a really useful addition. One aspect that is still missing (please correct me if I'm wrong) though is an efficient way to also parse the moves themselves, like OpenSpiel's ParseSANMove. Of course these can be obtained by e.g. checking all potential moves and checking which ones match the FEN of the successive state, but it would be helpful to have a dedicated method for that.
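The brute-force workaround described above (try every legal move and keep the one whose successor matches the next FEN) can be sketched generically. This is a minimal sketch, not Pgx API: `find_action` and `placement` are hypothetical names, and the environment is passed in as three callables. As an assumption, it compares only the piece-placement field, to sidestep differences in how tools write the en passant and clock fields.

```python
from typing import Callable, Iterable, Optional, TypeVar

S = TypeVar("S")  # opaque environment state type


def placement(fen: str) -> str:
    """Return the piece-placement field (the first field) of a FEN string."""
    return fen.split()[0]


def find_action(
    state: S,
    next_fen: str,
    legal_actions: Callable[[S], Iterable[int]],
    step: Callable[[S, int], S],
    to_fen: Callable[[S], str],
) -> Optional[int]:
    """Return the legal action whose successor matches next_fen, else None.

    Assumption: comparing piece placement alone is enough to tell the
    successors of one position apart.
    """
    target = placement(next_fen)
    for action in legal_actions(state):
        if placement(to_fen(step(state, action))) == target:
            return action
    return None


if __name__ == "__main__":
    # Toy stand-in environment: the state is an int and actions are added to it.
    toy_legal = lambda s: [1, 2, 3]
    toy_step = lambda s, a: s + a
    toy_fen = lambda s: f"{s}/8 w - - 0 1"
    print(find_action(0, "2/8 b - - 0 1", toy_legal, toy_step, toy_fen))
```

With Pgx, the callables could plausibly be wired as env.step for `step`, to_fen from pgx.experimental.chess for `to_fen`, and the indices of the True entries of state.legal_action_mask for `legal_actions`. That wiring is an assumption and is untested here. The search costs one environment step per legal move at every position, so it is a stopgap and not the dedicated parser requested.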

sotetsuk commented 5 months ago

Hi! Thank you for your comment! Yeah I agree that it's useful but so far we don't have a plan to implement it.

I'll keep this issue open, and contributions are welcome. Anyone who wants this feature may add it to the experimental API, like from_fen and to_fen.