genyrosk / gym-chess

A simple chess environment for openai/gym
MIT License

gym-chess ♟️

A simple chess environment for gym. It computes all available moves, including castling, pawn promotions and 3-fold repetition draws.

Setup

Install the module:


pip install -e .

Environments

There are 3 environments available: v0, v1 and v2. The original v0 version contains legacy code and is no longer supported, so it's recommended to use v1 or v2.

Both v1 and v2 share the same basic API, so they can be used interchangeably in most scenarios. The v1 version is implemented in pure Python, while v2 has its core logic implemented in Rust and is over 100 times faster. If speed matters, v2 is the way to go.

Usage

You can import the Python classes directly, or create pre-defined environments with gym:


import gym
from gym_chess import ChessEnvV1, ChessEnvV2

env1 = ChessEnvV1()
env2 = ChessEnvV2()

env1 = gym.make('ChessVsSelf-v1')
env2 = gym.make('ChessVsSelf-v2')

You can also play against a random bot by setting the opponent explicitly:


from gym_chess import ChessEnvV1

env = ChessEnvV1(opponent="random")

Play

Moves are pre-calculated for the current state and can be accessed from the environment. You can also access them in the form of actions from the environment action space.

Once you have chosen a move, make sure to convert it into an action (or select an action directly) and pass it to the environment to get the next state.


import random
from gym_chess import ChessEnvV1

env = ChessEnvV1() # or ChessEnvV2

# current state
state = env.state

# select a move and convert it into an action
moves = env.possible_moves
move = random.choice(moves)
action = env.move_to_actions(move)

# or select an action directly
actions = env.possible_actions
action = random.choice(actions)

# pass it to the env and get the next state
new_state, reward, done, info = env.step(action)

Reset the environment:


initial_state = env.reset()
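
Putting the steps above together, a full random-play episode might look like the sketch below. It relies only on the members shown in this README (possible_actions, step and reset). So that it runs without the package installed, it uses a hypothetical stand-in class, CountdownEnv, which is not part of gym-chess; pass ChessEnvV1() or ChessEnvV2() instead for real play. The max_steps cap and the seed parameter are also assumptions, added to guarantee termination and reproducibility.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in with the same surface as the chess envs.

    NOT part of gym-chess: the episode simply ends after `length` steps.
    """

    def __init__(self, length=5):
        self.length = length
        self.reset()

    def reset(self):
        self.remaining = self.length
        self.possible_actions = [0, 1, 2]
        return self.remaining

    def step(self, action):
        assert action in self.possible_actions
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, 0.0, done, {}


def play_random_episode(env, max_steps=500, seed=None):
    """Play random actions until the episode ends or max_steps is reached."""
    rng = random.Random(seed)
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        if not env.possible_actions:
            break  # nothing left to play
        action = rng.choice(env.possible_actions)
        _, reward, done, _ = env.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward, done


steps, total_reward, done = play_random_episode(CountdownEnv(length=5), seed=0)
print(steps, total_reward, done)
```

Capping the number of steps is a deliberate choice: random play can run for a long time, so a hard limit keeps the loop bounded.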

Visualise the chess board and moves

Visualise the current state of the chess game:


env.render()
    -------------------------
 8 |  ♖  ♘  ♗  ♕  ♔  ♗  ♘  ♖ |
 7 |  ♙  ♙  ♙  ♙  ♙  ♙  ♙  ♙ |
 6 |  .  .  .  .  .  .  .  . |
 5 |  .  .  .  .  .  .  .  . |
 4 |  .  .  .  .  .  .  .  . |
 3 |  .  .  .  .  .  .  .  . |
 2 |  ♟  ♟  ♟  ♟  ♟  ♟  ♟  ♟ |
 1 |  ♜  ♞  ♝  ♛  ♚  ♝  ♞  ♜ |
    -------------------------
      a  b  c  d  e  f  g  h

You can also visualise multiple moves:


moves = env.possible_moves
env.render_moves(moves[10:12] + moves[16:18])

API

Initialize environment

ChessEnvV1(player_color="WHITE", opponent="random", log=True, initial_state=DEFAULT_BOARD)

env.get_possible_moves(state=state, player="WHITE", attack=False)

This method will calculate the possible moves. By default they are calculated at the current state for the current player (state.current_player).

Move specification:

Moves are encoded as either:

Moves are pre-calculated for every new state and stored in possible_moves.

State and differences between v1 and v2

v1 and v2 share most of the API, but their internals are a little different.

For instance, v1 stores the board matrix directly in the state as env.state, while in v2 the state is a dictionary in which the board can be accessed with env.state['board'].

>>> print(env.state) # v1
>>> print(env.state['board']) # v2
[[-3, -5, -4, -2, -1, -4, -5, -3],
 [-6, -6, -6, -6, -6, -6, -6, -6],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [6, 6, 6, 6, 6, 6, 6, 6],
 [3, 5, 4, 2, 1, 4, 5, 3]]

Every integer represents a piece. Positive pieces are white and negative ones are black.
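
Code that should work with both versions can hide this difference behind a small helper. The sketch below is illustrative only: get_board and count_pieces are hypothetical names, not part of gym-chess, and the START matrix is copied from the output above so that the example runs without the package installed.

```python
def get_board(state):
    """Return the board matrix from a v1-style or v2-style state."""
    # v2: the state is a dict holding the matrix under 'board'.
    # v1: the state is the matrix itself.
    return state["board"] if isinstance(state, dict) else state


def count_pieces(board):
    """Count white (positive) and black (negative) pieces on the board."""
    white = sum(1 for row in board for value in row if value > 0)
    black = sum(1 for row in board for value in row if value < 0)
    return white, black


# Starting position, copied from the printed state above.
START = [
    [-3, -5, -4, -2, -1, -4, -5, -3],
    [-6] * 8,
    [0] * 8,
    [0] * 8,
    [0] * 8,
    [0] * 8,
    [6] * 8,
    [3, 5, 4, 2, 1, 4, 5, 3],
]

print(count_pieces(get_board(START)))             # v1-style state
print(count_pieces(get_board({"board": START})))  # v2-style state
```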

Piece IDs are stored in constants that can be imported.

from gym_chess.envs.chess_v1 import (
    KING_ID,
    QUEEN_ID,
    ROOK_ID,
    BISHOP_ID,
    KNIGHT_ID,
    PAWN_ID,
)

The schema is:

EMPTY_SQUARE_ID = 0
KING_ID = 1
QUEEN_ID = 2
ROOK_ID = 3
BISHOP_ID = 4
KNIGHT_ID = 5
PAWN_ID = 6
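
As an illustration of the schema, the sketch below turns a board matrix into a compact text diagram. The IDs are hard-coded from the list above so that the example runs without the package installed. The letter mapping (uppercase for white, lowercase for black, N for knight) is an assumption borrowed from common chess notation, and piece_symbol and board_to_text are hypothetical helper names, not part of gym-chess.

```python
# Piece IDs as listed in the schema above; letters are an assumed convention.
PIECE_LETTERS = {1: "K", 2: "Q", 3: "R", 4: "B", 5: "N", 6: "P"}


def piece_symbol(value):
    """Map a signed piece ID to a letter: uppercase white, lowercase black."""
    if value == 0:
        return "."  # empty square
    letter = PIECE_LETTERS[abs(value)]
    return letter if value > 0 else letter.lower()


def board_to_text(board):
    """Render the board matrix row by row, in the order it is stored."""
    return "\n".join(" ".join(piece_symbol(v) for v in row) for row in board)


START = [
    [-3, -5, -4, -2, -1, -4, -5, -3],
    [-6] * 8,
    [0] * 8,
    [0] * 8,
    [0] * 8,
    [0] * 8,
    [6] * 8,
    [3, 5, 4, 2, 1, 4, 5, 3],
]

print(board_to_text(START))
```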

Additional information can be found in other attributes of the environment:

env.current_player
env.white_king_castle_possible
env.white_queen_castle_possible
env.black_king_castle_possible
env.black_queen_castle_possible
env.white_king_on_the_board
env.black_king_on_the_board

Examples

Examples can be found in gym_chess/example. The v1 examples are valid for both the v1 and v2 environments.

Testing

Run all the tests with pytest.

Code linting and fixing

Python code is formatted with black, using a maximum line width of 100 characters. Run the command black -l 100 . from the repository root (the trailing dot is the target directory). No config is needed.

Rust code is formatted with cargo fmt.

Building the Rust code

The v2 environment uses a chess engine implemented in Rust that uses PyO3 to bind to the Python interpreter. Rust is an amazing compiled language and this project holds 2 configurations:

Note: we haven't found a way to tell either process which Cargo TOML file to use, so copy the contents of the configuration you want into Cargo.toml to make it work.

Notes:

En-passant moves are not currently supported in the V1 environment.

Benchmarks

In the benchmark below, the v2 environment computes the possible moves over 100 times faster than the v1 environment (about 240 µs versus 29.5 ms per call). However, since most of the code is written in Rust, it's generally harder to debug.


from gym_chess import ChessEnvV1, ChessEnvV2

env_v1 = ChessEnvV1()
env_v2 = ChessEnvV2()

# v1: written in Python
>>> %timeit -n 50 -r 8 env_v1.get_possible_moves()
## 29.5 ms ± 872 µs per loop (mean ± std. dev. of 8 runs, 50 loops each)

# v2: compiled in Rust
>>> %timeit -n 50 -r 8 env_v2.get_possible_moves()
## 240 µs ± 31.9 µs per loop (mean ± std. dev. of 8 runs, 50 loops each)
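
As a quick check on the "over 100 times" figure, the ratio can be computed from the two means reported above. The sketch below also includes time_call, a hypothetical helper (not part of gym-chess) built on the standard-library timeit module, for reproducing a comparable measurement outside IPython; with the package installed you could pass it env_v1.get_possible_moves or env_v2.get_possible_moves.

```python
import statistics
import timeit

# Mean times per loop, taken from the benchmark output above.
v1_seconds = 29.5e-3  # 29.5 ms (v1, pure Python)
v2_seconds = 240e-6   # 240 µs (v2, Rust core)

speedup = v1_seconds / v2_seconds
print(f"v2 is about {speedup:.0f}x faster than v1")


def time_call(fn, number=50, repeat=8):
    """Return (mean, stdev) of seconds per call, similar to %timeit -n -r."""
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


# Demo with a cheap stand-in workload instead of a chess environment.
mean, stdev = time_call(lambda: sum(range(1000)))
print(f"{mean * 1e6:.1f} µs ± {stdev * 1e6:.1f} µs per call")
```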