gym-tetris


An OpenAI Gym environment for Tetris on the Nintendo Entertainment System (NES), based on the nes-py emulator.

Installation

The preferred way to install gym-tetris is with pip:

pip install gym-tetris

Usage

Python

You must import gym_tetris before making an environment, because gym environments are registered at runtime. By default, gym_tetris environments use the full NES action space of 256 discrete actions. To constrain this, gym_tetris.actions provides an action list called MOVEMENT (20 discrete actions) for use with the nes_py.wrappers.JoypadSpace wrapper. There is also SIMPLE_MOVEMENT, which gives a reduced action space (6 actions). For exact details, see gym_tetris/actions.py.

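from nes_py.wrappers import JoypadSpace
import gym_tetris  # importing registers the Tetris environments with gym
from gym_tetris.actions import MOVEMENT

env = gym_tetris.make('TetrisA-v0')
# restrict the 256-action NES space to the MOVEMENT action list
env = JoypadSpace(env, MOVEMENT)

done = True  # forces a reset on the first iteration
for step in range(5000):
    if done:
        state = env.reset()
    # sample a random action from the constrained action space
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()

env.close()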
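
The 256-action figure mentioned above follows from the NES controller having 8 buttons, each either pressed or released (2^8 = 256). The sketch below illustrates how a list of button combinations can collapse that space into a handful of discrete actions. It is not the nes_py implementation: the bit layout in BUTTON_BITS and the EXAMPLE_ACTIONS list are assumptions made for illustration, so consult nes_py and gym_tetris/actions.py for the real values.

```python
# Assumed button-to-bit layout, for illustration only.
BUTTON_BITS = {
    'A':      0b00000001,
    'B':      0b00000010,
    'select': 0b00000100,
    'start':  0b00001000,
    'up':     0b00010000,
    'down':   0b00100000,
    'left':   0b01000000,
    'right':  0b10000000,
}


def encode(buttons):
    """Combine button names into one controller byte; 'NOOP' presses nothing."""
    byte = 0
    for name in buttons:
        if name != 'NOOP':
            byte |= BUTTON_BITS[name]
    return byte


# Hypothetical reduced action list (not copied from gym_tetris/actions.py).
EXAMPLE_ACTIONS = [['NOOP'], ['A'], ['B'], ['right'], ['left'], ['down']]

# Discrete action index i maps to controller byte ACTION_BYTES[i].
ACTION_BYTES = [encode(action) for action in EXAMPLE_ACTIONS]

# Every subset of the 8 buttons is a distinct raw action.
full_space_size = 2 ** len(BUTTON_BITS)
```

With a list like this, an agent chooses among len(EXAMPLE_ACTIONS) indices instead of 256 raw bytes, which is the kind of reduction JoypadSpace provides.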

NOTE: gym_tetris.make is just an alias to gym.make for convenience.

NOTE: remove calls to render in training code for a nontrivial speedup.
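
One way to follow the rendering advice is to make rendering opt-in. The sketch below wraps the random-agent loop in a hypothetical run_random helper with a render flag that defaults to off. It assumes the same 4-tuple step API used in the example above. StubEnv is a stand-in (not part of gym_tetris) so the sketch runs without the emulator; in practice you would pass the JoypadSpace-wrapped environment instead.

```python
import random


def run_random(env, steps, render=False):
    """Step env with random actions and return the total reward collected."""
    total = 0.0
    done = True  # forces a reset on the first iteration
    for _ in range(steps):
        if done:
            env.reset()
        _, reward, done, _ = env.step(env.action_space.sample())
        total += reward
        if render:  # skipped by default, e.g. during training
            env.render()
    return total


class _StubSpace:
    """Minimal action space exposing only sample()."""

    def sample(self):
        return random.randrange(6)


class StubEnv:
    """Stand-in environment for demonstration; counts render calls."""

    action_space = _StubSpace()

    def __init__(self):
        self.steps = 0
        self.render_calls = 0

    def reset(self):
        return None

    def step(self, action):
        self.steps += 1
        # reward of 1 per step; episode ends every 10 steps
        return None, 1.0, self.steps % 10 == 0, {}

    def render(self):
        self.render_calls += 1
```

Calling run_random(env, 5000) then trains without drawing frames, and run_random(env, 5000, render=True) restores the visual output for debugging.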

Command Line

gym_tetris features a command line interface for playing environments using either the keyboard or uniform random movement.

gym_tetris -e <environment ID> -m <human or random>

Environments

There are two game modes defined in NES Tetris: A-type and B-type. A-type is the standard endurance Tetris game, and B-type is an arcade-style mode where the agent must clear a certain number of lines to win. There are three potential reward streams: (1) the change in score, (2) the change in number of lines cleared, and (3) a penalty for an increase in board height. The table below defines the available environments in terms of the game mode (i.e., A-type or B-type) and the rewards applied.

Environment  Game Mode  reward score  reward lines  penalize height
TetrisA-v0   A-type
TetrisA-v1   A-type
TetrisA-v2   A-type
TetrisA-v3   A-type
TetrisB-v0   B-type
TetrisB-v1   B-type
TetrisB-v2   B-type
TetrisB-v3   B-type
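
To make the three reward streams concrete, here is a minimal sketch of how they could be combined from two consecutive info dictionaries. This is not the library's reward code: the combined_reward function, its flag names, and the equal weighting of the streams are assumptions for illustration. The dictionary keys are the ones documented in the info dictionary section.

```python
def combined_reward(prev_info, info, reward_score=True, reward_lines=False,
                    penalize_height=False):
    """Sum the enabled reward streams between two consecutive steps."""
    reward = 0
    if reward_score:
        # (1) change in score
        reward += info['score'] - prev_info['score']
    if reward_lines:
        # (2) change in number of lines cleared
        reward += info['number_of_lines'] - prev_info['number_of_lines']
    if penalize_height:
        # (3) penalty for an increase in board height only;
        # a decrease in height is not rewarded here
        reward -= max(0, info['board_height'] - prev_info['board_height'])
    return reward
```

Each environment version can be thought of as a fixed choice of these three flags for a given game mode.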

info dictionary

The info dictionary returned by the step method contains the following keys:

Key              Type  Description
current_piece    str   the current piece as a string
number_of_lines  int   the number of cleared lines in [0, 999]
score            int   the current score of the game in [0, 999999]
next_piece       str   the next piece on deck as a string
statistics       dict  the number of tetriminos dispatched (by type)
board_height     int   the height of the board in [0, 20]
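
As a sanity check when consuming info, the table above can be turned into a small validator. The sketch below is a hypothetical helper, not part of gym_tetris. It assumes the values are plain Python types; if the environment returns NumPy integers, the isinstance check would need loosening. The piece names and statistics keys in SAMPLE_INFO are placeholders, not the strings the environment actually emits.

```python
# Expected type and inclusive range for each documented key.
EXPECTED = {
    'current_piece': (str, None),
    'number_of_lines': (int, (0, 999)),
    'score': (int, (0, 999999)),
    'next_piece': (str, None),
    'statistics': (dict, None),
    'board_height': (int, (0, 20)),
}


def check_info(info):
    """Return a list of problems; an empty list means info matches the table."""
    problems = []
    for key, (expected_type, bounds) in EXPECTED.items():
        if key not in info:
            problems.append(f'missing key: {key}')
            continue
        value = info[key]
        if not isinstance(value, expected_type):
            problems.append(f'{key} has wrong type: {type(value).__name__}')
            continue
        if bounds is not None and not bounds[0] <= value <= bounds[1]:
            problems.append(f'{key} out of range: {value}')
    return problems


# Placeholder values for demonstration only.
SAMPLE_INFO = {
    'current_piece': 'T',
    'number_of_lines': 4,
    'score': 1200,
    'next_piece': 'L',
    'statistics': {'T': 3, 'L': 1},
    'board_height': 6,
}
```

Calling check_info(info) after each env.step during debugging can surface unexpected values early.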

Citation

Please cite gym-tetris if you use it in your research.

@misc{gym-tetris,
  author = {Christian Kauten},
  howpublished = {GitHub},
  title = {{Tetris (NES)} for {OpenAI Gym}},
  URL = {https://github.com/Kautenja/gym-tetris},
  year = {2019},
}

References

The following references contributed to the construction of this project.

  1. Tetris (NES): RAM Map. Data Crystal ROM Hacking.
  2. Tetris: Memory Addresses. NES Hacker.
  3. Applying Artificial Intelligence to Nintendo Tetris. MeatFighter.