Quentin Delfosse, Jannis Blüml, Bjarne Gregori, Sebastian Sztwiertnia
Inspired by the work of Anand et al., we present OCAtari, an improved, extended, and object-centric version of their ATARI ARI project. The Arcade Learning Environment allows us to read the RAM state of a game at any time. This repository provides a wrapper around the well-known Gymnasium project that uses the RAM state and reverse engineering to provide object-centric representations of the screen. It also provides code for benchmarking, testing, and generating object-centric representations of states.
:heavy_exclamation_mark: HERE IS A LINK TO THE DOCUMENTATION :bookmark_tabs:
This repository is structured into multiple folders:
You can install OCAtari in multiple ways. The recommended one is to use the provided Dockerfile, which installs all requirements, such as the Atari ROMs and Gymnasium.
You can also simply run:
pip install ocatari
pip install "gymnasium[atari, accept-rom-license]"
If you want to modify the code, you can clone this repo and run:
python setup.py install
or, if you want your changes to be picked up without reinstalling:
python setup.py develop
To use the OCAtari environments:
from ocatari.core import OCAtari
import random

# Create an object-centric Pong environment; mode="ram" extracts the
# objects from the RAM state, hud=True also tracks HUD elements.
env = OCAtari("Pong", mode="ram", hud=True, render_mode="rgb_array")
observation, info = env.reset()
action = random.randint(0, env.nb_actions - 1)  # sample a random action
obs, reward, terminated, truncated, info = env.step(action)
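The snippet above takes a single step. A full random rollout, assuming the standard Gymnasium reset/step semantics shown above, could look like the following sketch; the helper name `run_random_episode` and the stub environment are ours for illustration, not part of OCAtari (with OCAtari installed, you would pass `OCAtari("Pong", mode="ram")` instead of the stub):

```python
import random

def run_random_episode(env, max_steps=1000):
    """Roll out one episode with uniformly random actions.

    Works with any environment following the Gymnasium API that
    exposes nb_actions, e.g. an OCAtari environment."""
    obs, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = random.randint(0, env.nb_actions - 1)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps

# Tiny stand-in environment (hypothetical, for illustration only):
# it hands out a reward of 1 per step and terminates after 10 steps.
class _StubEnv:
    nb_actions = 4
    def __init__(self):
        self._t = 0
    def reset(self):
        self._t = 0
        return None, {}
    def step(self, action):
        self._t += 1
        return None, 1.0, self._t >= 10, False, {}

reward, steps = run_random_episode(_StubEnv())
```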
If you are using OCAtari for your scientific publications, please cite us:
@inproceedings{Delfosse2023OCAtariOA,
  title={OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments},
  author={Quentin Delfosse and Jannis Blüml and Bjarne Gregori and Sebastian Sztwiertnia and Kristian Kersting},
  year={2023}
}
Alien | Amidar | Assault | Asterix | Asteroids | Atlantis | BattleZone | BankHeist | BeamR. | Berzerk | Bowling | Boxing | Breakout | Carnival | Centipede | ChopperC. | CrazyC. | DemonA. | DonkeyK. | FishingD. | Freeway | Frostbite | Gopher | Hero | IceHockey | Jamesbond | Kangaroo | Krull | Montezum. | MsPacman | Pacman | Pitfall | Pong | PrivateE. | Q*Bert | RiverRaid | RoadR. | Seaquest | Skiing | SpaceInv. | Tennis | TimePilot | UpNDown | Venture | VideoP. | YarsR. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
A list of all Gymnasium games can be found in the Gymnasium documentation.
OCAtari supports two different modes to extract objects from the current state:
Vision Extraction Mode (VEM): Returns a list of objects currently on the screen with their X, Y, width, height, and R, G, B values, based on handwritten rules applied to the visual representation.
RAM Extraction Mode (REM): Uses the object values stored in the RAM to detect the objects currently on the screen.
A better example of how to run OCAtari is given in our demo files, which show you how to run each game with a provided agent.
Use the demo files in the scripts/demo folder to test it yourself. You can set the mode to 'raw', 'vision' or 'revised' in line 10 of the demo script. You can also run a demo file with an already trained agent or with an agent you developed yourself: use the -p flag to pass an agent and let it play the game. Here is an example:
python demo_pong.py -p models/Pong/model_50000000.gz
More information can be found in this ReadMe
With env.objects
one can access the list of objects found in the current state. Note that these lists can differ depending on the mode used when initiating the environment.
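For illustration, here is a minimal stand-in for such an object list. The class below is hypothetical: OCAtari's real game objects expose position and size in a similar way, but check the documentation for the exact attribute names of the objects returned by env.objects:

```python
from dataclasses import dataclass

# Hypothetical stand-in for an OCAtari game object (illustration only);
# the real objects in env.objects carry comparable position/size info.
@dataclass
class GameObject:
    category: str
    x: int
    y: int
    w: int
    h: int

def bounding_boxes(objects):
    """Turn an object list into plain (x, y, w, h) bounding boxes."""
    return [(o.x, o.y, o.w, o.h) for o in objects]

objects = [GameObject("Player", 16, 100, 4, 15),
           GameObject("Ball", 80, 60, 2, 2)]
boxes = bounding_boxes(objects)
```

Downstream code (e.g. an object-centric agent) can then consume these boxes regardless of whether they came from the RAM or the vision extraction mode.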
OCAtari can be used to generate datasets consisting of a representation of the current state in the form of an RGB array and a list of all objects within the state. More information can be found in the dataset_generation folder.
For trained agents, as well as to reproduce our results, we recommend using the agents of this repo.
In most of our scripts, we added the following line to make the environment deterministic and the results easier to reproduce: make_deterministic(env, 42)
. This line can and should be removed if determinism is not the desired behavior.
As seeds, we used 0 for evaluating the metrics and 42 for generating the dataset.
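What such a determinism helper buys you can be sketched generically: with a fixed seed, the random draws (and hence the rollout) repeat exactly. The sketch below is our own illustration of this idea using Python's standard RNG, not OCAtari's actual make_deterministic implementation:

```python
import random

def seeded_rollout(seed, n_actions=6, steps=5):
    """Draw a fixed-length random action sequence from a seeded RNG.

    With the same seed, the exact same sequence is produced, which is
    the property a determinism helper relies on for reproducibility."""
    rng = random.Random(seed)
    return [rng.randint(0, n_actions - 1) for _ in range(steps)]

a = seeded_rollout(42)
b = seeded_rollout(42)  # identical seed -> identical action sequence
```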