Avalon is a 3D video game environment and benchmark designed from scratch for reinforcement learning research. In Avalon, an embodied agent (human or computer) explores a procedurally generated 3D environment, attempting to solve tasks that involve navigating terrain, hunting or gathering food, and avoiding hazards.
Avalon is unique among existing RL benchmarks in that the reward function, world dynamics, and action space are the same for every task; tasks are differentiated solely by altering the environment. Each of its 20 tasks, ranging in complexity from eat and throw to hunt and navigate, generates worlds in which the agent must perform specific skills to survive. This setup enables investigations of generalization within tasks, between tasks, and to compositional tasks that require combining skills learned in earlier tasks.
Avalon includes a highly efficient game engine, a library of baselines, and a benchmark with scoring metrics evaluated against hundreds of hours of human performance, all of which are open source and publicly available. In addition, we find that standard RL baselines make progress on most tasks but remain far from human performance, suggesting Avalon is challenging enough to advance the quest for generalizable RL.
Check out our research paper for a deeper explanation of why we built Avalon.
Use Avalon like any other gym environment.
from avalon.agent.godot.godot_gym import GodotEnvironmentParams
from avalon.agent.godot.godot_gym import TrainingProtocolChoice
from avalon.agent.godot.godot_gym import AvalonEnv
from avalon.common.log_utils import configure_local_logger
from avalon.datagen.env_helper import display_video

configure_local_logger()

# Generate worlds for the FIGHT task only, starting at low difficulty.
env_params = GodotEnvironmentParams(
    resolution=256,
    training_protocol=TrainingProtocolChoice.SINGLE_TASK_FIGHT,
    initial_difficulty=1,
)
env = AvalonEnv(env_params)
env.reset()

def random_env_step():
    # Sample a random action, step the environment, and reset when an episode ends.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    if done:
        env.reset()
    return obs

# Collect 50 frames of random play and display them as a short video.
observations = [random_env_step() for _ in range(50)]
display_video(observations, fps=10)
For a complete example of creating random worlds, taking actions as an agent, and displaying the resulting observations, see gym_interface_example.
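Because every task shares the same reward function, dynamics, and action space, switching tasks is just a parameter change. The following is a minimal illustrative sketch, not part of the documented examples: it assumes the TrainingProtocolChoice enum exposes a SINGLE_TASK_EAT member analogous to the SINGLE_TASK_FIGHT used above, and that the env supports the standard gym close() method.

from avalon.agent.godot.godot_gym import AvalonEnv, GodotEnvironmentParams, TrainingProtocolChoice

# Hypothetical sketch: the agent interface is identical across tasks.
# SINGLE_TASK_EAT is assumed to exist by analogy with SINGLE_TASK_FIGHT.
for protocol in (TrainingProtocolChoice.SINGLE_TASK_EAT, TrainingProtocolChoice.SINGLE_TASK_FIGHT):
    env = AvalonEnv(GodotEnvironmentParams(
        resolution=96,
        training_protocol=protocol,
        initial_difficulty=0,
    ))
    env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())  # same API for every task
    env.close()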
Since we designed Avalon as a high-performance RL environment, we tailored Avalon to run in the cloud on headless Linux servers with NVIDIA GPUs. However, it should also work on macOS.
Avalon relies on a custom Godot binary optimized for headless rendering and performance. If you intend to inspect, debug or build custom levels, you'll also want the accompanying editor:
pip install avalon-rl==1.0.0
# needed to run environments
python -m avalon.install_godot_binary
Note: by default, the binary is installed inside the package under avalon/bin/godot to avoid cluttering your system. Pure-pip binary packaging is a work in progress.
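If you want to confirm where the binary ended up, a quick check like the one below works. This is only a convenience sketch built on the default install path described above, not part of the Avalon API.

import pathlib

import avalon

# Resolve the default install location (avalon/bin/godot inside the package).
binary = pathlib.Path(avalon.__file__).parent / "bin" / "godot"
print(binary, "exists" if binary.exists() else "missing")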
On Linux, Avalon requires an NVIDIA GPU, as the Linux builds are set up for headless GPU rendering. It also requires PyTorch (>=1.12.0) with CUDA installed.
sudo apt install --no-install-recommends libegl-dev libglew-dev libglfw3-dev libnvidia-gl libopengl-dev libosmesa6 mesa-utils-extra
pip install avalon-rl
python -m avalon.install_godot_binary
python -m avalon.common.check_install
If you're looking to use our RL code, you'll additionally need the avalon-rl[train] extras package:
pip install avalon-rl[train]
On macOS, an NVIDIA GPU is not required, but environment rendering is not headless: you'll see a Godot window pop up for each environment you have open.
brew install coreutils
pip install avalon-rl
python -m avalon.install_godot_binary
python -m avalon.common.check_install
We also have Docker images to run Avalon and train our RL baselines. They require an NVIDIA GPU on the host.
docker build -f ./docker/Dockerfile . --target train --tag=avalon/train
# start the container with an interactive bash terminal
# to enable wandb, add `-e WANDB_API_KEY=<your wandb key>`
docker run -it --gpus 'all,"capabilities=compute,utility,graphics"' avalon/train bash
# in the container, try running
python -m avalon.common.check_install
# or launch, e.g., a PPO training run with
python -m avalon.agent.train_ppo_avalon
You can use the dev image to explore the bundled notebooks or to build on top of Avalon:
docker build -f ./docker/Dockerfile . --target dev --tag=avalon/dev
# The default dev image command starts a Jupyter Notebook and exposes it on port 8888.
# A typical dev setup is to expose that notebook and map the local repo to the project repo as a volume:
docker run -it -p 8888:8888 -v $(pwd):/opt/projects/avalon --gpus 'all,"capabilities=compute,utility,graphics"' avalon/dev
Tutorials:
- Using Avalon in your own RL code (see the sketch after this list)
- Debugging with env.debug_act and the Godot editor
- Using our RL library
- Building on Avalon or creating new tasks
- Running Avalon in VR: generate and play worlds on an Oculus Quest
- Making an entirely custom non-Avalon RL environment using Godot
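As a starting point for the first item above, here is a minimal sketch of driving AvalonEnv from your own training loop, using only the gym-style API shown earlier; the resolution value and reliance on default parameters are assumptions, and the random policy is a placeholder for your own.

from avalon.agent.godot.godot_gym import AvalonEnv, GodotEnvironmentParams
from avalon.common.log_utils import configure_local_logger

configure_local_logger()
env = AvalonEnv(GodotEnvironmentParams(resolution=96))  # other params left at their (assumed) defaults

obs = env.reset()
episode_return = 0.0
for step in range(200):
    # Swap in your own policy here; random actions serve as a placeholder.
    obs, reward, done, info = env.step(env.action_space.sample())
    episode_return += reward
    if done:
        print(f"episode finished at step {step} with return {episode_return:.2f}")
        episode_return = 0.0
        obs = env.reset()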
You can find the final baseline model weights in the results notebook.
@inproceedings{avalon,
title={Avalon: A Benchmark for {RL} Generalization Using Procedurally Generated Worlds},
author={Joshua Albrecht and Abraham J Fetterman and Bryden Fogelman and Ellie Kitanidis and Bartosz Wr{\'o}blewski and Nicole Seo and Michael Rosenthal and Maksis Knutins and Zachary Polizzi and James B Simon and Kanjun Qiu},
booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2022},
url={https://openreview.net/forum?id=TzNuIdrHoU}
}
Avalon was developed by Generally Intelligent, an independent research company developing general-purpose AI agents with human-like intelligence that we can safely deploy in the real world. Check out our about page to learn more, or our careers page if you're interested in working with us!