Efficient World Models with Context-Aware Tokenization (Δ-IRIS)

Efficient World Models with Context-Aware Tokenization
Vincent Micheli*, Eloi Alonso*, François Fleuret
ICML 2024: https://arxiv.org/abs/2406.19320

TL;DR: Δ-IRIS is a reinforcement learning agent trained in the imagination of its world model.

Δ-IRIS agent alternately playing in the environment and in its world model (download the video here: https://github.com/vmicheli/delta-iris/assets/32040353/ff2dc7a7-fa0a-4338-8f77-1637dff8642d)

Setup
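Installation steps are not reproduced here; a minimal sketch, assuming a standard Python project whose dependencies are listed in a requirements.txt at the repo root:

git clone https://github.com/vmicheli/delta-iris.git
cd delta-iris
pip install -r requirements.txt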

Launch a training run

Crafter:

python src/main.py

The run will be located in outputs/YYYY-MM-DD/hh-mm-ss/.

By default, logs are synced to Weights & Biases; set wandb.mode=disabled to turn logging off.
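For example, to launch a Crafter run without logging:

python src/main.py wandb.mode=disabled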

Atari:

python src/main.py env=atari params=atari env.train.id=BreakoutNoFrameskip-v4
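Other games can be selected with the same env.train.id override; for example, assuming the standard Gym NoFrameskip-v4 naming:

python src/main.py env=atari params=atari env.train.id=PongNoFrameskip-v4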

Note that this Atari configuration achieves slightly higher aggregate metrics than those reported in the paper; an updated table of results is shown in the repository README.

Configuration
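The key=value arguments above (wandb.mode=disabled, env=atari, params=atari, env.train.id=...) are command-line overrides of the YAML configuration; a copy of the configuration, with trainer.yaml as the main file, is saved under config/ in every run folder (see the layout below). Overrides can be combined, e.g. to train on Atari without logging:

python src/main.py env=atari params=atari wandb.mode=disabled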

Run folder

Each new run is located in outputs/YYYY-MM-DD/hh-mm-ss/. This folder is structured as:

outputs/YYYY-MM-DD/hh-mm-ss/
│
└─── checkpoints
│   │   last.pt
│   │   optimizer.pt
│   │   ...
│   │
│   └── dataset
│      │
│      └─ train
│        │   info.pt
│        │   ...
│      │
│      └─ test
│        │   info.pt
│        │   ...
│
└─── config
│   │   trainer.yaml
│   │   ...
│
└─── media
│   │
│   └── episodes
│      │   ...
│   │
│   └── reconstructions
│      │   ...
│
└─── scripts
│   │   resume.sh
│   │   play.sh
│
└─── src
│   │   main.py
│   │   ...
│
└─── wandb
    │   ...

Pretrained agent

An agent checkpoint (Crafter 5M frames) is available on the Hugging Face Hub.

To visualize the agent or play in its world model:
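The exact commands are not reproduced here; a hedged sketch, assuming the play.sh script generated in every run folder (see the run-folder layout above) is the intended entry point once the checkpoint is in place:

./scripts/play.sh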

Cite

If you find this code or paper useful, please use the following reference:

@inproceedings{
micheli2024efficient,
title={Efficient World Models with Context-Aware Tokenization},
author={Vincent Micheli and Eloi Alonso and François Fleuret},
booktitle={Forty-first International Conference on Machine Learning},
year={2024},
url={https://openreview.net/forum?id=BiWIERWBFX}
}

Credits