ROSMO

Table of Contents

ROSMO
- Introduction
- Installation
- Usage
- BSuite
- Atari
- Citation
- License
- Acknowledgement
- Disclaimer

Introduction

This repository contains the implementation of ROSMO, a Regularized One-Step Model-based algorithm for Offline-RL, introduced in our paper "Efficient Offline Policy Optimization with a Learned Model". We provide training code for both the Atari and BSuite experiments, and the reproduced results on Atari MsPacman are publicly available on W&B.

Installation

Please follow the installation guide.

Usage

BSuite

To run the BSuite experiments, first download the datasets and place them in the directory defined by CONFIG.data_dir in experiment/bsuite/config.py.
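
Before launching a run, it can help to confirm that the dataset directory exists and is not empty. The following is a minimal sketch and not part of the repository: `check_data_dir` is a hypothetical helper, and it assumes that `CONFIG.data_dir` is a plain path string. Nothing is assumed about the layout of the files inside the directory.

```python
from pathlib import Path
from typing import List


def check_data_dir(data_dir: str) -> List[str]:
    """Return the sorted entry names in data_dir, or raise if it is missing or empty.

    Hypothetical helper for illustration; not part of the ROSMO code base.
    """
    path = Path(data_dir).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {path}")
    # List whatever is there; no particular file naming is assumed.
    entries = sorted(p.name for p in path.iterdir())
    if not entries:
        raise FileNotFoundError(f"Dataset directory is empty: {path}")
    return entries
```

Pass it the value of CONFIG.data_dir; a FileNotFoundError then signals that the datasets still need to be downloaded or moved.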

Debug run.

python experiment/bsuite/main.py -exp_id test -env cartpole

Enable W&B logger and start training.

python experiment/bsuite/main.py -exp_id test -env cartpole -nodebug -use_wb -user ${WB_USER}

Atari

The following example commands train 1) a ROSMO agent, 2) its sampling variant, and 3) a MuZero Unplugged (MZU) agent on the game MsPacman.

Train ROSMO with exact policy target.

python experiment/atari/main.py -exp_id rosmo -env MsPacman -nodebug -use_wb -user ${WB_USER}

Train ROSMO with sampled policy target (N=4).

python experiment/atari/main.py -exp_id rosmo-sample-4 -sampling -env MsPacman -nodebug -use_wb -user ${WB_USER}

Train MuZero Unplugged as a baseline (N=20 simulations).

python experiment/atari/main.py -exp_id mzu-sample-20 -algo mzu -num_simulations 20 -env MsPacman -nodebug -use_wb -user ${WB_USER}
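
To keep the three runs above consistent, the command lines can also be assembled programmatically. The sketch below only builds and prints the commands, using exactly the flags shown above. `build_command` and the `RUNS` table are hypothetical names introduced for illustration, and the W&B user is read from the `WB_USER` environment variable with a placeholder fallback.

```python
import os
import shlex
from typing import Dict, List

# Extra flags per run, copied from the example commands above.
RUNS: Dict[str, List[str]] = {
    "rosmo": [],
    "rosmo-sample-4": ["-sampling"],
    "mzu-sample-20": ["-algo", "mzu", "-num_simulations", "20"],
}


def build_command(exp_id: str, env: str, wb_user: str) -> List[str]:
    """Assemble the argument list for one Atari training run (hypothetical helper)."""
    return [
        "python", "experiment/atari/main.py",
        "-exp_id", exp_id,
        *RUNS[exp_id],
        "-env", env,
        "-nodebug", "-use_wb",
        "-user", wb_user,
    ]


if __name__ == "__main__":
    # Assumption: WB_USER holds your W&B user name; the fallback is a placeholder.
    user = os.environ.get("WB_USER", "your-wb-user")
    for exp_id in RUNS:
        # Dry run: print each command instead of executing it.
        print(shlex.join(build_command(exp_id, "MsPacman", user)))
```

Each printed line can be pasted into a shell, or the argument list can be handed to `subprocess.run` to launch the run directly.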

Citation

If you find this work useful for your research, please consider citing:

@inproceedings{
  liu2023rosmo,
  title={Efficient Offline Policy Optimization with a Learned Model},
  author={Zichen Liu and Siyi Li and Wee Sun Lee and Shuicheng Yan and Zhongwen Xu},
  booktitle={International Conference on Learning Representations},
  year={2023},
  url={https://arxiv.org/abs/2210.05980}
}

License

ROSMO is distributed under the terms of the Apache 2.0 license.

Acknowledgement

We thank the following projects which provide great references:

Disclaimer

This is not an official Sea Limited or Garena Online Private Limited product.
