zhengxj28 / lifelong_rl_with_rm


Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward Machines

Evaluations of Different Q-value Composition Methods

Experiments in the OfficeWorld domain are in the file src/_my_office_world/Experiment1_final.py.

Experiments in the MineCraft domain are in the file src/_my_craft/Experiment2_final.py.

Evaluation of Different Representations of Target Tasks

Experiments in the OfficeWorld domain are in the file src/_my_office_world/Experiment4office.py.

Experiments in the MineCraft domain are in the file src/_my_craft/Experiment4craft.py.

Evaluation of LSRM

Experiments in the OfficeWorld domain are in the file src/_my_office_world/Experiment3office.py.

Experiments in the MineCraft domain are in the file src/_my_craft/Experiment3craft.py.

To run an experiment, set is_plot=False. To plot the results of a finished run, set is_plot=True.

All experiments use Q-value normalization, so keep use_normalize=True.
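
The two flags suggest a run-then-plot workflow: one pass with is_plot=False to produce results, then a second pass with is_plot=True to read them back. The sketch below illustrates that pattern only. It is not taken from the experiment scripts: main, normalize, RESULTS_FILE, and the stand-in Q-values are hypothetical names and data, and the normalization shown is a placeholder rather than the scheme used in the paper. Only is_plot and use_normalize come from this README.

```python
import json
import os
import tempfile

# Hypothetical names throughout: only is_plot and use_normalize come from the README.
RESULTS_FILE = os.path.join(tempfile.gettempdir(), "lifelong_rl_demo_results.json")


def normalize(q_values):
    """Placeholder normalization (divide by the largest magnitude).

    This is an assumption for illustration, not the paper's method.
    """
    scale = max(abs(q) for q in q_values) or 1.0
    return [q / scale for q in q_values]


def main(is_plot, use_normalize=True):
    if not is_plot:
        # Phase 1: "run the experiment" and save the results to disk.
        q_values = [2.0, -4.0, 1.0]  # stand-in for learned Q-values
        if use_normalize:
            q_values = normalize(q_values)
        with open(RESULTS_FILE, "w") as f:
            json.dump(q_values, f)
        return q_values
    # Phase 2: load the saved results; a real script would plot them here.
    with open(RESULTS_FILE) as f:
        return json.load(f)


saved = main(is_plot=False)  # first pass: run and save
loaded = main(is_plot=True)  # second pass: reload for plotting
```

In this sketch the second call fails if no results file exists yet, which mirrors the ordering implied above: run first, plot afterwards.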

Citations

The paper was published in Knowledge-Based Systems, 2022.

@article{zheng2022lifelong,
  title={Lifelong reinforcement learning with temporal logic formulas and reward machines},
  author={Zheng, Xuejing and Yu, Chao and Zhang, Minjie},
  journal={Knowledge-Based Systems},
  volume={257},
  pages={109650},
  year={2022},
  publisher={Elsevier}
}