This is a reimplementation of the techniques discussed in the paper *A Conservative Approach for Transfer in Few-Shot Off-Dynamics Reinforcement Learning*.
Create the conda virtual environment and install the package:

```bash
conda create --name food python=3.8
conda activate food
pip install -e .
```
To run an experiment, first add your environment in `get_env.py`. Then train a PPO agent in the source environment with the provided training function and save it in `your_env/model/PPO`. Careful: if you normalize the environment, you also need to save the normalization class `ob_rms` alongside the policy.
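As a minimal sketch of that saving convention, in the style of the PPO codebase this repo builds on (`save_agent`, `load_agent`, and the `agent.pt` file name are illustrative, not part of the repo):

```python
import os
import torch

def save_agent(actor_critic, vec_normalize, save_dir):
    """Save the policy together with the observation-normalization
    statistics, so observations can be normalized identically later."""
    os.makedirs(save_dir, exist_ok=True)
    # ob_rms holds the running mean/var of observations; None if unnormalized.
    ob_rms = getattr(vec_normalize, "ob_rms", None)
    torch.save([actor_critic, ob_rms], os.path.join(save_dir, "agent.pt"))

def load_agent(save_dir):
    """Restore the policy and its normalization statistics."""
    actor_critic, ob_rms = torch.load(os.path.join(save_dir, "agent.pt"))
    return actor_critic, ob_rms
```

For example, `save_agent(policy, envs, "your_env/model/PPO")` stores both objects in a single file.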
Next, gather target trajectories with the `Gather_Target_Trajectories.ipynb` notebook and save them in the folder `expert_trajectories`.
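The notebook handles this step; purely as a standalone illustration of what it amounts to, here is a minimal sketch assuming a gym-style target environment with the classic 4-tuple `step` API and a `policy(obs) -> action` callable (both assumptions):

```python
import os
import pickle

def gather_trajectories(env, policy, n_episodes, save_path):
    """Roll out `policy` in the target environment and store the
    resulting transitions, one list of tuples per episode."""
    trajectories = []
    for _ in range(n_episodes):
        obs, done, episode = env.reset(), False, []
        while not done:
            action = policy(obs)
            next_obs, reward, done, _ = env.step(action)
            episode.append((obs, action, reward, next_obs))
            obs = next_obs
        trajectories.append(episode)
    os.makedirs(os.path.dirname(save_path) or ".", exist_ok=True)
    with open(save_path, "wb") as f:
        pickle.dump(trajectories, f)

# e.g. gather_trajectories(target_env, policy, n_episodes=10,
#                          save_path="expert_trajectories/my_env.pkl")
```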
To add a new environment, add its creation and confidence in `envs`, and update the `ray_config` and `main` files to include the new environment.
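The precise registry layout is defined by the repo itself; purely as a hypothetical illustration of the kind of change involved, a source/target pair in `get_env.py` might be wired up like this (all environment names below are made up):

```python
import gym

# Hypothetical name -> (source id, target id) registry; the real
# get_env.py may be structured differently.
ENVS = {
    "my_new_env": ("MyNewEnvSource-v0", "MyNewEnvTarget-v0"),
}

def get_env(name, target=False):
    """Create the source or the shifted-dynamics target variant."""
    source_id, target_id = ENVS[name]
    return gym.make(target_id if target else source_id)
```

The same name then has to appear in `ray_config` and the `main` file so the experiment launcher can find it.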
All the results are saved in a Ray Tune `ExperimentAnalysis`. You can plot them in the `Visualization.ipynb` notebook.
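Outside the notebook, the results can also be loaded directly with Ray Tune; a minimal sketch, assuming the default `~/ray_results` location and that the training loop reports an `episode_reward_mean` metric (both assumptions):

```python
import matplotlib.pyplot as plt
from ray.tune import ExperimentAnalysis

# Point this at the experiment directory created by the run (made-up name).
analysis = ExperimentAnalysis("~/ray_results/my_experiment")

# One results DataFrame per trial; column names depend on the metrics
# the training loop actually reports.
for trial_dir, df in analysis.trial_dataframes.items():
    plt.plot(df["training_iteration"], df["episode_reward_mean"], label=trial_dir)

plt.xlabel("training iteration")
plt.ylabel("episode reward mean")
plt.legend()
plt.show()
```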
This code follows the MIT License. Please see the License file for more information.
This code is built upon the PPO GitHub repository. It also uses H2O to build DARC, extracts one environment from GARAT, and takes others from the official DARC code.
If you find this technique useful and use it in a project, please cite it:
```bibtex
@inproceedings{daoudi2024conservative,
  title={A Conservative Approach for Transfer in Few-Shot Off-Dynamics Reinforcement Learning},
  author={Daoudi, Paul and Robu, Bogdan and Prieur, Christophe and Barlier, Merwan and Dos Santos, Ludovic},
  booktitle={Proceedings of the 2024 International Joint Conference on Artificial Intelligence},
  year={2024}
}
```