reinforcement-learning-kr / lets-do-irl

Inverse RL algorithms (APP, MaxEnt, GAIL, VAIL)
MIT License

ppo save expert demo #6

Open francisduan opened 4 years ago

francisduan commented 4 years ago

Hi, how am I supposed to save expert demonstrations in the PPO main script?

gitouni commented 3 years ago

PPO is a reinforcement learning method, whereas APP, MaxEnt, and GAIL are all inverse reinforcement learning methods. Thanks to the emergence of policy-based inverse reinforcement learning algorithms, you can pair PPO with any inverse reinforcement learning algorithm to complete the training. References:

Ng A Y, Russell S J. Algorithms for inverse reinforcement learning[C]//ICML. 2000, 1: 2.

Ho J, Gupta J, Ermon S. Model-free imitation learning with policy optimization[C]//International Conference on Machine Learning. PMLR, 2016: 2760-2769.
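On the original question of saving expert demonstrations: one common approach is to roll out the trained policy and store the visited (state, action) pairs on disk so an inverse RL algorithm can load them later. The sketch below is only an illustration under stated assumptions, not this repository's code. `ToyEnv`, `trained_policy`, `collect_demos`, `save_demos`, `load_demos`, and the file name `expert_demo.p` are all hypothetical; in practice the environment would be the real one and `trained_policy` would wrap the trained PPO actor.

```python
import os
import pickle
import random
import tempfile


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (not from the repository)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        state = [float(self.t)]
        reward = -abs(action)
        done = self.t >= self.horizon
        return state, reward, done


def trained_policy(state):
    """Placeholder for the trained PPO actor; here it only samples a random action."""
    return random.uniform(-1.0, 1.0)


def collect_demos(env, policy, num_episodes):
    """Roll out the policy and record every (state, action) pair it visits."""
    demos = []
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = policy(state)
            demos.append((list(state), action))
            state, _, done = env.step(action)
    return demos


def save_demos(demos, path):
    """Serialize the demonstrations with pickle."""
    with open(path, "wb") as f:
        pickle.dump(demos, f)


def load_demos(path):
    """Load demonstrations previously written by save_demos."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Usage: collect three short episodes and write them to a temporary directory.
demos = collect_demos(ToyEnv(horizon=5), trained_policy, num_episodes=3)
demo_path = os.path.join(tempfile.mkdtemp(), "expert_demo.p")
save_demos(demos, demo_path)
restored = load_demos(demo_path)
```

Storing plain (state, action) pairs keeps the file independent of the training framework. If the imitation algorithm also needs rewards or episode boundaries, those would have to be recorded in the same loop.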