PPO+GAIL performing worse than only PPO

Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

https://unity.com/products/machine-learning-agents


PPO+GAIL performing worse than only PPO #5657

Closed MrOCW closed 2 years ago

MrOCW commented 2 years ago

Hi,

This is an agent learning to drive and stay in its lane.

Red = PPO
Grey = PPO+GAIL at 0.5 reward strength

The demonstrations were recorded by the PPO (red) model with deterministic inference.

Any suggestions as to why GAIL is not improving the learning process? I don't expect PPO+GAIL to exceed the red curve, since the demonstrations were recorded with the PPO model (red curve), but shouldn't it at least improve the learning speed?

Could this be a bug?

Screenshots

Environment

Unity Version: 2020.3.11
OS + version: Ubuntu 18.04
ML-Agents version: ML-Agents 2.1.0 (main branch)
Torch version: 1.8.2
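
For readers trying to picture the setup described above, here is a rough sketch of how a PPO run with an extra GAIL reward signal at 0.5 strength might be laid out in an ML-Agents trainer config. The original post does not include its config, so everything except the 0.5 strength is an assumption: the behavior name `LaneKeeping`, the demo path `Demos/LaneKeeping.demo`, and the `gamma` values are hypothetical placeholders. The helper functions `build_trainer_config` and `to_yaml` are illustrative only and not part of ML-Agents; they just build a nested dict and print it in YAML-style indentation.

```python
def build_trainer_config(behavior_name, demo_path, gail_strength=0.5):
    """Return a nested dict mirroring the layout of an ML-Agents trainer config.

    Only the 0.5 GAIL strength comes from the post; the other values are
    placeholder assumptions chosen for illustration.
    """
    if gail_strength < 0.0:
        raise ValueError("gail_strength must be non-negative")
    return {
        "behaviors": {
            behavior_name: {
                "trainer_type": "ppo",
                "reward_signals": {
                    # Environment reward (assumed settings).
                    "extrinsic": {"strength": 1.0, "gamma": 0.99},
                    # Imitation reward produced by the GAIL discriminator.
                    "gail": {
                        "strength": gail_strength,
                        "gamma": 0.99,
                        "demo_path": demo_path,  # hypothetical path
                    },
                },
            }
        }
    }


def to_yaml(node, indent=0):
    """Render a nested dict of scalars as a list of YAML-style lines."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


config = build_trainer_config("LaneKeeping", "Demos/LaneKeeping.demo")
print("\n".join(to_yaml(config)))
```

Dropping the `gail` entry from `reward_signals` gives the PPO-only baseline, so the two runs being compared would differ only in that one block.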

miguelalonsojr commented 2 years ago

Seems like you're using a custom environment. We are unable to help reproduce bugs with custom environments. Please attempt to reproduce your issue with one of the example environments, or provide a minimal patch to one of the environments needed to reproduce the issue.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.

github-actions[bot] commented 2 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.