Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents
Other
16.89k stars 4.12k forks

GAIL+PPO+BC Suddenly Perform Worse #5765

Closed AbhijitBaruah closed 11 months ago

AbhijitBaruah commented 2 years ago

I am currently training my agent to do two things: 1) collect and deposit food, and 2) take shelter when certain events occur in the environment. I am using GAIL and Behavioral Cloning with PPO to speed up training. No matter what reward function I use, during the initial part of training the agent learns exactly what to do and does it pretty successfully. Then, all of a sudden, the agents stop doing those actions and diverge into meaningless states, or they get stuck in a local maximum and cannot maximize the reward function. I am super confused as to why this may be happening. Any help would be greatly appreciated.

My training parameters: [image]
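For readers who cannot see the image: a setup like the one described (PPO with a GAIL reward signal plus Behavioral Cloning) is normally expressed in the ML-Agents trainer config under `reward_signals` and `behavioral_cloning`. The sketch below mirrors that structure as a Python dict so the imitation-related settings can be printed and shared as text. It is not the configuration from the image. The behavior name `FoodCollector`, the demo path, every numeric value, and the helper `imitation_summary` are all placeholders made up for illustration.

```python
# Illustrative only: NOT the configuration from the image in this post.
# The behavior name, demo path, and every numeric value are placeholders.
DEMO_PATH = "Demos/FoodCollector.demo"  # hypothetical path

config = {
    "behaviors": {
        "FoodCollector": {  # hypothetical behavior name
            "trainer_type": "ppo",
            "hyperparameters": {
                "batch_size": 1024,
                "buffer_size": 10240,
                "learning_rate": 3.0e-4,
            },
            "reward_signals": {
                # Environment reward and GAIL reward are weighted by "strength".
                "extrinsic": {"strength": 1.0, "gamma": 0.99},
                "gail": {"strength": 0.01, "gamma": 0.99, "demo_path": DEMO_PATH},
            },
            "behavioral_cloning": {
                "demo_path": DEMO_PATH,
                "strength": 0.5,
                "steps": 150000,
            },
            "max_steps": 2000000,
        }
    }
}


def imitation_summary(cfg, behavior):
    """Collect the imitation-related settings for one behavior (hypothetical helper)."""
    b = cfg["behaviors"][behavior]
    return {
        "extrinsic_strength": b["reward_signals"]["extrinsic"]["strength"],
        "gail_strength": b["reward_signals"]["gail"]["strength"],
        "bc_strength": b["behavioral_cloning"]["strength"],
        "bc_steps": b["behavioral_cloning"]["steps"],
        "max_steps": b["max_steps"],
    }


summary = imitation_summary(config, "FoodCollector")
print(summary)
```

Posting these values as text makes it easier for others to compare runs. One pair worth looking at is `behavioral_cloning.steps` against `max_steps`, since, as far as I understand the ML-Agents documentation, the cloning influence is annealed over `steps`. Whether that has anything to do with the drop reported here is not established in this thread.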

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 1 year ago

This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.

miguelalonsojr commented 1 year ago

We are unable to help reproduce bugs with custom environments. Please attempt to reproduce your issue with one of the example environments, or provide a minimal patch to one of the environments needed to reproduce the issue.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days if no further activity occurs. Thank you for your contributions.

Arlen0615 commented 1 year ago

@AbhijitBaruah I have the same issue in my custom environment. Do you have any suggestions, based on your experience, for making it better?