Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Pretraining/Behavioral Cloning doesn't seem to do anything with SAC trainer #3193

Closed niskander closed 4 years ago

niskander commented 4 years ago

Hi, I'm wondering whether Behavioral Cloning is supposed to work with the SAC trainer. The docs only mention that it can be used with PPO, but I took a quick glance at the code and it looks like the "behavioral_cloning" option is at least being parsed by the SAC trainer. I tried it, but it doesn't seem to do anything; e.g., the cloning-specific graph doesn't appear in TensorBoard. Is it supposed to work?

xiaomaogy commented 4 years ago

Yes, it should work. You can refer to the Pyramids environment here: https://github.com/Unity-Technologies/ml-agents/blob/master/config/sac_trainer_config.yaml

niskander commented 4 years ago

@xiaomaogy That's GAIL, not behavioral cloning.

xiaomaogy commented 4 years ago

We've deprecated BC, and we are pushing everyone to use GAIL.

mnsmuts commented 4 years ago

In case it helps: I think you also have to effectively disable extrinsic rewards by setting their strength to 0. Extrinsic rewards are enabled by default, so I have found that you have to include them in your reward config and set their strength to 0.
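
A minimal sketch of what that might look like in the trainer config, assuming the `reward_signals` layout used in the linked `sac_trainer_config.yaml` at the time. The behavior name `MyBehavior`, the demo path, and the GAIL strength value are placeholders for illustration, not values from this thread, and key names may differ between ML-Agents versions.

```yaml
# Sketch only: "MyBehavior" and the demo path are hypothetical placeholders.
MyBehavior:
    trainer: sac
    reward_signals:
        extrinsic:
            strength: 0.0    # listed explicitly so the default extrinsic signal contributes nothing
            gamma: 0.99
        gail:
            strength: 1.0    # illustrative value; tune for your environment
            gamma: 0.99
            demo_path: path/to/your_demo.demo    # placeholder path to a recorded demonstration
```

With the extrinsic strength at 0, the agent would learn from the GAIL signal alone, which is the closest match to pure imitation among the options discussed above.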
