Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

imitation learning question/issue about speed, no-ops, and on-key-down #3926

Closed SurferZergy closed 4 years ago

SurferZergy commented 4 years ago

Hi,

My understanding of imitation learning is that you use a player brain and add a DemonstrationRecorder object to the agent. You run it in the IDE and it records a file.

My issue is that when I set up a player brain, elapsed time is counted as no-op steps, and when I press "W" to move, the agent moves multiple times instead of once (I expected on-key-down behavior). So my data will be very bad, with too many no-ops and repeated "W"s instead of a single one.

(I have timescale set at 1)

To Reproduce

Steps to reproduce the behavior:

  1. Configure "W" to move up in the player brain (action 1)
  2. In the IDE, add a Demonstration Recorder to the player brain object and click Record
  3. Hit Play
  4. No-ops are recorded until max steps is reached and the game is over


surfnerd commented 4 years ago

Hi @zergy, The demonstration recorder records actions for each step. It is expected that you would record multiple W's if you are holding down the W key. The trainer needs information on a per-step basis, not an event-based one. If you are taking no actions, then those steps get recorded as well.

As for no-ops, those are valid recordings as well. You might want your agent to not do anything to avoid hitting a moving obstacle, etc.
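
To make the per-step versus event-based distinction concrete, here is a minimal sketch in plain Python. It does not use the ML-Agents API; the action indices `NO_OP` and `MOVE_UP` and both recorder functions are hypothetical names made up for illustration. It compares sampling the key state at every step (what is described above) with an on-key-down style edge detection (what the question expected).

```python
from typing import List

NO_OP = 0    # hypothetical action index for "do nothing"
MOVE_UP = 1  # hypothetical action index for "W" (action 1 in the question)


def record_per_step(key_held: List[bool]) -> List[int]:
    """Sample the key state at every step and record one action per step."""
    return [MOVE_UP if held else NO_OP for held in key_held]


def record_on_key_down(key_held: List[bool]) -> List[int]:
    """Record MOVE_UP only on the step where the key goes from up to down."""
    actions = []
    previously_held = False
    for held in key_held:
        pressed_this_step = held and not previously_held
        actions.append(MOVE_UP if pressed_this_step else NO_OP)
        previously_held = held
    return actions


# Key state over 8 steps: idle for 2 steps, W held for 3 steps, idle for 3 steps.
key_held = [False, False, True, True, True, False, False, False]

per_step = record_per_step(key_held)
on_key_down = record_on_key_down(key_held)

print(per_step)     # [0, 0, 1, 1, 1, 0, 0, 0]
print(on_key_down)  # [0, 0, 1, 0, 0, 0, 0, 0]
```

Both variants still produce one entry per step, so no-ops appear in either recording. The edge-detection variant only changes which steps carry the move action; whether that matches the intended behavior depends on how the environment interprets a held key.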

If you have questions about the demonstration recorder please bring them to the ML-Agents forum. I don't believe this is a bug and I will close this issue for the time being.

Cheers, Chris
