Closed rajatpaliwal closed 5 years ago
Hi @rajatpaliwal
It sounds like you'd like to do online imitation learning. We have documentation on this here: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Imitation-Learning.md#online-training
Please let us know if this works for you.
Thanks @awjuliani for such a prompt reply. I have two questions: 1) To implement this method, do I need to create two separate agents and map them separately to the teacher and student brains? 2) While creating the agents, do I need to set up a reward for the agent, or will simply mapping the agent's actions to keyboard buttons do the job?
Hi @rajatpaliwal
In this case you would use two separate agents with separate brains. You don't need to set up rewards, if you only plan to use imitation learning.
As a note, you can check out the "BananaIL" scene which demonstrates online imitation learning: https://github.com/Unity-Technologies/ml-agents/blob/master/UnitySDK/Assets/ML-Agents/Examples/BananaCollectors/Scenes/BananaIL.unity
Thanks @awjuliani. I much appreciate your suggestions.
Hi @awjuliani, I am trying to perform imitation learning in my custom-made environment, but I am not able to edit the discrete inputs for my actions in the Player Brain. Any thoughts?
Sorry for the bother @awjuliani. It was some problem with my Unity; restarting the project solved it.
Hi @awjuliani, while trying to perform online imitation learning in my custom-made environment, when I give the training command (`mlagents-learn config/online_bc_config.yaml --train --slow`) in the command prompt, it prompts me to press the Play button. Since my environment is heavy, it takes some time to start after pressing the Play button, and in the meantime the command prompt gives me the error "The Unity environment took too long to respond". Any suggestion on increasing the wait time of the training command so that I can start the training of the environment?
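(Editor's note: recent versions of the ML-Agents Python API expose a `timeout_wait` argument on the `UnityEnvironment` constructor for exactly this situation; whether it can be raised from the `mlagents-learn` CLI depends on the release you have installed. The underlying pattern is a poll-until-ready loop with a deadline. A minimal, library-free sketch of that pattern, with an illustrative `wait_for` helper that is not part of ML-Agents:)

```python
import time

def wait_for(predicate, timeout_s=60.0, poll_s=0.5):
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True if the predicate succeeded within the timeout, False
    otherwise -- the same pattern a trainer follows while waiting for a
    heavy Unity scene to finish loading. Raising `timeout_s` is what a
    larger `timeout_wait` does for the real environment connection.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_s)
    return False

# Example: a "scene" that only becomes ready after ~1 second.
start = time.monotonic()
scene_ready = lambda: time.monotonic() - start > 1.0
print(wait_for(scene_ready, timeout_s=5.0, poll_s=0.1))  # True
```

With a timeout shorter than the scene's startup time, the same helper returns False, which corresponds to the "took too long to respond" error above.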
Hi @awjuliani, does the student agent learn from scratch in each iteration of training, or does it also retain some learning from previous iterations while learning from the current one?
Hi @rajatpaliwal
The student agent continues to learn from data collected in the past as well as data being immediately collected in each training iteration.
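(Editor's note: the idea behind "continues to learn from data collected in the past" is an accumulating demonstration buffer: each training iteration samples from all teacher experience gathered so far, not only the newest batch. The sketch below illustrates that concept only; it is not ML-Agents' actual implementation, and the class and numbers are hypothetical.)

```python
import random

class DemonstrationBuffer:
    """Accumulates (observation, teacher_action) pairs across iterations."""

    def __init__(self):
        self.experiences = []

    def add(self, observation, action):
        self.experiences.append((observation, action))

    def sample(self, batch_size):
        # Draw from EVERYTHING collected so far, old and new alike.
        k = min(batch_size, len(self.experiences))
        return random.sample(self.experiences, k)

buffer = DemonstrationBuffer()
for step in range(100):           # iteration 1: teacher demonstrates
    buffer.add([step], step % 3)
for step in range(100, 150):      # iteration 2: buffer keeps growing
    buffer.add([step], step % 3)
batch = buffer.sample(32)         # batch mixes both iterations
print(len(buffer.experiences))    # 150
```

Because sampling spans the whole buffer, the student's policy updates in iteration 2 still benefit from the teacher's play in iteration 1.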
Hi @awjuliani, after each iteration of training, a separate .nn file is created, which is imported into the learning brain after training is done. Can you elaborate on how the student agent continues to learn from data collected in the past when a separate .nn file is created with every iteration? Ideally, the student agent should learn from scratch in the current iteration.
Hi @rajatpaliwal
The .nn file is the final product of learning. It does not change once created.
Hi @awjuliani. I absolutely agree that the .nn file is the final product of learning. That is why I was confused about how the student agent, which is controlled by the learning brain, continues to learn from data collected in the past. I think what you mean by that statement is using RNNs to retain previous learning, or using the command-line option --load to load the parameters of a previously trained brain. Kindly correct me if I am wrong.
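(Editor's note: the distinction here is between the frozen exported model and the trainer's own checkpoints. Within a single run, the network weights simply persist in memory across iterations; across runs, `mlagents-learn ... --load` resumes from the saved checkpoint instead of starting from scratch. A toy illustration of that resume semantics, with a made-up one-line "trainer" and hypothetical numbers:)

```python
import json, os, tempfile

def train(weights, data):
    """One toy 'iteration': nudge each weight toward the data mean."""
    mean = sum(data) / len(data)
    return [w + 0.1 * (mean - w) for w in weights]

ckpt = os.path.join(tempfile.mkdtemp(), "model.json")

# Run 1: start from scratch, train, save a checkpoint.
weights = [0.0, 0.0]
weights = train(weights, [1.0, 2.0, 3.0])
with open(ckpt, "w") as f:
    json.dump(weights, f)

# Run 2 "with --load": resume from the checkpoint, not from scratch.
with open(ckpt) as f:
    resumed = json.load(f)
assert resumed == weights      # learning carried over between runs
resumed = train(resumed, [1.0, 2.0, 3.0])
```

The exported .nn file plays the role of a read-only snapshot for inference; it is the checkpoint, not the .nn file, that lets training continue from where it left off.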
This issue has been automatically marked as stale because it has not had activity in the last 14 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 28 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
In my custom-made environment I have mapped the agent's movement to keyboard keys. I want to use the broadcast feature to collect data generated by Player Brain game sessions and use this data to train an agent in a supervised context. Can someone provide the steps to perform the above-mentioned action?
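(Editor's note: once the broadcast feature has produced a log of (observation, action) pairs from Player Brain sessions, training "in a supervised context" means fitting any classifier that maps observations to the recorded actions. The sketch below uses a deliberately minimal nearest-neighbor imitator; it is illustrative only and is not the ML-Agents behavioral-cloning trainer, and the demo data is made up.)

```python
def nearest_neighbor_policy(demos):
    """Return a policy that imitates the closest recorded observation.

    `demos` is a list of (observation_vector, action) pairs, e.g. as
    gathered from Player Brain sessions. The returned policy answers
    any new observation with the action the player took in the most
    similar recorded state.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def policy(observation):
        _, action = min(demos, key=lambda d: dist(d[0], observation))
        return action

    return policy

# Hypothetical recorded sessions: two states, two keyboard actions.
demos = [([0.0, 0.0], "left"), ([1.0, 1.0], "right")]
policy = nearest_neighbor_policy(demos)
print(policy([0.9, 0.8]))  # right
```

Any supervised learner (logistic regression, a small neural network) can replace the nearest-neighbor lookup; the data format stays the same.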