Closed icaro56 closed 5 years ago
@icaro56 It is used in trainer_controller.py: 262, is_ready_update() function.
@xiaomaogy, buffer_size is only used inside is_ready_update in PPO.
Other hyperparameters that are not used by BC are beta, epsilon and num_epoch.
Please mark this issue as a bug.
I spent a long time changing values of hyperparameters that were not even used. Correcting this information in the documentation will help others.
@icaro56 Yes, you are correct; this is a documentation error on our side. In PPO, we use the buffer to collect a fixed number of experiences, train on them, then clear the buffer and collect new experiences, so the buffer size stays constant throughout. In behavioral cloning, we store all of the training data in the buffer, so the buffer keeps growing and there is no constant buffer size.
I will update the documentation accordingly. Thanks for your help in pointing this out.
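For readers following along, the difference described above can be sketched roughly as follows. This is only an illustration, not the ml-agents implementation: the class names `PPOStyleBuffer` and `BCStyleBuffer` are hypothetical, experiences are stood in for by plain integers, and the BC readiness rule (any stored data is enough) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PPOStyleBuffer:
    """Hypothetical buffer: collect buffer_size experiences, train, then clear."""
    buffer_size: int
    experiences: List[int] = field(default_factory=list)

    def add(self, experience: int) -> None:
        self.experiences.append(experience)

    def is_ready_update(self) -> bool:
        # buffer_size acts as a fixed threshold that triggers an update.
        return len(self.experiences) >= self.buffer_size

    def update(self) -> int:
        # Train on everything collected, then discard it and start over.
        trained_on = len(self.experiences)
        self.experiences.clear()
        return trained_on


@dataclass
class BCStyleBuffer:
    """Hypothetical buffer: keep every demonstration and never clear it."""
    experiences: List[int] = field(default_factory=list)

    def add(self, experience: int) -> None:
        self.experiences.append(experience)

    def is_ready_update(self) -> bool:
        # Assumption: no buffer_size threshold, any stored data is enough.
        return len(self.experiences) > 0

    def update(self) -> int:
        # Train on the stored data but keep it, so the buffer keeps growing.
        return len(self.experiences)


ppo = PPOStyleBuffer(buffer_size=4)
bc = BCStyleBuffer()
ppo_sizes, bc_sizes = [], []
for step in range(10):
    ppo.add(step)
    bc.add(step)
    if ppo.is_ready_update():
        ppo.update()
    if bc.is_ready_update():
        bc.update()
    ppo_sizes.append(len(ppo.experiences))
    bc_sizes.append(len(bc.experiences))

print(ppo_sizes)  # cycles below buffer_size
print(bc_sizes)   # grows without bound
```

In this sketch the PPO-style buffer never holds more than `buffer_size` items, while the BC-style buffer has no use for a `buffer_size` setting at all, which is consistent with the hyperparameter being ignored there.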
Other hyperparameters that are not used by BC are:
beta, epsilon, normalize, num_epoch
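Until the documentation is corrected, a small check like the one below could flag settings that would be silently ignored. It is a hedged sketch: the set of unused names is taken only from this thread, the helper `unused_hyperparameters` and the trainer label `"bc"` are hypothetical, and the example config values are purely illustrative.

```python
from typing import Dict, List

# Hyperparameters reported in this thread as unused by BC.
# Not an authoritative list for any particular release.
UNUSED_BY_BC = {"buffer_size", "beta", "epsilon", "normalize", "num_epoch"}


def unused_hyperparameters(trainer: str, config: Dict[str, object]) -> List[str]:
    """Return the config keys that the given trainer would ignore."""
    if trainer != "bc":  # hypothetical label for the behavioral cloning trainer
        return []
    return sorted(key for key in config if key in UNUSED_BY_BC)


# Illustrative values only.
config = {
    "batch_size": 64,
    "buffer_size": 2048,
    "beta": 5.0e-3,
    "epsilon": 0.2,
    "num_epoch": 3,
    "learning_rate": 3.0e-4,
}

ignored = unused_hyperparameters("bc", config)
for name in ignored:
    print(f"warning: '{name}' is set but not used by BC")
```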
Hi @icaro56, thanks for raising this issue! There is a new PR that fixes it, and the fix will be reflected in release-v0.5. https://github.com/Unity-Technologies/ml-agents/pull/1193
Thanks for reaching out to us. Hopefully you were able to resolve your issue. We are closing this due to inactivity, but if you need additional assistance, feel free to reopen the issue.
The documentation below states that buffer_size is used in both PPO and BC:
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-ML-Agents.md
I am studying the Behavioral Cloning Trainer and Model Python code, and I did not find buffer_size being used anywhere.