Closed: bcolloran closed this issue 4 years ago
Oh, and one more question: approximately how many training epochs did it take to begin to see good progress? Thanks!
Thank you for your interest. We used the default parameters as stated in the code.
I am a bit confused by your question about the seed. We used 16 random directions to search for the best policy in ARS, and no seed was given at the beginning of training.
We got really good results from 900 epochs, but let your training run up to 2000 epochs. I believe the biped starts to show signs of walking after approximately 500 epochs.
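For anyone following along, here is a minimal sketch of what "16 random directions" and "no seed" mean in an ARS update. It is not the code from this repository: `toy_reward`, `ars_step`, and `train` are hypothetical names, the toy objective stands in for a BipedalWalker episode return, and every hyperparameter other than the 16 directions (`n_top`, `step_size`, `noise`) is an assumption chosen for illustration. State normalization is left out.

```python
import random
import statistics


def toy_reward(theta):
    """Hypothetical stand-in for an episode return; peaks at (1, -2, 0.5)."""
    target = (1.0, -2.0, 0.5)
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def ars_step(theta, reward_fn, rng, n_directions=16, n_top=8,
             step_size=0.02, noise=0.1):
    """One ARS update: probe +/- random directions, keep the best, step along them."""
    dim = len(theta)
    rollouts = []
    for _ in range(n_directions):
        delta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        r_pos = reward_fn([t + noise * d for t, d in zip(theta, delta)])
        r_neg = reward_fn([t - noise * d for t, d in zip(theta, delta)])
        rollouts.append((r_pos, r_neg, delta))

    # Keep the n_top directions whose better side scored highest.
    rollouts.sort(key=lambda r: max(r[0], r[1]), reverse=True)
    best = rollouts[:n_top]

    # Scale the step by the standard deviation of the rewards actually used.
    used = [r for r_pos, r_neg, _ in best for r in (r_pos, r_neg)]
    sigma = statistics.pstdev(used) or 1.0  # guard against a zero deviation

    update = [0.0] * dim
    for r_pos, r_neg, delta in best:
        for i in range(dim):
            update[i] += (r_pos - r_neg) * delta[i]
    return [t + step_size / (n_top * sigma) * u for t, u in zip(theta, update)]


def train(epochs, seed=None):
    """Run ARS on the toy objective; seed=None means every run differs."""
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    history = []
    for _ in range(epochs):
        theta = ars_step(theta, toy_reward, rng)
        history.append(toy_reward(theta))
    return theta, history
```

Calling `train(300)` twice will generally give two different reward curves because no seed is fixed, while `train(300, seed=0)` repeats exactly. When comparing against someone else's unseeded run, it is probably more informative to look at several runs than to try to match one curve.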
Hi @vinits5, thank you for publishing your work on this! I've been trying to solve BipedalWalker-v2 using a number of techniques, and I'm having trouble reproducing your very nice results on the walker with ARS.
Looking through your code, my best guess is that you used the default parameters:
for your successful training run, but I want to make sure that is correct, and that you didn't supply a different set of parameters from the command line for your successful run.
Also, how many random seeds did you have to try before you achieved a successful training run? And how many episodes did you have to run? (I don't want to stop too early if it looks like I'm stuck in a local maximum when I just need to train longer.)
I think I probably just have a bug in my version of the code, so I thought I'd check with you to rule out these factors.
Thanks again for sharing your code and results! Very helpful to other folks like me who are trying to learn! :-)