originholic / a3c_vrep


Have you ever tried "Pendulum-V0"? #3

Open yanpanlau opened 8 years ago

yanpanlau commented 8 years ago

Thanks for the nice code. I am trying to reproduce the result on "Pendulum-v0" using a3c_cont.py, but the model seems to fail to converge. I have tried various methods, such as experience replay, but it is still not working. It would be nice if you could test it out so we can discuss it together. Cheers.

originholic commented 8 years ago

Hi @yanpanlau, thanks for trying out the code. Unfortunately, I haven't actually tested it with gym's Pendulum-v0, since this repo is very experimental and exists mainly for testing out my "batch" method.

If you are interested in getting continuous actions to work, it is better to start from other implementations such as miyosuda/async_deep_reinforce or coreylynch/async-rl, and just change the policy loss function and the models as described in DeepMind's async paper. Many thanks.
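
For reference, the kind of loss change mentioned above could look roughly like the sketch below. It is a minimal NumPy illustration of a Gaussian policy loss with an entropy bonus plus a value loss, not code from this repo or from either of the linked projects. The function name `gaussian_a3c_loss`, its arguments, the `1e-6` variance floor, and the default `beta` are all assumptions made for illustration. In a real framework the advantage would also be treated as a constant when differentiating the policy term.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)); used to keep the variance positive.
    return np.logaddexp(0.0, x)


def gaussian_a3c_loss(mu, sigma2_raw, value, action, ret, beta=1e-4):
    """Hypothetical helper: per-batch policy and value losses for a 1-D Gaussian policy.

    mu, sigma2_raw : policy head outputs (mean and pre-softplus variance)
    value          : critic estimate V(s)
    action         : action actually taken
    ret            : bootstrapped n-step return R
    beta           : entropy weight (assumed value, tune for your task)
    """
    sigma2 = softplus(sigma2_raw) + 1e-6  # small floor is an assumption
    # Log-density of the taken action under N(mu, sigma2).
    log_prob = -0.5 * (np.log(2.0 * np.pi * sigma2) + (action - mu) ** 2 / sigma2)
    # Differential entropy of N(mu, sigma2); encourages exploration.
    entropy = 0.5 * (np.log(2.0 * np.pi * sigma2) + 1.0)
    advantage = ret - value
    policy_loss = -(log_prob * advantage + beta * entropy)
    value_loss = 0.5 * (ret - value) ** 2
    return policy_loss.mean(), value_loss.mean()


# Example call with made-up numbers for a single transition.
p_loss, v_loss = gaussian_a3c_loss(
    mu=np.array([0.5]),
    sigma2_raw=np.array([0.0]),
    value=np.array([0.0]),
    action=np.array([0.5]),
    ret=np.array([1.0]),
)
```

Pendulum-v0 actions are bounded, so the sampled action would still need to be clipped to the environment's action range before stepping; that part is left out here.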