Hyeokreal / Actor-Critic-Continuous-Keras

Keras Implementation of the continuous control with actor-critic, a3c

a2c_continuous.py total reward stays < 0 #2

Open rafiqhasan opened 5 years ago

rafiqhasan commented 5 years ago

Hello,

I am not sure where I am going wrong. I just copy-pasted the a2c_continuous.py file, and even after 3000 episodes the 10-episode average reward has only gone from -133 to roughly -2. It doesn't even cross 0. Can you please let me know how you managed to get this to converge to +100 in the same number of episodes?

When I run it, after some time it keeps bouncing between -10 and -2.
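
To be clear about what I mean by the 10-episode average, here is a minimal sketch of how I am computing it. The function name `rolling_mean` and the sample rewards are my own, for illustration only, and are not taken from a2c_continuous.py; the script may track its average differently.

```python
from collections import deque


def rolling_mean(rewards, window=10):
    """Return, after each episode, the mean of the last `window` total rewards."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` values
    means = []
    for r in rewards:
        recent.append(r)
        # Early episodes are averaged over fewer than `window` values.
        means.append(sum(recent) / len(recent))
    return means


# Hypothetical per-episode total rewards, not real training output.
episode_rewards = [-133.0, -120.0, -90.0, -40.0, -10.0, -2.0]
print(rolling_mean(episode_rewards, window=3))
```

With a window of 10, this is the number that stays below 0 for me.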

I also tried a3c_continuous.py, and the same thing happens there as well.

Thanks