muupan / async-rl

Replicating "Asynchronous Methods for Deep Reinforcement Learning" (http://arxiv.org/abs/1602.01783)
MIT License
401 stars · 83 forks

t_max = 1000, loss normalization #19

Closed etienne87 closed 7 years ago

etienne87 commented 8 years ago

Hello,

I have a stability issue when increasing t_max (I am trying to learn the TORCS racing game, where t_max=5 might be too small). In a3c.py, it seems that total_loss is not normalized by the number of frames. Is this intentional? Is that the reason you need the GradientClipping optimizer hook?
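To illustrate the concern: without normalization, the accumulated n-step loss grows roughly linearly with t_max, so its gradients scale with the rollout length. A minimal sketch (not the repo's actual code; `nstep_loss`, its arguments, and the squared-error form are illustrative assumptions) of accumulating a loss over t_max frames and optionally dividing by the frame count:

```python
def nstep_loss(rewards, values, bootstrap_value, gamma=0.99, normalize=True):
    """Hypothetical sketch: sum squared n-step advantage terms over a rollout.

    If normalize is True, divide by the number of frames so the loss
    magnitude (and hence gradient scale) is independent of t_max.
    """
    R = bootstrap_value  # value estimate at the end of the rollout
    total_loss = 0.0
    # Walk the rollout backwards, building n-step returns incrementally.
    for r, v in zip(reversed(rewards), reversed(values)):
        R = r + gamma * R            # n-step discounted return
        total_loss += (R - v) ** 2   # critic-style squared error term
    if normalize:
        total_loss /= len(rewards)   # keeps scale constant as t_max grows
    return total_loss
```

With normalization, gradient clipping becomes less critical, since longer rollouts no longer inflate the loss; without it, clipping is one way to keep large accumulated gradients from destabilizing training.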

etienne87 commented 8 years ago

@muupan Never mind, I suppose t_max is not meant to be that big. Nonetheless, I wonder whether anyone has tried accumulating gradients in the shared model instead of zeroing the previous ones right away. Would it make sense to accumulate gradients from different threads before doing an update?
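The idea above can be sketched as a shared buffer that sums per-worker gradients and applies one averaged update once every worker has reported, instead of each thread updating the shared model immediately. This is a hypothetical illustration (the `GradientAccumulator` class and its methods are not from the repo):

```python
import numpy as np

class GradientAccumulator:
    """Hypothetical sketch: sum gradients from several workers, then
    release a single averaged gradient for one shared-model update."""

    def __init__(self, shape, n_workers):
        self.buffer = np.zeros(shape)  # running sum of worker gradients
        self.n_workers = n_workers
        self.count = 0                 # how many workers have reported

    def add(self, grad):
        # Each worker adds its local gradient instead of applying it directly.
        self.buffer += np.asarray(grad)
        self.count += 1

    def ready(self):
        return self.count >= self.n_workers

    def take_mean(self):
        # Averaged gradient for one update; reset for the next round.
        mean = self.buffer / self.count
        self.buffer[:] = 0.0
        self.count = 0
        return mean
```

Note the trade-off: waiting for several threads reduces gradient variance but reintroduces synchronization, giving up some of the asynchrony (and the implicit exploration from lock-free, immediately-applied updates) that A3C relies on.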