sweetice / Deep-reinforcement-learning-with-pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
MIT License
3.75k stars, 837 forks

About updating. #26

Open Michi-123 opened 3 years ago

Michi-123 commented 3 years ago

Thank you for publishing your A2C code. In the update block, you use the torch detach() method. It seems to me that this is the same as stopping the gradient with torch.no_grad() when calculating the advantage, as my code does. But my code doesn't learn at all. Is my idea wrong? Thanks.
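A minimal sketch of the two approaches being compared may help make the question concrete. It does not reproduce the repository's A2C code or the poster's code, neither of which is shown in this thread; `critic`, `states`, `returns`, and `log_probs` are hypothetical stand-ins. Under those assumptions, both variants give the same policy loss, but an advantage computed under torch.no_grad() carries no graph, so a value loss built from it cannot train the critic. Whether that is what happens in the poster's code is only a guess.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for a critic network and one batch of data.
critic = torch.nn.Linear(4, 1)
states = torch.randn(8, 4)
returns = torch.randn(8, 1)
log_probs = torch.randn(8, 1, requires_grad=True)  # stand-in for actor log-probs

# Variant A: detach() only where the advantage weights the policy loss.
values = critic(states)
advantage = returns - values                  # still attached to the critic
actor_loss_a = -(log_probs * advantage.detach()).mean()
critic_loss_a = advantage.pow(2).mean()       # gradient can reach the critic

# Variant B: compute the whole advantage under no_grad().
with torch.no_grad():
    advantage_ng = returns - critic(states)   # no graph is recorded here
actor_loss_b = -(log_probs * advantage_ng).mean()
critic_loss_b = advantage_ng.pow(2).mean()    # constant with respect to the critic

# The policy losses agree, but only variant A yields a trainable value loss.
same_actor_loss = torch.allclose(actor_loss_a, actor_loss_b)
critic_trainable_a = critic_loss_a.requires_grad
critic_trainable_b = critic_loss_b.requires_grad

critic_loss_a.backward()
critic_got_grad = critic.weight.grad is not None
```

In this sketch the two methods are interchangeable for the policy term only. If the value loss is meant to update the critic, it has to be computed from values produced outside the no_grad() block.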