MG2033 / A2C

A Clearer and Simpler Synchronous Advantage Actor Critic (A2C) Implementation in TensorFlow
Apache License 2.0
183 stars, 37 forks

About updating. #15

Open Michi-123 opened 3 years ago

Michi-123 commented 3 years ago

Thank you for publishing your A2C code. In the update block, you use PyTorch's `detach()` method. It seems to me that this is equivalent to computing the advantage under `torch.no_grad()`, as I do in my code. But my code doesn't learn at all. Is my idea wrong? Thanks.
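To make the question concrete, here is a minimal sketch comparing the two approaches. It is not taken from this repository or from the poster's code: the toy batch, the `actor` and `critic` linear layers, and the `actor_grads` helper are all hypothetical stand-ins. Under those assumptions, both variants should give the same advantage values and the same policy gradient. One difference that may matter in practice is that a value computed inside `torch.no_grad()` carries no graph, so it cannot also be reused to train the critic.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy batch: 4 states with 3 features each, 2 possible actions.
states = torch.randn(4, 3)
actions = torch.tensor([0, 1, 1, 0])
returns = torch.tensor([1.0, 0.5, -0.2, 0.8])

actor = torch.nn.Linear(3, 2)   # stand-in policy head (assumption)
critic = torch.nn.Linear(3, 1)  # stand-in value head (assumption)


def actor_grads(advantage):
    """Backpropagate the policy loss; return actor and critic weight grads."""
    actor.zero_grad()
    critic.zero_grad()
    log_probs = torch.log_softmax(actor(states), dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = -(chosen * advantage).mean()
    loss.backward()
    return actor.weight.grad.clone(), critic.weight.grad


# Variant 1: compute values with autograd enabled, then detach the advantage.
values = critic(states).squeeze(1)
adv_detach = (returns - values).detach()

# Variant 2: compute the advantage inside torch.no_grad().
with torch.no_grad():
    adv_no_grad = returns - critic(states).squeeze(1)

g1, c1 = actor_grads(adv_detach)
g2, c2 = actor_grads(adv_no_grad)

same_advantage = torch.allclose(adv_detach, adv_no_grad)
same_actor_grad = torch.allclose(g1, g2)
# The policy loss never reaches the critic in either variant.
critic_untouched = c1 is None and c2 is None

# `values` still has a graph, so a critic loss built from it can train the
# critic; anything computed under no_grad cannot.
critic_loss = (returns - values).pow(2).mean()
critic_loss.backward()
critic_gets_grad = critic.weight.grad is not None
```

If a script computes the critic's output only inside `torch.no_grad()` and then builds the value loss from that same tensor, the critic would receive no gradient at all, which is one possible (unconfirmed) reason such code might fail to learn.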