Closed xlyue92 closed 3 years ago
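In the train_net part of reinforce.py, why is the gradient computed only after stop_recording() is called at the end of each episode? Wouldn't that mean theta is never actually updated, even after all training has finished?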
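For reference, here is a minimal sketch of the per-episode update pattern the question is about. It is not the code from reinforce.py: the toy two-action environment, `run_episode`, and the hand-written softmax gradient are all assumptions made for illustration, and only the name `train_net` is borrowed from the question. It also assumes that stopping recording only stops tracing new operations and keeps what was already recorded. Under those assumptions, the returns are only known once the episode is over, so the gradient step is taken once per episode, and theta still changes every time `train_net` runs.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng, steps=10):
    """Roll out one episode and record (action, reward) pairs. No update here."""
    probs = softmax(theta)
    trajectory = []
    for _ in range(steps):
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if action == 1 else 0.0  # hypothetical env: only action 1 pays
        trajectory.append((action, reward))
    return trajectory


def train_net(theta, trajectory, lr=0.01, gamma=0.98):
    """One REINFORCE step computed from a finished episode; returns the new theta."""
    probs = softmax(theta)  # the toy policy has no state, so probs are the same at every step
    grad = [0.0, 0.0]
    G = 0.0
    # Walk backwards so G is the discounted return from each step onward.
    for action, reward in reversed(trajectory):
        G = reward + gamma * G
        for k in range(2):
            # d/d theta_k of log softmax(theta)[action]
            dlogp = (1.0 if k == action else 0.0) - probs[k]
            grad[k] += G * dlogp
    # Gradient ascent on the expected return.
    return [t + lr * g for t, g in zip(theta, grad)]


def train(episodes=200, seed=0):
    """Alternate between collecting one episode and applying one update."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(episodes):
        trajectory = run_episode(theta, rng)
        theta = train_net(theta, trajectory)  # theta is updated once per episode
    return theta
```

Calling `train_net([0.0, 0.0], [(1, 1.0)], lr=0.1)` on a one-step episode already moves theta away from zero, so the update is not lost just because it happens after the rollout has ended.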