Closed jingxixu closed 3 years ago
Why do you detach the state values when computing the advantages? Specifically, I am talking about this line:
advantages = rewards - state_values.detach()
Many thanks!
Please refer to #29.
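For anyone landing here, a rough sketch of what the detach changes in practice. This is not the repository's code: the tensors below are made-up stand-ins for the rollout rewards, the critic output, and the actor log-probabilities, and the plain policy-gradient loss is a simplification. With `.detach()`, the advantage acts as a constant weight, so the policy loss sends no gradient into the critic. Without it, the policy loss would also push on the value estimates.

```python
import torch

# Hypothetical stand-ins: in a real agent these come from the rollout and the networks.
rewards = torch.tensor([1.0, 0.5, -0.2])
state_values = torch.tensor([0.8, 0.1, 0.3], requires_grad=True)  # critic output
logprobs = torch.tensor([-0.7, -1.2, -0.4], requires_grad=True)   # actor output

# With detach: the advantage is a constant weight on the log-probabilities.
advantages = rewards - state_values.detach()
policy_loss = -(logprobs * advantages).mean()
policy_loss.backward()

grad_with_detach = state_values.grad   # stays None: nothing flowed into the critic
actor_grad = logprobs.grad.clone()     # the actor still receives its gradient

# Without detach: the same loss now also differentiates through the critic output.
state_values.grad = None
logprobs.grad = None
advantages_attached = rewards - state_values
(-(logprobs * advantages_attached).mean()).backward()
grad_without_detach = state_values.grad  # non-zero: policy loss moves the critic

print(grad_with_detach)
print(grad_without_detach)
```

In actor-critic setups the critic is usually trained by its own value loss (for example a squared error against the returns), so detaching here keeps the two objectives from interfering. Whether that matches the exact reasoning in #29 is an assumption on my part.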