mxllc closed this issue 1 year ago
Sorry, I found that the `discount_values` function linked below uses Generalized Advantage Estimation (GAE) to calculate the advantage: https://github.com/Denys88/rl_games/blob/990b4782ad0375652af76266a12753cb11d768c6/rl_games/common/a2c_common.py#L536-L537
https://github.com/Denys88/rl_games/blob/990b4782ad0375652af76266a12753cb11d768c6/rl_games/common/a2c_common.py#L721-L722
Why is the advantage calculated by `discount_values` at L722? Shouldn't the returns be what `discount_values` computes?
The image and link below are from an A2C implementation I saw in another repository. I'm a bit confused about this. Does anyone know what's going on?
https://github.com/Skylark0924/Machine-Learning-is-ALL-You-Need/blob/766a50ba07c21f6e9f6c8c48a819f6e075e97b78/RL_Actor_Critic/17Actor_Critic.py#L96
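To make the question concrete, here is a minimal sketch of how the two quantities are usually related. It is not the rl_games code. The function names, the `dones[t]` convention (1.0 when the episode ends at step t), and the toy numbers are all assumptions for illustration. The sketch computes GAE advantages directly, then recovers returns as advantages plus value estimates. With `lam=1.0` those returns match the plain discounted (bootstrapped) returns that a simpler A2C implementation would compute first.

```python
def gae_advantages(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """Hypothetical helper: GAE advantages for one rollout of length T.

    Assumed convention: dones[t] == 1.0 means the episode ended at step t,
    so nothing is bootstrapped across that step.
    """
    T = len(rewards)
    advantages = [0.0] * T
    last_gae = 0.0
    for t in reversed(range(T)):
        next_value = last_value if t == T - 1 else values[t + 1]
        not_done = 1.0 - dones[t]
        # One-step TD error
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors
        last_gae = delta + gamma * lam * not_done * last_gae
        advantages[t] = last_gae
    return advantages


def discounted_returns(rewards, last_value, dones, gamma=0.99):
    """Hypothetical helper: bootstrapped discounted returns, computed directly."""
    returns = [0.0] * len(rewards)
    running = last_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * (1.0 - dones[t]) * running
        returns[t] = running
    return returns


# Toy rollout (made-up numbers)
rewards = [1.0, 0.0, 2.0, 1.0]
values = [0.5, 0.4, 1.0, 0.8]
dones = [0.0, 1.0, 0.0, 0.0]
last_value = 0.3

advs = gae_advantages(rewards, values, last_value, dones, gamma=0.99, lam=1.0)
# Returns recovered from advantages: R_t = A_t + V(s_t)
returns_from_gae = [a + v for a, v in zip(advs, values)]
direct_returns = discounted_returns(rewards, last_value, dones, gamma=0.99)
```

Under these assumptions the two orders of computation differ only in which quantity is produced first: one approach computes returns and subtracts the values to get advantages, the other computes advantages and adds the values back to get returns. For `lam < 1.0` the recovered returns are the lambda-weighted targets rather than the plain discounted sums.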