TMmichi closed 2 years ago
Addressing Function Approximation Error in Actor-Critic Methods - TD3 (2018) https://arxiv.org/pdf/1802.09477.pdf
High-Dimensional Continuous Control Using Generalized Advantage Estimation - GAE (2016) https://arxiv.org/pdf/1506.02438.pdf
Policy invariance under reward transformations: Theory and application to reward shaping (1999) http://luthuli.cs.uiuc.edu/~daf/courses/games/AIpapers/ng99policy.pdf
Proximal Policy Optimization Algorithms - PPO (2017) https://arxiv.org/pdf/1707.06347.pdf