Problem description

Per Andrychowicz et al. (2021) and anecdotal evidence, value function clipping is not useful. Hence we should remove the following code.

https://github.com/vwxyzjn/cleanrl/blob/94a685de9290435623d7cf5e4e770418ddb10a4f/cleanrl/ppo.py#L283-L291
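
For readers who do not want to open the link, here is a minimal sketch of the two value-loss variants under discussion. It is written in plain Python rather than PyTorch so it runs without dependencies, and the function and argument names (`unclipped_value_loss`, `clipped_value_loss`, `clip_coef`, and so on) are illustrative assumptions, not the identifiers used in `ppo.py`. It assumes the usual PPO formulation: the new value prediction is clipped to within `clip_coef` of the old prediction, and the loss takes the element-wise maximum of the clipped and unclipped squared errors.

```python
def unclipped_value_loss(new_values, returns):
    """Plain value loss: 0.5 * mean squared error against the returns."""
    errors = [(v - r) ** 2 for v, r in zip(new_values, returns)]
    return 0.5 * sum(errors) / len(errors)


def clipped_value_loss(new_values, old_values, returns, clip_coef):
    """Clipped value loss (assumed PPO-style formulation, names hypothetical).

    Each new prediction is limited to [old - clip_coef, old + clip_coef],
    and the larger of the clipped and unclipped squared errors is used.
    """
    losses = []
    for v, v_old, r in zip(new_values, old_values, returns):
        # Limit how far the new prediction may move from the old one.
        delta = max(-clip_coef, min(clip_coef, v - v_old))
        v_clipped = v_old + delta
        # Pessimistic choice: keep the larger of the two squared errors.
        losses.append(max((v - r) ** 2, (v_clipped - r) ** 2))
    return 0.5 * sum(losses) / len(losses)
```

Because of the element-wise maximum, the clipped loss is never smaller than the unclipped one. Removing the clipping would amount to always using the second, simpler form.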

We should do this with great care: run benchmark experiments to confirm that the removal results in the same or better performance in the games we test. That is, we should re-run the following and confirm that the performance is OK.

https://github.com/vwxyzjn/cleanrl/blob/94a685de9290435623d7cf5e4e770418ddb10a4f/benchmark/ppo.sh#L1-L59

Remove the value function clipping #208