Closed LabChameleon closed 4 months ago
Hi @dierkes-j, absolutely feel free to open up a PR and add the coefficient!
Closing due to inactivity. FWIW, I ran a sweep over the value loss coefficient and didn't find any improvement for a PPO-trained quadruped joystick policy.
Hi,
Is there a reason why you chose not to implement a coefficient like `vf_coef` for weighting the value function loss in PPO? As far as I know, this is commonly available and can be a useful hyperparameter to tune. Would you be open to a PR adding this?
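To make the request concrete, here is a minimal sketch of what I mean. It is not this library's code: the function names, the `ent_coef` parameter, and the default values are assumptions for illustration only. The only point is that `vf_coef` scales the value-function term before it is added to the policy term.

```python
from statistics import fmean


def value_loss(values, returns):
    """Mean squared error between predicted values and return targets."""
    return fmean([(v - r) ** 2 for v, r in zip(values, returns)])


def ppo_total_loss(policy_loss, v_loss, entropy, vf_coef=0.5, ent_coef=0.01):
    """Combine the PPO loss terms (hypothetical helper, illustrative defaults).

    vf_coef scales only the value-function term; ent_coef scales the
    entropy bonus, which is subtracted because the loss is minimized.
    """
    return policy_loss + vf_coef * v_loss - ent_coef * entropy


if __name__ == "__main__":
    v_loss = value_loss([1.0, 2.0], [0.0, 4.0])
    # Same policy loss and entropy, two different weightings of the value term.
    print(ppo_total_loss(1.0, v_loss, 0.0, vf_coef=1.0, ent_coef=0.0))
    print(ppo_total_loss(1.0, v_loss, 0.0, vf_coef=0.5, ent_coef=0.0))
```

With `vf_coef=1.0` this reduces to the unweighted sum, so exposing the coefficient with a default of 1.0 would presumably leave existing behavior unchanged.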