xbpeng / awr

Implementation of advantage-weighted regression.

Why Normalization of vf #6

Open im-Kitsch opened 2 years ago

im-Kitsch commented 2 years ago

Hello,

Thanks for the code. While re-implementing the program, I found that there is a step that normalizes the value function vf here. It is implemented as $v_{\text{predict}} = V(s;\theta)\,(1-\gamma)$, and the critic update is $\min_\theta \big[V(s;\theta)\,(1-\gamma) - v_{\text{estimate}}\big]^2$.
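
For reference, a minimal sketch of that scaled regression (PyTorch-style, with hypothetical names and a toy value network; not the repo's actual code):

```python
import torch
import torch.nn as nn

gamma = 0.99  # discount factor (illustrative value)

# Hypothetical small value network; the actual architecture lives in the repo.
value_net = nn.Sequential(nn.Linear(17, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.SGD(value_net.parameters(), lr=1e-2)

def critic_loss(states, scaled_returns):
    # Prediction scaled by (1 - gamma), matching the normalization above:
    # v_predict = V(s; theta) * (1 - gamma).
    v_pred = value_net(states).squeeze(-1) * (1.0 - gamma)
    # Squared-error regression against the (likewise scaled) value targets.
    return ((v_pred - scaled_returns) ** 2).mean()

# One gradient step on a dummy batch (17-dim observations, as in HalfCheetah-v2).
states = torch.randn(32, 17)
scaled_returns = torch.rand(32)  # targets already multiplied by (1 - gamma)
loss = critic_loss(states, scaled_returns)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```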

Is there any reason to normalize the value function's output? I tested removing the normalization term and rescaling the learning rate (by $1-\gamma$), and it seems to cause no problems on HalfCheetah-v2; performance is similar to the original version.
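
To spell out why a step-size rescale can compensate (assuming the targets $v_{\text{estimate}}$ carry the same $(1-\gamma)$ factor, which is not stated explicitly above): the normalization only rescales the gradient of the squared loss,

$$
\nabla_\theta \big[(1-\gamma)V(s;\theta) - (1-\gamma)v_{\text{estimate}}\big]^2
= (1-\gamma)^2\,\nabla_\theta \big[V(s;\theta) - v_{\text{estimate}}\big]^2,
$$

so an SGD step of size $\alpha$ on the normalized loss corresponds to a step of $\alpha(1-\gamma)^2$ on the unnormalized one. With momentum or adaptive optimizers the effective factor differs, which may be why rescaling by $(1-\gamma)$ alone works in practice.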

Best,

xbpeng commented 2 years ago

The value scaling is mainly a convention; I generally like to keep things normalized between 0 and 1. Training should work just as well without the normalization, but it might need some tuning of the other hyperparameters, like the step size.
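
For context, the $[0, 1]$ range follows from the standard bound on discounted returns: if per-step rewards lie in $[0, 1]$, then

$$
V(s) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^t r_t\right]
\le \sum_{t=0}^{\infty} \gamma^t = \frac{1}{1-\gamma},
$$

so multiplying the value by $(1-\gamma)$ keeps it in $[0, 1]$ for any $\gamma \in [0, 1)$.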