Hello John,
After reading your paper on TRPO and looking at your code on GitHub, I am a little confused about the steps for predicting the value function. There, you concatenate the time step to the observation.
Why are you doing this? Is it mandatory?
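To make the question concrete, here is a minimal sketch of what I understand the step to be. It is my own illustration and not copied from your repository: the function name `add_time_feature` and the `time_scale` parameter are hypothetical, and I am assuming the time step is simply appended as one extra feature before the value function is fitted.

```python
from typing import List, Sequence


def add_time_feature(observations: Sequence[Sequence[float]],
                     time_scale: float = 1.0) -> List[List[float]]:
    """Append the (scaled) time-step index to each observation of one episode.

    Both the name and `time_scale` are hypothetical, used only for illustration.
    """
    # enumerate() yields t = 0, 1, 2, ... in episode order.
    return [list(obs) + [t * time_scale] for t, obs in enumerate(observations)]


# One episode of three 2-dimensional observations.
episode = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
features = add_time_feature(episode)
# Each row now has 3 entries: the original observation plus its time step.
```

If this matches what the code does, my question is whether the value function needs this extra input or whether the raw observation alone would work.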
Hoping to receive feedback from you.
Regards.