nrontsis / PILCO

Bayesian Reinforcement Learning in Tensorflow
MIT License

Is squash_sin() right? #62

Closed by fuku10 2 years ago

fuku10 commented 2 years ago

Hello. In squash_sin(), M seems to be E[u_max sin(pi_tilde)], not E[u_max (9/8 sin(pi_tilde) + 1/8 sin(3 pi_tilde))]. Is that intended?

fuku10 commented 2 years ago

Now I understand! 9/8 sin(x) + 1/8 sin(3x) is the squashing function used in "Gaussian Processes for Data-Efficient Learning in Robotics and Control," while sin(x) is the one used in "Efficient reinforcement learning using Gaussian processes." This code adopts sin(x). Thanks!
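For anyone comparing the two variants, here is a minimal sketch of the two expectations discussed above. It is not the repository's squash_sin(); the function names are hypothetical, the input is assumed to be a scalar x ~ N(m, s) with s the variance, and it relies only on the standard identity E[sin(a x)] = exp(-a^2 s / 2) sin(a m). A Monte Carlo helper is included as a sanity check.

```python
import math
import random


def mean_sin_squash(m, s, u_max=1.0):
    """E[u_max * sin(x)] for scalar x ~ N(m, s), where s is the variance."""
    return u_max * math.exp(-s / 2.0) * math.sin(m)


def mean_trig_sum_squash(m, s, u_max=1.0):
    """E[u_max * (9/8 sin(x) + 1/8 sin(3x))] for scalar x ~ N(m, s)."""
    first = 9.0 / 8.0 * math.exp(-s / 2.0) * math.sin(m)
    # sin(3x) has frequency a = 3, so its damping factor is exp(-9 s / 2).
    third = 1.0 / 8.0 * math.exp(-9.0 * s / 2.0) * math.sin(3.0 * m)
    return u_max * (first + third)


def monte_carlo_mean(f, m, s, n=200_000, seed=0):
    """Sample-based estimate of E[f(x)], used only to check the closed forms."""
    rng = random.Random(seed)
    sd = math.sqrt(s)
    return sum(f(rng.gauss(m, sd)) for _ in range(n)) / n


def trig_sum(x):
    """The 9/8 sin(x) + 1/8 sin(3x) squashing function with u_max = 1."""
    return 9.0 / 8.0 * math.sin(x) + 1.0 / 8.0 * math.sin(3.0 * x)


if __name__ == "__main__":
    m, s = 0.5, 0.2  # arbitrary illustrative mean and variance
    print(mean_sin_squash(m, s), monte_carlo_mean(math.sin, m, s))
    print(mean_trig_sum_squash(m, s), monte_carlo_mean(trig_sum, m, s))
```

The two means agree when s is 0 and m is pi/2 (both give u_max), but they differ in general, so the mean M reveals which squashing function an implementation uses.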