FluxML / Gym.jl

Gym environments in Julia
MIT License

PendulumEnv does not use clamped torque #25

Open darsnack opened 5 years ago

darsnack commented 5 years ago

PendulumEnv calculates a clamped torque but then uses the unclamped torque in the subsequent calculations. That is, we calculate:

v = clamp.(u, -env.max_torque, env.max_torque)

but none of the following lines use v; they all use u directly.
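For illustration, here is a minimal sketch of what using the clamped value could look like. It is not the Gym.jl source: `ToyPendulum`, its field names, and `step_velocity` are hypothetical, and the update rule is assumed to follow the classic Gym pendulum dynamics.

```julia
# Hypothetical stand-in for the environment; the struct and field names are assumptions.
struct ToyPendulum
    max_torque::Float64
    g::Float64
    m::Float64
    l::Float64
    dt::Float64
end

# One angular-velocity update that uses the clamped torque `v`, not the raw action `u`.
# The formula is assumed from the classic Gym pendulum, not copied from Gym.jl.
function step_velocity(env::ToyPendulum, θ, θdot, u)
    v = clamp(u, -env.max_torque, env.max_torque)
    return θdot + (-3 * env.g / (2 * env.l) * sin(θ + π) + 3 / (env.m * env.l^2) * v) * env.dt
end

env = ToyPendulum(2.0, 10.0, 1.0, 1.0, 0.05)

# With clamping applied, an action far outside the limit acts like the limit itself.
step_velocity(env, 0.0, 0.0, 10.0) == step_velocity(env, 0.0, 0.0, 2.0)
```

If `u` were used in place of `v` in the return expression, the two calls above would give different velocities, which is the behavior this issue describes.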

tejank10 commented 5 years ago

Thanks for pointing this out. I tried changing it and running the examples from model-zoo, and the model has difficulty learning. The gradients vanish because of the clamp, which may be why v was never used. I noticed that without v, the model still learns to output values within the allowed torque range. I'm experimenting with workarounds to get it working with v.
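The vanishing gradient mentioned above can be seen without any AD library. The sketch below estimates the slope of a clamped action by finite differences; `max_torque = 2.0`, `clamped`, and `finite_diff` are assumptions made up for the example, not names from Gym.jl or model-zoo.

```julia
# Central finite-difference estimate of df/dx (no AD library needed).
finite_diff(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

max_torque = 2.0  # assumed torque limit, for illustration only
clamped(u) = clamp(u, -max_torque, max_torque)

inside  = finite_diff(clamped, 0.5)  # action within the limits: slope close to 1
outside = finite_diff(clamped, 3.0)  # action beyond the limits: slope is 0
```

Once the policy's output leaves the allowed range, the clamped torque no longer changes with the action, so no gradient signal flows back through it. This is consistent with the learning difficulty reported here, though it does not by itself confirm the cause.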