Open darsnack opened 5 years ago
Thanks for pointing this out. I tried changing it and running the examples from model-zoo, and the model then has difficulty learning. The gradients vanish due to the use of clamp; maybe that's the reason `v` was never used. I noticed that without using `v`, the model still learns to output values within the given torque range. I'm experimenting with workarounds to get it working with `v`.
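To illustrate the vanishing-gradient point, here is a minimal sketch in plain Python (not the package's Julia code). The torque limit of 2.0 and the helper names `clamp_torque`, `squash_torque`, and `numeric_grad` are assumptions made for illustration. The tanh squashing is just one commonly used smooth alternative, not necessarily the workaround being tried here.

```python
import math

MAX_TORQUE = 2.0  # assumed torque limit, for illustration only


def clamp_torque(u):
    """Hard clamp: flat (zero slope) outside [-MAX_TORQUE, MAX_TORQUE]."""
    return max(-MAX_TORQUE, min(MAX_TORQUE, u))


def squash_torque(u):
    """Smooth alternative (hypothetical): tanh scaled to the torque range."""
    return MAX_TORQUE * math.tanh(u)


def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Inside the range the clamp passes gradient through (slope close to 1).
inside = numeric_grad(clamp_torque, 0.5)

# Outside the range the clamp is constant, so the gradient is exactly zero
# and nothing flows back to whatever produced u.
outside = numeric_grad(clamp_torque, 5.0)

# The tanh version saturates too, but its slope stays strictly positive.
smooth_outside = numeric_grad(squash_torque, 5.0)
```

Once the policy output drifts outside the torque range, a hard clamp gives it no signal to come back, which would be consistent with the learning difficulty described above.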
`PendulumEnv` is calculating clamped torque, but using unclamped torque in subsequent calculations, i.e. we calculate the clamped torque `v`, but we don't use `v` in any of the following lines, and we use `u` directly.
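A rough sketch of what this means for the dynamics, written in plain Python rather than the package's Julia code. The update rule follows the classic pendulum swing-up task, and every constant (`MAX_TORQUE`, `MAX_SPEED`, `G`, `M`, `L`, `DT`) as well as the `use_clamped` switch is an assumption for illustration, not taken from `PendulumEnv` itself.

```python
import math

# Assumed constants, modeled on the classic pendulum swing-up task.
MAX_TORQUE = 2.0
MAX_SPEED = 8.0
G, M, L, DT = 10.0, 1.0, 1.0, 0.05


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def step(theta, theta_dot, u, use_clamped=True):
    """One Euler step of the pendulum; returns (new_theta, new_theta_dot)."""
    v = clamp(u, -MAX_TORQUE, MAX_TORQUE)  # clamped torque
    # The reported behavior corresponds to use_clamped=False:
    # v is computed but u is what enters the dynamics.
    torque = v if use_clamped else u
    theta_ddot = (-3 * G / (2 * L) * math.sin(theta + math.pi)
                  + 3.0 / (M * L ** 2) * torque)
    new_theta_dot = clamp(theta_dot + theta_ddot * DT, -MAX_SPEED, MAX_SPEED)
    new_theta = theta + new_theta_dot * DT
    return new_theta, new_theta_dot


# An out-of-range action: with the clamp applied the torque is limited to 2.0,
# without it the full 10.0 drives the pendulum.
clamped = step(0.0, 0.0, 10.0, use_clamped=True)
unclamped = step(0.0, 0.0, 10.0, use_clamped=False)
```

For actions already inside the torque range the two variants agree, so the difference only shows up when the policy outputs a torque beyond the limit.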