Roboy
/
gym-roboy
Pip package gym environment for deep roboy control
Exp reward #25
Closed: tomasruizt closed this 5 years ago
tomasruizt commented 5 years ago
The reward is now exponentiated, which keeps the reward gradient high when the agent is close to the goal.
There is also a new test that catches cases where the agent would reach an optimum by not moving at all. The test works much like a numerical gradient check.
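
For readers unfamiliar with the idea, here is a minimal sketch of what an exponentiated reward and a "standing still is not optimal" check could look like. It is not the code from this PR: the reward form `exp(-scale * distance)`, the function names `exp_reward`, `moving_beats_standing_still` and `reward_slope`, and the `scale` and `step` parameters are all assumptions made for illustration.

```python
import math


def exp_reward(position, goal, scale=1.0):
    """Hypothetical exponentiated reward: exp(-scale * distance to goal)."""
    distance = math.dist(position, goal)
    return math.exp(-scale * distance)


def moving_beats_standing_still(position, goal, step=1e-3):
    """Finite-difference check: a small step toward the goal must raise the reward."""
    distance = math.dist(position, goal)
    if distance == 0.0:
        return False  # already at the goal, so no step can improve the reward
    step = min(step, distance)  # avoid overshooting the goal
    nudged = [p + step * (g - p) / distance for p, g in zip(position, goal)]
    return exp_reward(nudged, goal) > exp_reward(position, goal)


def reward_slope(distance, eps=1e-6):
    """Numerical slope of the reward when moving eps closer to a goal at the origin."""
    near = exp_reward([distance - eps], [0.0])
    here = exp_reward([distance], [0.0])
    return (near - here) / eps
```

Under these assumptions, `reward_slope` is larger near the goal than far from it, which is the property the description attributes to exponentiation. A test in the spirit of the one described would assert `moving_beats_standing_still(position, goal)` for a range of start positions, in the same way a numerical gradient check perturbs an input and compares the outputs.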