kaymen99 / Robot-arm-control-with-RL

Robot arm control using reinforcement learning algorithms: DDPG and TD3 with hindsight experience replay (HER)
MIT License
35 stars, 0 forks

Rewards always zero #1

Open nuomizai opened 8 months ago

nuomizai commented 8 months ago

Hi @kaymen99, thanks for your code. Is there a bug in the her_augmentation function in HER.py? The re-computed reward is always zero, because the same array is passed as both the achieved goal and the desired goal:

reward = agent.env.compute_reward(future_achgoal, future_achgoal, 1.0)

Also, why is the future observation used as the augmented observation? Shouldn't we keep the observation from the current timestep, i.e., obs, _, _ = obs_array[index].values(), and replace only the desired goal?
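
To illustrate both points, here is a minimal sketch of the relabeling I would expect, not the repository's actual code. It assumes a sparse reward (0 when the achieved goal is within a threshold of the desired goal, -1 otherwise). The compute_reward function below is a hypothetical stand-in for agent.env.compute_reward, and the threshold value, the episode layout, and the relabel_future helper are all made up for illustration.

```python
import numpy as np

DISTANCE_THRESHOLD = 0.05  # assumed value, for illustration only


def compute_reward(achieved_goal, desired_goal, info=None):
    """Hypothetical stand-in for env.compute_reward (sparse reward assumed):
    0 if the achieved goal is within the threshold of the desired goal, else -1."""
    diff = np.asarray(achieved_goal) - np.asarray(desired_goal)
    distance = np.linalg.norm(diff, axis=-1)
    return -(distance > DISTANCE_THRESHOLD).astype(np.float32)


def relabel_future(episode, index, rng):
    """Relabel transition `index` with a goal achieved later in the same episode.

    `episode` is a list of dicts with the (hypothetical) keys
    obs, action, next_obs, next_achieved_goal.
    """
    # Pick a step at or after `index`; its achieved goal becomes the new desired goal.
    future_index = rng.integers(index, len(episode))
    new_goal = episode[future_index]["next_achieved_goal"]

    # Keep the observation and action of the CURRENT step unchanged.
    step = episode[index]

    # Reward compares what THIS step achieved against the substituted goal.
    reward = compute_reward(step["next_achieved_goal"], new_goal)
    return step["obs"], step["action"], reward, step["next_obs"], new_goal


# Toy episode: the gripper moves 0.1 along x at every step.
rng = np.random.default_rng(0)
positions = [np.array([0.1 * t, 0.0, 0.0]) for t in range(6)]
episode = [
    {
        "obs": positions[t],
        "action": np.zeros(3),
        "next_obs": positions[t + 1],
        "next_achieved_goal": positions[t + 1],
    }
    for t in range(5)
]

# Passing the same array twice gives distance 0, so the reward can never be -1.
same_goal_reward = compute_reward(positions[3], positions[3])

# Comparing the current step's achieved goal with a different future goal can give -1.
different_goal_reward = compute_reward(positions[2], positions[4])
```

Under these assumptions the relabeled reward is 0 only when the current step actually reaches the substituted goal, so the buffer receives a mix of 0 and -1 rewards instead of all zeros.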