Hi @kaymen99, thanks for sharing your code. Is there a bug in the `her_augmentation` function in `HER.py`? The recomputed reward there is always zero.
Also, why is the future observation used as the augmented observation? Shouldn't we keep the observation from the current timestep, i.e., `obs, _, _ = obs_array[index].values()`, as the augmented observation?
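
To make the question concrete, here is a minimal sketch of the relabeling I would have expected. It assumes the "future" goal-sampling strategy and a sparse reward of 0 on success and -1 otherwise. All names (`sparse_reward`, `her_future_relabel`, the dict keys) are hypothetical and not taken from `HER.py`.

```python
import random


def sparse_reward(achieved_goal, goal):
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved_goal == goal else -1.0


def her_future_relabel(episode, k=4, rng=None):
    """Relabel an episode with goals sampled from its own future.

    `episode` is a list of dicts with the (hypothetical) keys
    obs, action, next_obs, achieved_goal, where achieved_goal is the
    goal reached after taking the action at that step.
    """
    rng = rng or random.Random()
    augmented = []
    last = len(episode) - 1
    for t, step in enumerate(episode):
        for _ in range(k):
            # Only the goal comes from a future step of the same episode.
            future = rng.randint(t, last)
            new_goal = episode[future]["achieved_goal"]
            # The reward is recomputed for the current transition against
            # the new goal, so it is 0 only if this step already reaches it.
            reward = sparse_reward(step["achieved_goal"], new_goal)
            augmented.append({
                "obs": step["obs"],            # observation of the current timestep
                "action": step["action"],
                "next_obs": step["next_obs"],
                "goal": new_goal,
                "reward": reward,
            })
    return augmented
```

In this sketch the observation stays at the current timestep and only the goal is replaced, so the recomputed reward is 0 only for transitions that actually reach the sampled goal and -1 for the rest.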