Clipping target q values

TianhongDai / hindsight-experience-replay

This is the PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.

MIT License


Clipping target q values #2

Closed Medabid1 closed 5 years ago

Medabid1 commented 5 years ago

Hello, I would like to ask a question about clipping the target Q-values to only negative numbers in: https://github.com/TianhongDai/hindsight-experience-replay/blob/master/ddpg_agent.py#L216

Is it because the reward is always less than zero, so the values should also always be less than zero?

Thanks in advance!

TianhongDai commented 5 years ago

@Medabid1 Hi - you're right. The reward in the Fetch environments is only ever -1 (failure) or 0 (success), so the return can never be positive.
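
To make the reasoning concrete, here is a minimal, framework-free sketch of clamping a one-step TD target. It is not the code at the linked line: the function name `clipped_td_target`, the discount `gamma=0.98`, and the lower bound `-1 / (1 - gamma)` are assumptions for illustration. The lower bound follows from the geometric series: if every step yields the worst reward of -1, the discounted return is -(1 + gamma + gamma^2 + ...) = -1 / (1 - gamma).

```python
def clipped_td_target(rewards, next_q_values, gamma=0.98):
    """Compute r + gamma * Q' per sample and clamp it to [-1 / (1 - gamma), 0].

    Assumes every reward is either -1 (failure) or 0 (success), so the true
    discounted return can never leave that interval.
    """
    # Worst case: reward -1 at every step forever (geometric series).
    lower_bound = -1.0 / (1.0 - gamma)
    targets = []
    for reward, next_q in zip(rewards, next_q_values):
        target = reward + gamma * next_q
        # Clamp: never above 0, never below the worst-case return.
        targets.append(min(max(target, lower_bound), 0.0))
    return targets


# Hypothetical batch: the second next-state estimate is (wrongly) positive,
# and the third is far below anything the environment could produce.
rewards = [-1.0, 0.0, -1.0]
next_q_values = [-3.0, 0.5, -100.0]
targets = clipped_td_target(rewards, next_q_values)
```

In this batch the first target is left alone, the second is pulled down to 0, and the third is raised to the lower bound. Clamping like this keeps an overestimating critic from bootstrapping on values that no trajectory could actually achieve.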