How does 'future_p' correspond to 'replay_k'?

TianhongDai / hindsight-experience-replay

This is the PyTorch implementation of Hindsight Experience Replay (HER) - Experiment on all Fetch robotic environments.

MIT License


How does 'future_p' correspond to 'replay_k'? #6

Open HeegerGao opened 4 years ago

HeegerGao commented 4 years ago

Hello, I am a little confused about this equation: self.future_p = 1 - (1. / (1 + replay_k)). I think replay_k means that we want to select k transitions in one episode (50 transitions) for computing HER goals, but how does future_p correspond to this? Can you give some interpretation? Thank you!
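
One possible reading of the formula, offered as a sketch rather than a statement about this repository's code: algebraically, 1 - 1/(1 + replay_k) equals replay_k / (1 + replay_k), so future_p can be read as the probability that a sampled transition has its goal relabeled, which gives roughly replay_k relabeled transitions for every one left unchanged. Under that reading, replay_k is a ratio rather than a count of k transitions per episode. The snippet below checks the arithmetic with a simulated batch. The helper names and the per-transition uniform draw are assumptions made for illustration and are not taken from the repository.

```python
import random


def future_p_from_replay_k(replay_k):
    # Same expression as in the question: 1 - 1/(1 + k) == k / (1 + k).
    return 1 - (1.0 / (1 + replay_k))


def sample_relabel_mask(batch_size, replay_k, rng):
    # Hypothetical helper (an assumption, not the repository's code):
    # mark each sampled transition for goal relabeling with
    # probability future_p, independently of the others.
    future_p = future_p_from_replay_k(replay_k)
    return [rng.random() < future_p for _ in range(batch_size)]


rng = random.Random(0)
mask = sample_relabel_mask(100_000, replay_k=4, rng=rng)

relabeled = sum(mask)
original = len(mask) - relabeled

# With replay_k = 4, future_p is 0.8, so about 80% of the batch is
# relabeled and the relabeled-to-original ratio is close to 4.
her_fraction = relabeled / len(mask)
ratio = relabeled / original
print(future_p_from_replay_k(4), her_fraction, ratio)
```

With replay_k = 4 the relabeled fraction comes out close to 0.8 and the ratio close to 4, which matches the "k relabeled goals per original goal" interpretation.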