TianhongDai / esil-hindsight

This is the official code of our paper "Episodic Self-Imitation Learning with Hindsight" [Electronics 2020].
MIT License

trajectory selection module #2

Closed quyouyuan closed 3 years ago

quyouyuan commented 3 years ago

Hello! I'm very interested in your research, and I think it will be very helpful for me. I saw the trajectory selection module in the paper "Episodic Self-Imitation Learning with Hindsight", but I couldn't find it in the code. Could you please point me to where it is implemented? Sorry for taking up your time; I may have missed it while reading the code. Thanks!

TianhongDai commented 3 years ago

@quyouyuan Hi, the trajectory selection module from Eq. (10) is implemented at: https://github.com/TianhongDai/esil-hindsight/blob/main/rl_base/ppo_agent.py#L168; the corresponding returns are computed at: https://github.com/TianhongDai/esil-hindsight/blob/main/rl_base/ppo_agent.py#L119-L134.
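
For readers following the thread, here is a rough sketch of what a return-based trajectory selection step can look like. It is not copied from the repository. It assumes that selection keeps the timesteps whose return under the hindsight goal is strictly greater than the return under the original goal, so please check Eq. (10) in the paper and the linked lines for the exact rule. The names `discounted_returns` and `select_hindsight_steps`, the NumPy implementation, and the default `gamma` are all hypothetical.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return-to-go for every timestep of a single episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def select_hindsight_steps(original_rewards, hindsight_rewards, gamma=0.99):
    """Hypothetical selection rule (an assumption, not the repository's code).

    Keeps a timestep only when the return under the relabelled (hindsight)
    goal is strictly greater than the return under the original goal.
    """
    r_orig = discounted_returns(original_rewards, gamma)
    r_hind = discounted_returns(hindsight_rewards, gamma)
    mask = r_hind > r_orig  # boolean array, one entry per timestep
    return mask, r_hind


# Sparse-reward example: -1 per step, 0 once the (hindsight) goal is reached.
mask, r_hind = select_hindsight_steps([-1, -1, -1], [-1, -1, 0], gamma=1.0)
```

In a self-imitation loss, a mask like this would typically multiply the per-timestep log-probability term so that unselected steps contribute nothing; how the mask is actually applied in this project is shown at the lines linked above.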

quyouyuan commented 3 years ago

Thank you for your reply.