ShangtongZhang / DeepRL

Modularized Implementation of Deep RL Algorithms in PyTorch
MIT License

Adding elements to the Transition namedtuple #99

Closed (Louis-Bagot closed this issue 4 years ago)

Louis-Bagot commented 4 years ago

Hello, and thank you very much for your earlier help. I managed to get the Async DQN working using your Docker setup. I am now moving to the new API, but the new Replay abstraction makes one thing harder.

I am trying to implement some Intrinsic Motivation (IM) baselines. To do this, I need to add a new component, say reward_i, to the Transition namedtuple. This is not specific to IM: in general, I would like to easily control what a Transition contains, depending on the setting. In the previous API, I could simply add the reward_i variable to the experiences list before passing that list to the replay memory.

Is there an easy and efficient way to control the contents of a Transition now? If possible, I would like to avoid rewriting Uniform/PrioritizedReplay every time I need to add a component to the Transition. I see that I can add a key to the Storage object, but it does not seem to appear in the Transition produced in the sample method.

Thanks for any help! And thanks again for your help before, your repo has been a huge help in my research.

ShangtongZhang commented 4 years ago

Thanks for your kind words! I don't know what the best solution is, but if I were doing this, I would redefine Transition.
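To make the "redefine Transition" suggestion concrete, here is a minimal sketch using only the standard library. The base field names and the make_transition helper are hypothetical, chosen for illustration; they are not taken from the DeepRL code, and the Replay classes would still have to be taught about any new field.

```python
from collections import namedtuple

# Hypothetical base fields; the actual Transition in DeepRL may differ.
BASE_FIELDS = ['state', 'action', 'reward', 'next_state', 'mask']


def make_transition(extra_fields=()):
    """Build a Transition namedtuple with optional extra fields appended."""
    fields = BASE_FIELDS + list(extra_fields)
    # Every field defaults to None, so callers may fill in only what they have.
    return namedtuple('Transition', fields, defaults=(None,) * len(fields))


# A Transition that also carries an intrinsic reward.
Transition = make_transition(['reward_i'])
t = Transition(state=0, action=1, reward=0.5, next_state=1, mask=1, reward_i=0.1)
```

The factory keeps the per-setting field list in one place, so switching settings means calling make_transition with a different list instead of editing the tuple definition by hand.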

Louis-Bagot commented 4 years ago

Thank you! But redefining Transition alone will not be enough, since the Replay classes also need to be aware of the change, right? Or is there some way I could rely only on the TransitionCLS variable? (It isn't used in the Replay code, though, only in the ReplayWrapper.)

I was thinking: could it work to add, say, an info entry to the Transition, holding a dictionary in which we can store whatever additional data we want?

Thanks again for your help. I think I can already close the issue.
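A minimal sketch of the info-dictionary idea, assuming a toy uniform replay buffer. ToyReplay, its feed and sample methods, and the field names are all hypothetical and written for illustration; this is not the DeepRL Replay API. The sketch also assumes every stored info dict has the same keys.

```python
import random
from collections import namedtuple

# Hypothetical field list with a trailing `info` dict for extra per-step data.
Transition = namedtuple(
    'Transition', ['state', 'action', 'reward', 'next_state', 'mask', 'info'])


class ToyReplay:
    """Toy uniform replay buffer (illustration only, not DeepRL's class)."""

    def __init__(self, memory_size):
        self.memory_size = memory_size
        self.data = []
        self.pos = 0

    def feed(self, transition):
        # Append until full, then overwrite the oldest entry.
        if len(self.data) < self.memory_size:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.pos = (self.pos + 1) % self.memory_size

    def sample(self, batch_size):
        batch = random.sample(self.data, batch_size)
        # Transpose the list of transitions into one tuple per field.
        *core, infos = zip(*batch)
        # Collate the info dicts key by key; assumes identical keys everywhere.
        info = {key: [d[key] for d in infos] for key in infos[0]}
        return Transition(*core, info=info)


replay = ToyReplay(memory_size=4)
for step in range(6):
    replay.feed(Transition(step, 0, 1.0, step + 1, 1, {'reward_i': 0.1 * step}))
batch = replay.sample(3)
```

With this layout the buffer never needs to know which extra quantities exist: adding another one means adding a key to the info dict at feed time, and it shows up in batch.info after sampling.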

ShangtongZhang commented 4 years ago

That's a wonderful solution! I'll include it for the next revision. Thanks a lot!