Multistep returns

The current way multistep returns are handled doesn't make sense: the model only has access to the agent's action at the first step, yet it is asked to predict the return multiple steps into the future. Try incorporating all actions in the window and unrolling over them somehow to improve this.
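As a rough illustration of the idea (not the repo's actual code), the sketch below pairs each n-step return target with the full window of actions taken over those n steps, rather than only the first action. The function names `n_step_return` and `build_examples`, the trajectory layout, and the use of a bootstrap value estimate are all assumptions made for this example.

```python
from typing import Any, List, Sequence, Tuple


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Discounted sum of the rewards in the window, plus the discounted
    bootstrap value of the state reached at the end of the window."""
    g = bootstrap_value
    # Fold from the last reward backwards: g = r + gamma * g
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def build_examples(
    states: Sequence[Any],
    actions: Sequence[Any],
    rewards: Sequence[float],
    values: Sequence[float],
    n: int,
    gamma: float,
) -> List[Tuple[Any, List[Any], float]]:
    """Build (state, action_window, target) triples (hypothetical layout).

    Assumes len(states) == len(values) == len(actions) + 1 and
    len(rewards) == len(actions), where values[t] estimates V(states[t]).
    """
    examples = []
    for t in range(len(actions) - n + 1):
        # All n actions in the window, not just actions[t]
        window = list(actions[t:t + n])
        target = n_step_return(rewards[t:t + n], values[t + n], gamma)
        examples.append((states[t], window, target))
    return examples
```

A model trained on these triples could consume the action window directly, or be unrolled one action at a time over it; which of the two works better here is an open question.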

thebes2 / RL

Multistep returns #18