thebes2 / RL


Multistep returns #18

Open thebes2 opened 2 years ago

thebes2 commented 2 years ago

The current way multistep returns are handled doesn't make sense: the model only has access to the agent's action at the first step of the window, yet it is asked to predict the return multiple steps into the future. Try incorporating all actions in the window, and unrolling over them somehow, to improve this.
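A minimal sketch of what "incorporating all actions in the window" could look like on the data side. This is not the repo's code. The names `n_step_return` and `make_windows` are hypothetical, the trajectory is assumed to be plain lists of actions and rewards, and the bootstrapped value term at the end of the window is left out for brevity. Each sample pairs a start index with every action taken in its n-step window, so the return target is conditioned on the full action sequence rather than only the first action.

```python
from typing import List, Sequence, Tuple


def n_step_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of the rewards in one window: sum_k gamma^k * r_{t+k}.

    Assumption: no bootstrap value is added after the last reward.
    """
    total = 0.0
    for k, r in enumerate(rewards):
        total += (gamma ** k) * r
    return total


def make_windows(
    actions: Sequence[int],
    rewards: Sequence[float],
    n: int,
    gamma: float,
) -> List[Tuple[int, Tuple[int, ...], float]]:
    """Pair each start index t with all n actions in its window and the n-step return.

    Hypothetical helper: the real replay buffer layout may differ.
    """
    samples = []
    # Only keep windows that fit entirely inside the trajectory.
    for t in range(len(rewards) - n + 1):
        window_actions = tuple(actions[t:t + n])  # every action, not just actions[t]
        target = n_step_return(rewards[t:t + n], gamma)
        samples.append((t, window_actions, target))
    return samples


# Toy trajectory with 4 steps and a 3-step window.
samples = make_windows(
    actions=[0, 1, 1, 0],
    rewards=[1.0, 0.0, 2.0, 1.0],
    n=3,
    gamma=0.5,
)
for t, window_actions, target in samples:
    print(t, window_actions, target)
```

With targets built this way, the model could be unrolled one step per action in `window_actions` (for example, feeding each action into a recurrent or latent-dynamics step) before predicting the return. How that unrolling should work is the open question here.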