Open Richardxxxxxxx opened 6 years ago
Yeah, it would affect the next iteration, but it won't do any harm in most cases. In many RL environments the concept of an episode/game is abstracted away from the agent, and all it sees is a continuous stream of millions of frames; the few transitions that straddle an episode boundary are a negligible fraction of that stream.
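If you did want to clear the stack at an episode boundary anyway, here is a rough sketch of what that could look like. It assumes a frame-stack `History` buffer with `add()`/`get()` methods like the one referenced in this thread; the `reset()` helper and the usage snippet are illustrative, not the repo's actual code.

```python
import numpy as np

class History:
    """Sliding stack of the last `length` screens fed to the Q-network."""

    def __init__(self, length, screen_height, screen_width):
        self.buffer = np.zeros((length, screen_height, screen_width), dtype=np.float32)

    def add(self, screen):
        # Shift the oldest frame out and append the newest screen.
        self.buffer[:-1] = self.buffer[1:]
        self.buffer[-1] = screen

    def reset(self, screen=None):
        # Drop frames from the finished episode; optionally pre-fill with the
        # first screen of the new one so the stack never mixes two games.
        self.buffer[:] = 0.0
        if screen is not None:
            self.buffer[:] = screen

    def get(self):
        return self.buffer

# Hypothetical use at an episode boundary:
# if terminal:
#     screen = env.new_game()   # first observation of the new episode
#     history.reset(screen)     # avoid predicting from the old game's frames
```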
In dqn/agent.py, line 59, when a new game starts after a terminal state, why don't we need to reset self.history?
If it isn't reset, it would affect the next iteration: the action predicted from self.history.get() does not depend on the current game's screens, but instead on screens from the previous game, which has already ended.
Am I missing anything?
Thank you very much.