about starting a new game and History

devsisters / DQN-tensorflow

Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning

MIT License

2.46k stars, 765 forks

In dqn/agent.py, line 59:

  if terminal:
    screen, reward, action, terminal = self.env.new_random_game()

This runs when a new game is started because a terminal state was reached.

Why don't we need to reset self.history here?

If it is not reset, the stale history affects the next iteration:

  # 1. predict
  action = self.predict(self.history.get())
  # 2. act
  screen, reward, terminal = self.env.act(action, is_training=True)
  # 3. observe
  self.observe(screen, reward, action, terminal)

The action predicted from self.history.get() does not depend on the screens of the current game. It is predicted from the screens of the previous game, which has already ended.
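A minimal sketch of the concern, using a hypothetical stand-in for the history buffer rather than the repository's own class. The names `History`, `reset_history`, and `HISTORY_LENGTH`, and the string placeholders used as frames, are assumptions for illustration only. It shows that the buffer still holds frames from the finished game right after a new game starts, and that refilling it with the first screen of the new game would be one possible way to avoid that.

```python
from collections import deque

HISTORY_LENGTH = 4  # assumed number of stacked frames


class History:
    """Hypothetical stand-in for the agent's frame history, not the repo's class."""

    def __init__(self, length):
        # A bounded deque drops the oldest frame whenever a new one is appended.
        self.frames = deque([None] * length, maxlen=length)

    def add(self, screen):
        self.frames.append(screen)

    def get(self):
        return list(self.frames)


def reset_history(history, first_screen, length=HISTORY_LENGTH):
    # Overwrite every slot with the first screen of the new game.
    for _ in range(length):
        history.add(first_screen)


history = History(HISTORY_LENGTH)
for step in range(1, 5):
    history.add("old_game_frame_%d" % step)  # frames from the game that just ended

new_screen = "new_game_frame_0"  # first screen returned when the new game starts

stale = history.get()            # without a reset: only frames of the old game
reset_history(history, new_screen)
fresh = history.get()            # after a reset: only frames of the new game
```

In this sketch, a prediction made from `stale` would be based entirely on the ended game, while a prediction made from `fresh` would be based on the new game's first screen.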

Am I missing anything?

Thank you very much.

about starting a new game and History #59