devsisters / DQN-tensorflow

Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning
MIT License

terminal in agent.py does not seem to be handled properly #60

Open martin6336 opened 5 years ago

martin6336 commented 5 years ago

```python
for self.step in tqdm(range(start_step, self.max_step), ncols=70, initial=start_step):
    if self.step == self.learn_start:
        num_game, self.update_count, ep_reward = 0, 0, 0.
        total_reward, self.total_loss, self.total_q = 0., 0., 0.
        ep_rewards, actions = [], []

    # 1. predict
    action = self.predict(self.history.get())
    # 2. act
    screen, reward, terminal = self.env.act(action, is_training=True)
    # 3. observe
    self.observe(screen, reward, action, terminal)

    if terminal:
        screen, reward, action, terminal = self.env.new_random_game()
        num_game += 1
        ep_rewards.append(ep_reward)
        ep_reward = 0.
```

The `train` function in agent.py may not handle game termination properly. When the game terminates, the new screen returned by `self.env.new_random_game()` is not added to the history or the memory, so `self.history` is not updated. In the next iteration, `action = self.predict(self.history.get())` is therefore called on the same history, i.e. the one from the terminated game.
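To make the concern concrete, here is a minimal, self-contained sketch. It does not use the repository's classes: `History`, `ToyEnv`, `run`, and the `reseed_history` flag are all hypothetical stand-ins, and the `add` method is an assumption about how frames get pushed into the history. It only illustrates that, if nothing is added after `new_random_game()`, the first prediction of the new game is made on frames from the previous game. Refilling the history with the new screen is one possible remedy, not necessarily the one the project would choose.

```python
from collections import deque


class History:
    """Hypothetical stand-in for the agent's frame history (not the repo's class)."""

    def __init__(self, length):
        self.frames = deque(maxlen=length)

    def add(self, screen):
        self.frames.append(screen)

    def get(self):
        return list(self.frames)


class ToyEnv:
    """Hypothetical environment: every game ends after `game_len` steps.

    Screens are labelled strings such as "g1-s2" (game 1, step 2).
    """

    def __init__(self, game_len=3):
        self.game_len = game_len
        self.game_id = 0
        self.t = 0

    def new_random_game(self):
        self.game_id += 1
        self.t = 0
        return f"g{self.game_id}-s0", 0.0, 0, False

    def act(self, action):
        self.t += 1
        terminal = self.t >= self.game_len
        return f"g{self.game_id}-s{self.t}", 1.0, terminal


def run(steps, reseed_history, history_length=2):
    """Return the history contents the policy would see at each step."""
    env = ToyEnv()
    history = History(history_length)
    screen, _, _, _ = env.new_random_game()
    for _ in range(history_length):
        history.add(screen)

    seen = []
    for _ in range(steps):
        seen.append(history.get())          # input to the "predict" step
        screen, reward, terminal = env.act(action=0)
        history.add(screen)                 # plays the role of "observe"
        if terminal:
            screen, reward, action, terminal = env.new_random_game()
            if reseed_history:
                # Assumed remedy: refill the history with the new first screen.
                for _ in range(history_length):
                    history.add(screen)
    return seen


stale = run(steps=4, reseed_history=False)
fresh = run(steps=4, reseed_history=True)
print(stale[3])  # ['g1-s2', 'g1-s3']  (frames from the finished game)
print(fresh[3])  # ['g2-s0', 'g2-s0']  (frames from the new game)
```

The sketch covers only the history. Whether the replay memory should also receive the first screen of the new game is a separate question that it does not address.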

douglasrizzo commented 5 years ago

Maybe your issue is related to #48 and #59