hungtuchen / pytorch-dqn

Deep Q-Learning Network in pytorch (not actively maintained)
MIT License

Unmatching size and error #3

Open tegg89 opened 6 years ago

tegg89 commented 6 years ago

Hi, thanks for sharing your wonderful code. I ran into some errors when running it.

  1. In lines 197~205 of dqn_learn.py, the sizes of target_Q_values and current_Q_values do not match. I changed the code to next_max_q = next_max_q.unsqueeze(-1) to correct the sizes. I also changed line 203 to use rew_batch[0].

  2. (IMO) After records are stacked in the replay buffer, the queued action does not work properly. I changed line 158 to action = select_epilson_greedy_action(Q, recent_observations, t), but a different action value still gets queued.
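
For reference, the size mismatch in item 1 can be reproduced in isolation. This is a minimal sketch, not the code from dqn_learn.py: the tensors are random stand-ins, the names q_values, next_q_values, act_batch, not_done_mask, and gamma are assumptions, and it uses a recent PyTorch API rather than 0.2.0. It also unsqueezes the whole target instead of next_max_q alone, which is one possible variant of the change described above.

```python
import torch

batch_size, num_actions, gamma = 4, 3, 0.99

# Hypothetical stand-ins for network outputs and a replay-buffer sample.
q_values = torch.randn(batch_size, num_actions)        # Q(s, .)
next_q_values = torch.randn(batch_size, num_actions)   # target-network Q(s', .)
act_batch = torch.randint(0, num_actions, (batch_size,))
rew_batch = torch.randn(batch_size)
not_done_mask = torch.ones(batch_size)

# gather keeps the indexed dimension: shape (batch_size, 1).
current_Q_values = q_values.gather(1, act_batch.unsqueeze(1))

# max over dim 1 drops that dimension: shape (batch_size,).
next_max_q = next_q_values.max(1)[0]
target_Q_values = rew_batch + gamma * not_done_mask * next_max_q

# (batch_size, 1) minus (batch_size,) broadcasts to (batch_size, batch_size).
mismatched = current_Q_values - target_Q_values

# Adding a trailing dimension to the target keeps one error value per sample.
aligned = current_Q_values - target_Q_values.unsqueeze(-1)

print(mismatched.shape, aligned.shape)
```

Squeezing current_Q_values down to (batch_size,) instead would also line the two tensors up; which side to reshape depends on what the backward call in the training loop expects.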

I am still working on these but having trouble. Could you help me fix them?

hungtuchen commented 6 years ago

Thanks for your question. I won't be available for a few days, but I will revisit it when I have time. Which PyTorch version are you using? I haven't updated to the latest version, which might be the problem.

tegg89 commented 6 years ago

@transedward Thanks for your reply. I tested with PyTorch 0.2.0.post1 (0.2.0.1) and Python 3.5.3 (Anaconda) on Ubuntu 16.04.

praveen-palanisamy commented 6 years ago

@tegg89: Check out #8. Let us know whether it works.