ShangtongZhang / DeepRL

Modularized Implementation of Deep RL Algorithms in PyTorch
MIT License

Dueling DQN: The expanded size of the tensor (2) must match the existing size (10) at non-singleton dimension 1 #26

Closed by jixian79 6 years ago

jixian79 commented 6 years ago

On Ubuntu 16.04 with PyTorch 0.3.1, running your Dueling DQN reports:

Traceback (most recent call last):
  File "/home/opencv/PycharmProjects/RL-Pytorch/main.py", line 422, in <module>
    dqn_cart_pole()
  File "/home/opencv/PycharmProjects/RL-Pytorch/main.py", line 29, in dqn_cart_pole
    run_episodes(DQNAgent(config))
  File "/home/opencv/PycharmProjects/RL-Pytorch/utils/misc.py", line 21, in run_episodes
    reward, step = agent.episode()
  File "/home/opencv/PycharmProjects/RL-Pytorch/agent/DQN_agent.py", line 59, in episode
    q_next = self.target_network.predict(next_states, False).detach()  # TD network computes Q'
  File "/home/opencv/PycharmProjects/RL-Pytorch/network/base_network.py", line 91, in predict
    q = value.expand_as(advantange) + (advantange - advantange.mean(1).expand_as(advantange))
  File "/anaconda3/envs/gymlab/lib/python3.5/site-packages/torch/autograd/variable.py", line 433, in expand_as
    return self.expand(tensor.size())
RuntimeError: The expanded size of the tensor (2) must match the existing size (10) at non-singleton dimension 1. 

But I solved it. In base_network.py, line 91, change

    q = value.expand_as(advantange) + (advantange - advantange.mean(1).expand_as(advantange))

to

    q = value.expand_as(advantange) + (advantange - advantange.mean(1, keepdim=True).expand_as(advantange))

Then it works well.
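For anyone hitting the same error, here is a minimal sketch of why `keepdim=True` matters. It assumes a batch of 10 states and 2 actions, which is inferred from the sizes in the error message rather than taken from the repository, and the tensors are random stand-ins for the network's value and advantage outputs. In current PyTorch releases `mean(1)` drops the reduced dimension by default, so the result has shape (10,) and cannot be expanded to (10, 2). With `keepdim=True` it has shape (10, 1) and expands cleanly.

```python
import torch

# Assumed sizes, inferred from the error message: 10 samples, 2 actions.
batch_size, n_actions = 10, 2

# Random stand-ins for the two heads of a dueling network.
value = torch.randn(batch_size, 1)              # state value V(s)
advantage = torch.randn(batch_size, n_actions)  # advantages A(s, a)

# Without keepdim, the reduced dimension is dropped: shape (10,).
mean_flat = advantage.mean(1)
try:
    mean_flat.expand_as(advantage)
    expand_failed = False
except RuntimeError:
    # The trailing dimension (10) cannot be expanded to 2.
    expand_failed = True

# With keepdim=True, the reduced dimension is kept: shape (10, 1).
mean_kept = advantage.mean(1, keepdim=True)

# Dueling aggregation: Q(s, a) = V(s) + (A(s, a) - mean over a of A(s, a)).
q = value.expand_as(advantage) + (advantage - mean_kept.expand_as(advantage))
```

Since broadcasting handles the (10, 1) shapes automatically, `q = value + advantage - mean_kept` should give the same result without the explicit `expand_as` calls.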

ShangtongZhang commented 6 years ago

Thanks for pointing this out. I missed this issue when I upgraded to PyTorch 0.3. I pushed the fix as you suggested.