Environment: Ubuntu 16.04, PyTorch 0.3.1
Running your Dueling DQN reports:
Traceback (most recent call last):
  File "/home/opencv/PycharmProjects/RL-Pytorch/main.py", line 422, in <module>
    dqn_cart_pole()
  File "/home/opencv/PycharmProjects/RL-Pytorch/main.py", line 29, in dqn_cart_pole
    run_episodes(DQNAgent(config))
  File "/home/opencv/PycharmProjects/RL-Pytorch/utils/misc.py", line 21, in run_episodes
    reward, step = agent.episode()
  File "/home/opencv/PycharmProjects/RL-Pytorch/agent/DQN_agent.py", line 59, in episode
    q_next = self.target_network.predict(next_states, False).detach() # TD network computes Q'
  File "/home/opencv/PycharmProjects/RL-Pytorch/network/base_network.py", line 91, in predict
    q = value.expand_as(advantange) + (advantange - advantange.mean(1).expand_as(advantange))
  File "/anaconda3/envs/gymlab/lib/python3.5/site-packages/torch/autograd/variable.py", line 433, in expand_as
    return self.expand(tensor.size())
RuntimeError: The expanded size of the tensor (2) must match the existing size (10) at non-singleton dimension 1.
But I solved it. In base_network.py, line 91, change
q = value.expand_as(advantange) + (advantange - advantange.mean(1).expand_as(advantange))
to
q = value.expand_as(advantange) + (advantange - advantange.mean(1, keepdim=True).expand_as(advantange))
Then it works well.
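
For anyone hitting the same error, here is a minimal sketch of why keepdim=True helps. Without it, mean(1) drops the reduced dimension and returns shape (10,), which cannot be expanded to the (10, 2) advantage tensor; with it, the mean keeps shape (10, 1) and expands cleanly. The function name dueling_q, the variable names, and the dummy inputs are my own for illustration, not code from the repo. The sizes 10 and 2 are read off the error message, and the snippet assumes a reasonably recent PyTorch rather than 0.3.1 specifically.

```python
import torch


def dueling_q(value, advantage):
    """Combine V(s) and A(s, a) as Q = V + (A - mean over actions of A)."""
    # keepdim=True keeps the reduced axis with size 1, so the mean has
    # shape (batch, 1) and can be expanded to (batch, n_actions).
    mean_adv = advantage.mean(1, keepdim=True)
    return value.expand_as(advantage) + (advantage - mean_adv.expand_as(advantage))


# Illustrative sizes taken from the error message: batch of 10, 2 actions.
batch, n_actions = 10, 2
value = torch.ones(batch, 1)                 # dummy state values
advantage = torch.randn(batch, n_actions)    # dummy advantages

shape_without = advantage.mean(1).shape              # torch.Size([10])
shape_with = advantage.mean(1, keepdim=True).shape   # torch.Size([10, 1])
print(shape_without, shape_with)

q = dueling_q(value, advantage)
print(q.shape)  # torch.Size([10, 2])
```

Since the mean-subtracted advantages sum to zero along the action axis, the row mean of q should equal the state value, which makes a quick sanity check.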