Chapter 06 DQN pong training

moayad-hsn commented 4 years ago

Hi, I ran into this error while running the code that trains the DQN agent on Pong:

8589: done 9 games, mean reward -20.444, eps 0.91, speed 124.21 f/s
9518: done 10 games, mean reward -20.400, eps 0.90, speed 121.48 f/s
Traceback (most recent call last):
  File "02_dqn_pong.py", line 169, in <module>
    loss_t = calc_loss(batch, net, tgt_net, device=device)
  File "02_dqn_pong.py", line 96, in calc_loss
    state_action_values = net(states_v).gather(1, actions_v.unsqueeze(-1)).squeeze(-1)
RuntimeError: index 17179869185 is out of bounds for dimension 1 with size 6

I would like to know the reason for this indexing error. It happens when I start training the network, and I have no idea what causes it.

ImGonnaDans commented 4 years ago

An action index that large cannot be valid: actions range from 0 to env.action_space.n - 1 (0 to 5 on Pong, which has 6 actions in total). I suggest inspecting the actions_v tensor to make sure it really contains the action indices you intend to pass to gather.
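A minimal sketch of that check, in plain Python so it runs without PyTorch. The helper name `find_invalid_actions` and the sample batch are hypothetical, not part of 02_dqn_pong.py; in the script you would pass something like `actions_v.tolist()` and `env.action_space.n`.

```python
def find_invalid_actions(actions, n_actions):
    """Return (position, value) pairs for entries outside 0..n_actions-1."""
    return [(i, int(a)) for i, a in enumerate(actions)
            if not 0 <= int(a) < n_actions]


# Hypothetical batch: Pong has 6 actions, so valid indices are 0..5.
batch_actions = [0, 3, 5, 17179869185, 2]
bad = find_invalid_actions(batch_actions, n_actions=6)
print(bad)  # [(3, 17179869185)]
```

If every action stored in the replay buffer passes this check but the tensor handed to gather does not, the values are probably being corrupted during conversion to a tensor rather than during action selection.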

DeanReznick commented 4 years ago

Hi, guys,

The same index error also appears for me when I run on the CPU instead of the GPU. When I run on the GPU, I get this error instead:

Traceback (most recent call last):
  File "...Chapter06/02_dqn_pong.py", line 176, in <module>
    loss_t = calc_loss(batch, net, tgt_net, device=device)
  File "...Chapter06/02_dqn_pong.py", line 97, in calc_loss
    state_action_values = net(states_v).gather(1, actions_v.unsqueeze(-1)).squeeze(-1)
RuntimeError: Expected object of scalar type Long but got scalar type Int for argument #3 'index' in call to _th_gather_out

If I use '.long()', the code runs, but the speed decreases massively.

And: print(actions_v.shape) -> torch.Size([32])
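One possible explanation, which nobody in this thread has confirmed: the two errors may share a cause. The out-of-bounds index 17179869185 equals 4 * 2^32 + 1, which is exactly what you get when two adjacent 32-bit actions (1 and 4) are read as a single 64-bit integer on a little-endian machine. That would fit an int32 action array being treated as int64 by gather. The sketch below only checks the arithmetic, using the standard library:

```python
import struct

reported_index = 17179869185  # value from the out-of-bounds RuntimeError

# Split the 64-bit value into its low and high 32-bit halves.
low = reported_index & 0xFFFFFFFF
high = reported_index >> 32
print(low, high)  # 1 4

# Pack two int32 values little-endian, then read the same 8 bytes as one int64.
raw = struct.pack("<ii", 1, 4)
(as_int64,) = struct.unpack("<q", raw)
print(as_int64 == reported_index)  # True
```

Both halves are valid Pong actions, so the stored actions themselves are probably fine. If this guess is right, creating the action tensor as 64-bit from the start, for example with `dtype=torch.int64` in `torch.tensor(...)`, should avoid both errors. This does not explain the slowdown seen with '.long()'.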

PacktPublishing / Deep-Reinforcement-Learning-Hands-On

Chapter 06 DQN pong training #77