pytorch / tutorials

PyTorch tutorials.
https://pytorch.org/tutorials/
BSD 3-Clause "New" or "Revised" License

Errors in DQN tutorial #194

Closed. dusty-nv closed this issue 6 years ago.

dusty-nv commented 6 years ago

When running reinforcement_q_learning.py from the DQN tutorial against PyTorch master, the program prints several deprecation warnings and then crashes with a RuntimeError:

[2018-01-16 18:27:56,613] Making new env: CartPole-v0
/usr/local/lib/python2.7/dist-packages/torchvision-0.2.0-py2.7.egg/torchvision/transforms/transforms.py:176: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
reinforcement_q_learning.py:335: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  Variable(state, volatile=True).type(FloatTensor)).data.max(1)[1].view(1, 1)
/usr/lib/python2.7/dist-packages/matplotlib/backend_bases.py:2437: MatplotlibDeprecationWarning: Using default event loop until function specific to this GUI is implemented
  warnings.warn(str, mplDeprecation)
reinforcement_q_learning.py:398: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  volatile=True)
reinforcement_q_learning.py:413: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  next_state_values.volatile = False
Traceback (most recent call last):
  File "reinforcement_q_learning.py", line 466, in <module>
    optimize_model()
  File "reinforcement_q_learning.py", line 418, in optimize_model
    loss = F.smooth_l1_loss(state_action_values, expected_state_action_values)
RuntimeError: the derivative for 'target' is not implemented

Any known workarounds/updates?

Wimsen commented 6 years ago

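I think the problem is that next_state_values still carries the autograd history of the model that produced it, so the loss ends up being asked to differentiate with respect to its target. I got it working by replacing the line

next_state_values[non_final_mask] = model(non_final_next_states).max(1)[0]

with

next_state_values[non_final_mask] = model(non_final_next_states).detach().max(1)[0]

This seems to cut the target values off from the graph of the model that produced them, so only state_action_values requires gradients when the loss is computed.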
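
For anyone reading this later, here is a minimal sketch of that workaround outside the tutorial. It is not the tutorial's code: the `nn.Linear` model, the batch shapes, and the random tensors are hypothetical stand-ins, and only the `.detach()` line mirrors the change described above. It also shows the `torch.no_grad()` form that the warnings in the log recommend as the replacement for `volatile=True`. Recent PyTorch releases may no longer raise the original RuntimeError, but keeping the target out of the graph is still what the Q-learning update wants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the tutorial's network: 4 state features, 2 actions.
model = nn.Linear(4, 2)

BATCH_SIZE, GAMMA = 8, 0.99
state_batch = torch.randn(BATCH_SIZE, 4)
action_batch = torch.randint(0, 2, (BATCH_SIZE, 1))
reward_batch = torch.randn(BATCH_SIZE)

# Assume the last two transitions ended their episodes (no next state).
non_final_mask = torch.tensor([True] * 6 + [False] * 2)
non_final_next_states = torch.randn(6, 4)

# Q(s, a) for the actions actually taken; this side must keep its gradients.
state_action_values = model(state_batch).gather(1, action_batch)

# Option 1: detach the model output before taking the max, as suggested above.
next_state_values = torch.zeros(BATCH_SIZE)
next_state_values[non_final_mask] = model(non_final_next_states).detach().max(1)[0]

# Option 2: compute the same values inside torch.no_grad().
with torch.no_grad():
    next_state_values_ng = torch.zeros(BATCH_SIZE)
    next_state_values_ng[non_final_mask] = model(non_final_next_states).max(1)[0]

# The target is now a plain tensor with no autograd history.
expected_state_action_values = next_state_values * GAMMA + reward_batch

loss = F.smooth_l1_loss(state_action_values,
                        expected_state_action_values.unsqueeze(1))
loss.backward()  # gradients flow only through state_action_values
```

Both options produce the same target values; `torch.no_grad()` additionally avoids building the graph for the next-state forward pass at all, which saves some memory.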