SamLynnEvans / Transformer

Transformer seq2seq model: a program that can build a language translator from a parallel corpus
Apache License 2.0
1.35k stars 350 forks

RuntimeError: Expected object of backend CPU but got backend CUDA for argument #2 'other' #2

Closed fabrahman closed 5 years ago

fabrahman commented 6 years ago

Hello,

Thank you for sharing your code.

I tried running the code (train.py) on a dataset (both source and target are English), and I am getting the following error in Batch.py:

loading spacy tokenizers...
creating dataset and iterator...
The `device` argument should be set by using `torch.device` or passing a string as an argument. This behavior will be deprecated soon and currently defaults to cpu.
training model...
cuda
m: epoch 1 [                    ]  0%  loss = ...
Traceback (most recent call last):
  File "train.py", line 183, in <module>
    main()
  File "train.py", line 111, in main
    train_model(model, opt)
  File "train.py", line 34, in train_model
    src_mask, trg_mask = create_masks(src, trg_input, opt)
  File "/home/hannahbrahman/ROCstories/Transformer/Batch.py", line 26, in create_masks
    trg_mask = trg_mask & np_mask
RuntimeError: Expected object of backend CPU but got backend CUDA for argument #2 'other'

I tried adding device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') and np_mask.to(device), but I am still getting the same error. I cloned your repo onto my own machine and ran train.py. My PyTorch version is 1.0.0.dev20181105.
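(For anyone hitting this: the error means the two operands of `&` live on different devices, and a likely reason `np_mask.to(device)` alone did not help is that `.to()` returns a new tensor rather than moving the old one in place. A minimal sketch of the pattern and the fix, with illustrative names rather than the repo's exact code:)

```python
import torch

def create_masks_device_safe(trg, pad_token=1):
    # trg_mask inherits trg's device (CPU or CUDA) ...
    trg_mask = (trg != pad_token).unsqueeze(-2)
    size = trg.size(1)
    # ... but the no-peek mask is built fresh, so it starts on the CPU.
    np_mask = torch.triu(torch.ones(size, size, dtype=torch.uint8), diagonal=1) == 0
    # Fix: reassign the result of .to() -- it returns a new tensor,
    # it does not move np_mask in place.
    np_mask = np_mask.to(trg_mask.device)
    return trg_mask & np_mask

trg = torch.tensor([[2, 3, 4, 1]])  # last token is padding
mask = create_masks_device_safe(trg)
print(mask.shape)  # torch.Size([1, 4, 4])
```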

fmobrj commented 5 years ago

Hello. Great project.

I'm having the same issue as Hannabrahman. Exactly the same message.

I've tried PyTorch 0.4.0 and 0.4.1 and can't get rid of this message.

fabrahman commented 5 years ago

@fmobrj Hi, try adding .to('cuda') or .cuda() at the end of lines 31 and 32 of train.py, that is:

src = batch.src.transpose(0,1).to('cuda')
trg = batch.trg.transpose(0,1).to('cuda')

That worked for me. Good luck
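(A device-agnostic variant of the same idea, so the script still runs on CPU-only machines; the batch tensors here are stand-ins, not the repo's actual iterator:)

```python
import torch

# Pick the device once, then move every batch onto it.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

batch_src = torch.randint(0, 100, (4, 8))  # stand-in for batch.src
batch_trg = torch.randint(0, 100, (4, 8))  # stand-in for batch.trg

# Same transposes as train.py lines 31-32, but .to(device)
# instead of hard-coding 'cuda'.
src = batch_src.transpose(0, 1).to(device)
trg = batch_trg.transpose(0, 1).to(device)
print(src.device.type)  # 'cpu' or 'cuda'
```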

fmobrj commented 5 years ago

Hi, Hannabrahman! Thank you very much! Now it is working fine.

I am evaluating some Transformer projects so I can fine-tune my own to my needs. Judging by the folder structure, most of them seem to be forks of the OpenNMT project.

Best regards!