When I used fast_cross_entropy_loss instead of torch.nn.CrossEntropyLoss, this error happened:
  File "/mnt/fs/user/xingjinliang/unsloth/unsloth/kernels/cross_entropy_loss.py", line 318, in fast_cross_entropy_loss
    loss = Fast_CrossEntropyLoss.apply(
  File "/usr/local/lib/python3.10/site-packages/torch/autograd/function.py", line 539, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/mnt/fs/user/xingjinliang/unsloth/unsloth/kernels/cross_entropy_loss.py", line 272, in forward
    losses.masked_fill_(labels == -100, 0) # Don't forget to mask padding out!
RuntimeError: expected self and mask to be on the same device, but got mask on cuda:7 and self on cuda:0
Why is device = "cuda:0" hard-coded in these allocations in kernels/cross_entropy_loss.py?

    losses = torch.empty(n_rows, dtype = torch.float32, device = "cuda:0")
    logsumexp = torch.empty(n_rows, dtype = torch.float32, device = "cuda:0")
    logsumexp = torch.empty((n_rows, n_chunks,), dtype = torch.float32, device = "cuda:0")

I think device = logits.device is correct. The traceback shows the mask (labels == -100) on cuda:7 while losses was created on cuda:0, so the two tensors end up on different GPUs. After I changed it, the error went away. Please check!
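
For reference, here is a minimal sketch of the kind of change I mean. It is not the actual unsloth code: the helper name allocate_loss_buffers and the n_chunks default are hypothetical, and the Triton kernel launch is left out. It only shows that allocating the buffers on logits.device keeps them on the same device as the labels mask. It runs on CPU so it can be tried without a GPU.

```python
import torch

def allocate_loss_buffers(logits, n_chunks=1):
    # Hypothetical helper, not part of unsloth: allocate the output
    # buffers on whatever device the logits live on, instead of a
    # hard-coded "cuda:0".
    n_rows = logits.shape[0]
    device = logits.device
    losses = torch.empty(n_rows, dtype=torch.float32, device=device)
    if n_chunks == 1:
        logsumexp = torch.empty(n_rows, dtype=torch.float32, device=device)
    else:
        logsumexp = torch.empty((n_rows, n_chunks), dtype=torch.float32, device=device)
    return losses, logsumexp

# Toy inputs; in real training these would sit on e.g. cuda:7.
logits = torch.randn(4, 10)
labels = torch.tensor([1, -100, 3, -100])

losses, logsumexp = allocate_loss_buffers(logits)
losses.fill_(1.0)  # stand-in for the values the kernel would write

# Same call as in the traceback; it works because losses and the mask
# are now on the same device.
losses.masked_fill_(labels == -100, 0)
```

With the buffers following logits.device, the masked_fill_ call no longer depends on which GPU the process is running on.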