microsoft / DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
MIT License

use transformers RuntimeError: tensor.device().type() == at::DeviceType::PrivateUse1 INTERNAL ASSERT FAILED at #578

Open poo0054 opened 6 months ago

poo0054 commented 6 months ago
import torch
import torch_directml

from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer

dml = torch_directml.device()

#  --------------------------

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
for key, value in inputs.items():
    inputs[key] = value.to(dml)  # Explicitly move each tensor to the specified device

print(inputs)

#  AutoModelForSequenceClassification
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint).to(dml)

outputs = model(**inputs)

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)

print(model.config.id2label)

error:

E:\project\directml\.env\Scripts\python.exe E:\project\directml\txt.py 
{'input_ids': tensor([[  101,  1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172,
          2607,  2026,  2878,  2166,  1012,   102],
        [  101,  1045,  5223,  2023,  2061,  2172,   999,   102,     0,     0,
             0,     0,     0,     0,     0,     0]], device='privateuseone:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]],
       device='privateuseone:0')}
Traceback (most recent call last):
  File "E:\project\directml\txt.py", line 28, in <module>
    outputs = model(**inputs)
  File "E:\project\directml\.env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\project\directml\.env\lib\site-packages\transformers\models\distilbert\modeling_distilbert.py", line 1002, in forward
    distilbert_output = self.distilbert(
  File "E:\project\directml\.env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\project\directml\.env\lib\site-packages\transformers\models\distilbert\modeling_distilbert.py", line 822, in forward
    return self.transformer(
  File "E:\project\directml\.env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\project\directml\.env\lib\site-packages\transformers\models\distilbert\modeling_distilbert.py", line 587, in forward
    layer_outputs = layer_module(
  File "E:\project\directml\.env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\project\directml\.env\lib\site-packages\transformers\models\distilbert\modeling_distilbert.py", line 513, in forward
    sa_output = self.attention(
  File "E:\project\directml\.env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\project\directml\.env\lib\site-packages\transformers\models\distilbert\modeling_distilbert.py", line 245, in forward
    scores = scores.masked_fill(
RuntimeError: tensor.device().type() == at::DeviceType::PrivateUse1 INTERNAL ASSERT FAILED at "D:\\a\\_work\\1\\s\\pytorch-directml-plugin\\torch_directml\\csrc\\dml\\DMLTensor.cpp":31, please report a bug to PyTorch. unbox expects Dml at::Tensor as inputs

version

(.env) PS E:\project\directml> pip list
Package            Version
------------------ ---------------
accelerate         0.29.2
certifi            2024.2.2
charset-normalizer 3.3.2
colorama           0.4.6
filelock           3.13.4
fsspec             2024.3.1
huggingface-hub    0.22.2
idna               3.7
Jinja2             3.1.3
MarkupSafe         2.1.5
mpmath             1.3.0
networkx           3.3
numpy              1.26.4
packaging          24.0
pillow             10.3.0
pip                24.0
psutil             5.9.8
PyYAML             6.0.1
regex              2023.12.25
requests           2.31.0
safetensors        0.4.2
setuptools         65.5.0
sympy              1.12
tokenizers         0.15.2
torch              2.0.0
torch-directml     0.2.0.dev230426
torchvision        0.15.1
tqdm               4.66.2
transformers       4.39.3
typing_extensions  4.11.0
urllib3            2.2.1

I'm a newbie and I don't know what to do now. I checked all the documentation online and couldn't find what to do. Is this a bug or a problem with my device?

However, the code below runs without any errors:


import torch
import torch_directml

dml = torch_directml.device()
print(torch_directml.is_available())
print(dml)

a = torch.randn(3, 3).to(dml)
print(torch.add(a, a))

Output:

True
privateuseone:0
tensor([[-0.7370,  3.0596,  0.0153],
        [ 3.2455, -3.7167,  2.4204],
        [ 0.5102, -0.0560, -1.4678]], device='privateuseone:0')
poo0054 commented 6 months ago

My GPU is an AMD 6600 XT, and the system is Windows 10.

Hakusai-Butyou commented 1 month ago

This text was translated into English with DeepL. I don't know whether this will solve the problem, because your torch version differs from mine, but in my environment it worked after I rewrote the call shown on the second-to-last line of the traceback (the scores.masked_fill call in modeling_distilbert.py), as follows.

Before rewriting:

scores = scores.masked_fill(
    mask, torch.tensor(torch.finfo(scores.dtype).min)
)  # (bs, n_heads, q_length, k_length)

After rewriting:

scores[mask] = torch.tensor(torch.finfo(scores.dtype).min)

I'm sorry if my explanation is not clear.
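
For anyone comparing the two forms above, here is a minimal sketch that runs on the CPU only, so it does not reproduce the DirectML assertion. It checks that the original masked_fill call, a variant that passes a plain Python float as the fill value, and the indexed assignment all produce the same result. The helper names, tensor shapes, and mask are made up for illustration. The idea that the assertion comes from the fill-value tensor being created on the CPU while scores lives on the DirectML device is an assumption based on the "unbox expects Dml at::Tensor as inputs" message, not something confirmed in this thread. The scalar variant is likewise untested on DirectML.

```python
import torch


def mask_scores_original(scores, mask):
    # Pattern from the traceback: the fill value is a 0-dim tensor
    # created on the default (CPU) device.
    fill = torch.tensor(torch.finfo(scores.dtype).min)
    return scores.masked_fill(mask, fill)


def mask_scores_scalar(scores, mask):
    # Hypothetical variant: pass a plain Python float instead of a tensor,
    # so no second tensor is involved (untested on DirectML).
    return scores.masked_fill(mask, torch.finfo(scores.dtype).min)


def mask_scores_indexed(scores, mask):
    # Workaround from the comment above, applied to a copy so the
    # input tensor is left unchanged.
    out = scores.clone()
    out[mask] = torch.tensor(torch.finfo(scores.dtype).min)
    return out


torch.manual_seed(0)
scores = torch.randn(2, 1, 3, 3)  # (bs, n_heads, q_length, k_length)
mask = torch.zeros(2, 1, 3, 3, dtype=torch.bool)
mask[1, :, :, 2] = True  # pretend the last key position of sequence 1 is padding

a = mask_scores_original(scores, mask)
b = mask_scores_scalar(scores, mask)
c = mask_scores_indexed(scores, mask)

# All three forms should agree element for element on the CPU.
print(torch.equal(a, b), torch.equal(b, c))
```

If the three forms agree on the CPU but only the first one fails on the DirectML device, that would point at how the plugin handles the tensor fill value rather than at the GPU itself.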