Open ryotatomioka opened 1 year ago
I do not have this problem:
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torch_directml
>>> dml = torch_directml.device()
>>> tensor1 = torch.tensor([1]).to(dml)
>>> print(tensor1)
tensor([1], device='privateuseone:0')
My GPU is an NVIDIA card. Maybe you can try upgrading torch-directml to version 0.1.13.1.dev230119 and upgrading your GPU driver.
I noticed you have the nvidia-XXX packages installed. Maybe you can try "pip install torchvision" or "conda install torchvision -c pytorch" after "conda install pytorch cpuonly -c pytorch".
Hi @ryotatomioka, Thanks for reporting this. Can you provide us your driver version?
What do you mean by "driver version"? Are you referring to the graphics card's driver version?
My GPU is AMD RX 6700xt
Driver's version is 22.5.1
versions:
Hi @Looong01, thanks for clarifying my question. Yes, I was asking about the GPU driver version.
I missed the info (Graphics: Intel Iris Xe Graphics (driver version 27.20.100.9268)) in the original post.
Please try updating the GPU driver. It is also likely that we do not support this GPU, because pytorch-directml relies on [DirectML], which has its own requirements for Intel GPUs.
You can verify that by running:
import torch
import torch_directml
torch_directml.is_available()
torch_directml.device_count()
If torch_directml.is_available() returns False or torch_directml.device_count() returns 0, then either the GPU or the GPU driver is not supported.
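The check above can be wrapped in a small script. This is only a sketch: `diagnose` is a hypothetical helper name, not part of torch-directml, and the only library calls it relies on are the two shown above.

```python
def diagnose(available: bool, count: int) -> str:
    """Turn the two check results into a short verdict (hypothetical helper)."""
    if not available or count == 0:
        return "unsupported: DirectML sees no usable GPU or GPU driver"
    return f"ok: DirectML reports {count} device(s)"


if __name__ == "__main__":
    try:
        import torch_directml
    except ImportError:
        print("torch-directml is not installed in this environment")
    else:
        print(diagnose(torch_directml.is_available(),
                       torch_directml.device_count()))
```

Passing this check does not guarantee that tensors can be moved to the device: another report in this thread shows True and 2 followed by a segmentation fault.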
Finding the same error here with a Radeon RX 6700XT.
$ neofetch
xxxx@xxxxxxxx
---------------
OS: Ubuntu 22.04.1 LTS on Windows 10 x86_64
Kernel: 5.15.90.1-microsoft-standard-WSL2
Uptime: 1 hour, 37 mins
Packages: 1250 (dpkg)
Shell: bash 5.1.16
Theme: Adwaita [GTK3]
Icons: Adwaita [GTK3]
Terminal: Windows Terminal
CPU: 12th Gen Intel i9-12900K (24) @ 3.187GHz
GPU: ea92:00:00.0 Microsoft Corporation Device 008e
Memory: 2751MiB / 15887MiB
❯ wsl --version
WSL version: 1.2.5.0
Kernel version: 5.15.90.1
WSLg version: 1.0.51
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.1702
$ python
Python 3.10.6 (main, Nov 2 2022, 18:53:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torch_directml
>>> torch_directml.is_available()
True
>>> torch_directml.device_count()
2
$ pip freeze
certifi==2022.12.7
charset-normalizer==2.1.1
filelock==3.9.0
idna==3.4
Jinja2==3.1.2
MarkupSafe==2.1.2
mpmath==1.2.1
networkx==3.0
numpy==1.24.1
Pillow==9.3.0
requests==2.28.1
sympy==1.11.1
torch==2.0.0+cpu
torch-directml==0.2.0.dev230426
torchaudio==2.0.0+cpu
torchvision==0.15.1
typing_extensions==4.4.0
urllib3==1.26.13
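In the list above, torch and torchaudio carry a +cpu local version tag while torchvision does not. That may be related to the libc10_cuda.so warning in the example run further down, though this is a guess and not a confirmed diagnosis. A minimal sketch for spotting such a mismatch in `pip freeze` output, where `local_tags` and `mismatched_torch_packages` are hypothetical helper names:

```python
def local_tags(freeze_lines):
    """Map package name -> local version tag (the part after '+', or '')."""
    tags = {}
    for line in freeze_lines:
        name, sep, version = line.strip().partition("==")
        if not sep:
            continue  # skip lines that are not simple 'name==version' pins
        _, _, local = version.partition("+")
        tags[name] = local
    return tags


def mismatched_torch_packages(freeze_lines,
                              family=("torch", "torchvision", "torchaudio")):
    """Return family members whose local tag differs from torch's tag."""
    tags = local_tags(freeze_lines)
    present = {name: tags[name] for name in family if name in tags}
    expected = present.get("torch", "")
    return sorted(name for name, tag in present.items() if tag != expected)


if __name__ == "__main__":
    freeze = [
        "torch==2.0.0+cpu",
        "torch-directml==0.2.0.dev230426",
        "torchaudio==2.0.0+cpu",
        "torchvision==0.15.1",
    ]
    print(mismatched_torch_packages(freeze))
```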
$ python
Python 3.10.6 (main, Nov 2 2022, 18:53:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torch_directml
>>> dml = torch_directml.device()
>>> tensor1 = torch.tensor([1]).to(dml)
Segmentation fault
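Because a segmentation fault kills the interpreter, it can help to run the failing line in a child process, once per device index, to see whether only one of the two reported devices crashes. This is a sketch, not an official diagnostic: `run_probe` and `probe_device` are hypothetical names, and passing an index to `torch_directml.device()` assumes the device-selection form of that call.

```python
import subprocess
import sys

# Code executed in the child process; {index} is filled in per device.
PROBE = (
    "import torch, torch_directml\n"
    "dml = torch_directml.device({index})\n"
    "print(torch.tensor([1]).to(dml))\n"
)


def run_probe(code: str, timeout: float = 120.0) -> int:
    """Run code in a fresh interpreter and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=timeout)
    return result.returncode


def probe_device(index: int) -> int:
    """Try to move a tensor to the DirectML device with the given index."""
    return run_probe(PROBE.format(index=index))


if __name__ == "__main__":
    for index in range(2):  # device_count() returned 2 in the session above
        status = probe_device(index)
        # On Linux, a negative status means the child was killed by a signal
        # (for example -11 for SIGSEGV).
        print(f"device {index}: exit status {status}")
```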
I also can't get the example to work:
# I have already installed the requirements.txt file and ran dataset.py
$ python PyTorch/1.8/resnet50/train.py
/home/xxxx/venv/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
Traceback (most recent call last):
File "/home/xxxx/DirectML/PyTorch/1.8/resnet50/train.py", line 35, in <module>
main()
File "/home/xxxx/DirectML/PyTorch/1.8/resnet50/train.py", line 30, in main
train(args.path, args.batch_size, args.epochs, args.learning_rate,
File "/home/xxxx/DirectML/PyTorch/1.8/classification/train_classification.py", line 111, in main
model = get_model(model_str, device)
File "/home/xxxx/DirectML/PyTorch/1.8/classification/test_classification.py", line 76, in get_model
model = models.resnet50(num_classes=10).to(device)
File "/home/xxxx/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1126, in to
device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(*args, **kwargs)
RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: dml
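The error says that this torch build does not accept "dml" as a device string. In the sessions earlier in this thread the DirectML device is obtained with `torch_directml.device()` and prints as privateuseone:0. One possible workaround, assuming the sample was written for a package that accepted the "dml" string, is to translate that name before calling `.to()`. `resolve_device` below is a hypothetical helper and not part of the sample or of torch-directml:

```python
def resolve_device(name: str):
    """Map the sample's 'dml' string to a torch-directml device object.

    Any other name (for example 'cpu') is returned unchanged so that
    torch can parse it as usual.
    """
    if name == "dml":
        # Imported lazily so the helper still works where torch-directml
        # is not installed and only non-DirectML names are requested.
        import torch_directml
        return torch_directml.device()
    return name
```

In the sample this would be used as `models.resnet50(num_classes=10).to(resolve_device(device))`. It only addresses the device-string error; it does not address the segmentation fault shown earlier.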
Is there a stable, recommended version of torch-directml, and if so, is it pinned in a requirements.txt file somewhere?
I followed the instructions on this page (Enable PyTorch with DirectML on WSL 2) and got a segmentation fault.
Device details:
Surface Laptop 4
Windows 11 Enterprise (version: 10.0.22621 Build 226221)
CPU: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz 3.00 GHz
Memory: 32.0 GB (31.8 GB usable)
Graphics: Intel Iris Xe Graphics (driver version 27.20.100.9268)
WSL details:
Python environment details:
Python 3.10
torch==1.13.1
torch-directml==0.1.13.dev221216
pip list: