intel / intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
Apache License 2.0

No XPU devices found #712

Open Bhargav230m opened 1 day ago

Bhargav230m commented 1 day ago

Describe the bug

import torch
import intel_extension_for_pytorch as ipex

device = "xpu:0"
tensor = torch.randn(3, 3).to(device)

print(tensor)
print(f"Tensor on device: {tensor.device}")

Error:

(ml) techpowerb@ruby:~$ python test_xpu.py
/home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/xpu/lazy_init.py:80: UserWarning: XPU Device count is zero! (Triggered internally at /build/intel-pytorch-extension/csrc/gpu/runtime/Device.cpp:127.)
  _C._initExtension()
terminate called after throwing an instance of 'c10::Error'
  what():  dpcppSetDevice: device_id is out of range
Exception raised from dpcppSetDevice at /build/intel-pytorch-extension/csrc/gpu/runtime/Device.cpp:167 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x99 (0x7f5f75ea4a89 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x6a (0x7f5f75e5e2e8 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: xpu::dpcpp::dpcppSetDevice(signed char) + 0x114 (0x7f5ecc24f674 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #3: xpu::dpcpp::set_device(signed char) + 0x20 (0x7f5ecc1c46d0 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #4: xpu::dpcpp::impl::DPCPPGuardImpl::uncheckedSetDevice(c10::Device) const + 0xd (0x7f5ecc1c856d in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #5: at::AtenIpexTypeXPU::resize_impl(c10::TensorImpl*, c10::ArrayRef<long>, c10::optional<c10::ArrayRef<long> >, bool) + 0xc2f (0x7f5ecc24041f in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #6: at::AtenIpexTypeXPU::impl::empty_strided_dpcpp(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::TensorOptions const&) + 0xc6 (0x7f5ed7e336d6 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #7: at::AtenIpexTypeXPU::empty_strided(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0xec (0x7f5ed7e412fc in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #8: <unknown function> + 0x3c475cc (0x7f5ecc2c15cc in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so)
frame #9: at::_ops::empty_strided::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0xf8 (0x7f5f78107e58 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #10: <unknown function> + 0x255e1ed (0x7f5f784551ed in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #11: at::_ops::empty_strided::call(c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x1a6 (0x7f5f7814fd46 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #12: <unknown function> + 0x1768fc0 (0x7f5f7765ffc0 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #13: at::native::_to_copy(at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) + 0x1484 (0x7f5f77967c34 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #14: <unknown function> + 0x26f2cdd (0x7f5f785e9cdd in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) + 0xf8 (0x7f5f77ddccb8 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #16: <unknown function> + 0x255b501 (0x7f5f78452501 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #17: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) + 0xf8 (0x7f5f77ddccb8 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #18: <unknown function> + 0x3a4be35 (0x7f5f79942e35 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #19: <unknown function> + 0x3a4c350 (0x7f5f79943350 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #20: at::_ops::_to_copy::call(at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) + 0x1e5 (0x7f5f77e7c855 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #21: at::native::to(at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, bool, c10::optional<c10::MemoryFormat>) + 0x104 (0x7f5f7795c0e4 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #22: <unknown function> + 0x2876243 (0x7f5f7876d243 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #23: at::_ops::to_dtype_layout::call(at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, bool, c10::optional<c10::MemoryFormat>) + 0x1fa (0x7f5f77ff39ea in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #24: <unknown function> + 0x3ddac9 (0x7f5f8a8c1ac9 in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #25: <unknown function> + 0x3fc81c (0x7f5f8a8e081c in /home/techpowerb/miniconda3/envs/ml/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #26: python() [0x4f9b46]
<omitting python frames>
frame #28: python() [0x4e69da]
frame #32: python() [0x5c1157]
frame #33: python() [0x5bd170]
frame #34: python() [0x456423]
frame #38: <unknown function> + 0x29d90 (0x7f5f8bb20d90 in /lib/x86_64-linux-gnu/libc.so.6)
frame #39: __libc_start_main + 0x80 (0x7f5f8bb20e40 in /lib/x86_64-linux-gnu/libc.so.6)
frame #40: python() [0x58784e]

Aborted

Why does it report no XPU devices? My machine has Intel Iris Xe Graphics and a Core i5-1135G7 CPU.

I have followed all the installation steps here: https://intel.github.io/intel-extension-for-pytorch/#installation?platform=gpu&version=v2.1.40%2bxpu&os=linux%2fwsl2&package=pip
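For reference, the abort above comes from moving a tensor to a hard-coded "xpu:0" while the runtime reports a device count of zero. A minimal sketch of a guarded device selection follows. It assumes `torch.xpu.is_available()` and `torch.xpu.device_count()` are exposed once IPEX is imported; `pick_device` is a hypothetical helper name, not part of either library, and it does not fix the missing device, it only turns the hard abort into a CPU fallback.

```python
def pick_device(torch_module):
    """Return "xpu:0" only if the runtime reports at least one XPU, else "cpu".

    `torch_module` is expected to be the imported `torch` module (after
    importing intel_extension_for_pytorch). It is passed in as an argument
    so the check itself can be exercised without a GPU.
    """
    xpu = getattr(torch_module, "xpu", None)
    if xpu is not None and xpu.is_available() and xpu.device_count() > 0:
        return "xpu:0"
    return "cpu"


if __name__ == "__main__":
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (assumed to register torch.xpu)
    except (ImportError, OSError):
        print("torch / IPEX could not be imported; nothing to check")
    else:
        device = pick_device(torch)
        tensor = torch.randn(3, 3).to(device)
        print(f"Tensor on device: {tensor.device}")
```

With zero devices this prints a CPU device instead of terminating in `dpcppSetDevice`, which makes it easier to tell a driver or runtime problem apart from a script problem.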

Versions

(ml) techpowerb@ruby:~$ python collect_env.py
Collecting environment information...
PyTorch version: 2.1.0.post3+cxx11.abi
PyTorch CXX11 ABI: Yes
IPEX version: 2.1.40+xpu
IPEX commit: 80ed47655
Build type: Release

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: N/A
IGC version: 2024.2.1 (2024.2.1.20240711)
CMake version: N/A
Libc version: glibc-2.35

Python version: 3.9.19 (main, May 6 2024, 19:43:03) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is XPU available: False
DPCPP runtime version: 2024.2
MKL version: 2024.2
GPU models and configuration:

Intel OpenCL ICD version: 23.17.26241.33-647~22.04
Level Zero version: 1.3.26241.33-647~22.04

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
CPU family: 6
Model: 140
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
BogoMIPS: 4838.39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect md_clear flush_l1d arch_capabilities
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 192 KiB (4 instances)
L1i cache: 128 KiB (4 instances)
L2 cache: 5 MiB (4 instances)
L3 cache: 8 MiB (1 instance)
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Mitigation; Enhanced IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.1.40+xpu
[pip3] numpy==1.26.4
[pip3] torch==2.1.0.post3+cxx11.abi
[pip3] torchaudio==2.1.0.post3+cxx11.abi
[pip3] torchvision==0.16.0.post3+cxx11.abi
[conda] intel-extension-for-pytorch 2.1.40+xpu pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchaudio 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchvision 0.16.0.post3+cxx11.abi pypi_0 pypi
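The environment report shows a WSL2 kernel (`microsoft-standard-WSL2`) and an empty "GPU models and configuration" field. A rough sketch of a quick sanity check that could accompany such a report is below. It assumes that WSL2 exposes the host GPU to Linux through the `/dev/dxg` device node; the helper names `is_wsl` and `gpu_node_present` are made up for illustration. The node's presence alone does not prove that Level Zero can see the GPU, but its absence would point at the Windows driver or WSL setup rather than at IPEX.

```python
import os
import platform


def is_wsl(release=None):
    """Guess whether we are running under WSL from the kernel release string."""
    release = platform.release() if release is None else release
    return "microsoft" in release.lower()


def gpu_node_present(path="/dev/dxg"):
    """Check for the device node assumed to carry GPU access inside WSL2."""
    return os.path.exists(path)


if __name__ == "__main__":
    print(f"Kernel release: {platform.release()}")
    print(f"Looks like WSL: {is_wsl()}")
    print(f"/dev/dxg present: {gpu_node_present()}")
```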

Bhargav230m commented 23 hours ago

I fixed this issue by switching to Windows. Before that, I had tried WSL2 and openSUSE Tumbleweed.

But it still doesn't work properly. The code below:

import torch
import intel_extension_for_pytorch as ipex

device = "xpu"

tensor = torch.randn(3, 3)
tensor = tensor.to(device)

print(tensor)
print(f"Tensor on device: {tensor.device}")

The error thrown is:

Traceback (most recent call last):
  File "C:\Users\techn\OneDrive\Desktop\maria\test.py", line 9, in <module>
    print(tensor)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor.py", line 431, in __repr__
    return torch._tensor_str._str(self, tensor_contents=tensor_contents)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 664, in _str
    return _str_intern(self, tensor_contents=tensor_contents)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 595, in _str_intern
    tensor_str = _tensor_str(self, indent)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 347, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 138, in __init__
    tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0)
RuntimeError: The program was built for 1 devices
Build program log for 'Intel(R) Iris(R) Xe Graphics':
 -11 (PI_ERROR_BUILD_PROGRAM_FAILURE)

I think it is able to move the tensor to the XPU but fails when I try to read it back. I hope someone can help with this soon.
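The traceback ends inside `_tensor_str.py`, where printing a tensor runs `torch.isfinite(tensor_view) & tensor_view.ne(0)` on the tensor's own device, so the failure may come from building those device kernels rather than from the transfer. One way to separate the two, sketched below, is to copy the tensor back to host memory before formatting it. `render_on_host` is a hypothetical helper, and the sketch assumes that a plain device-to-host copy does not itself need a device program to be built.

```python
def render_on_host(tensor):
    """Format a tensor after copying it to host memory.

    The formatting work (isfinite, ne, ...) then runs on the CPU copy
    instead of on the original device.
    """
    return str(tensor.detach().cpu())


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed; nothing to demonstrate")
    else:
        # On the affected machine this would be torch.randn(3, 3).to("xpu").
        tensor = torch.randn(3, 3)
        print(render_on_host(tensor))
```

If this prints the values while `print(tensor)` on the XPU tensor still raises `PI_ERROR_BUILD_PROGRAM_FAILURE`, the transfer works and the problem is limited to kernel compilation on the device.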

Bhargav230m commented 22 hours ago

I also tried training a dummy linear model and got the same error:

import torch
import intel_extension_for_pytorch as ipex
import torch.nn as nn
import torch.optim as optim

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(100, 500)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel().to("xpu:0")

criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

input_data = torch.randn(64, 100).to("xpu:0")
target_data = torch.randn(64, 500).to("xpu:0") 

for epoch in range(10):
    model.train()

    outputs = model(input_data)
    loss = criterion(outputs, target_data)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print(f'Epoch [{epoch+1}/10], Loss: {loss.item():.4f}')