Bhargav230m opened 1 day ago
I tried to fix this issue by switching to Windows (before this, I tried WSL2 and openSUSE Tumbleweed), but it still doesn't work properly. The code below:
```python
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device with PyTorch

device = "xpu"
tensor = torch.randn(3, 3)
tensor = tensor.to(device)
print(tensor)
print(f"Tensor on device: {tensor.device}")
```
The error thrown is:
```
Traceback (most recent call last):
  File "C:\Users\techn\OneDrive\Desktop\maria\test.py", line 9, in <module>
    print(tensor)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor.py", line 431, in __repr__
    return torch._tensor_str._str(self, tensor_contents=tensor_contents)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 664, in _str
    return _str_intern(self, tensor_contents=tensor_contents)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 595, in _str_intern
    tensor_str = _tensor_str(self, indent)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 347, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "C:\Users\techn\miniconda3\envs\ml-xpu\lib\site-packages\torch\_tensor_str.py", line 138, in __init__
    tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0)
RuntimeError: The program was built for 1 devices
Build program log for 'Intel(R) Iris(R) Xe Graphics':
-11 (PI_ERROR_BUILD_PROGRAM_FAILURE)
```
I think it is able to move the tensor to the XPU, but it fails when I try to read the values back. Hoping someone can help with this.
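Before digging further, it may help to confirm whether the runtime can see a device at all, separately from any tensor ops. Here is a minimal sketch; the helper name `xpu_ready` is mine, and it assumes the IPEX convention that importing `intel_extension_for_pytorch` registers `torch.xpu`:

```python
import importlib.util

def xpu_ready():
    """Return (ok, detail) without crashing on machines that lack IPEX."""
    # Bail out early if the extension is not even installed.
    if importlib.util.find_spec("intel_extension_for_pytorch") is None:
        return False, "intel_extension_for_pytorch is not installed"
    import torch
    import intel_extension_for_pytorch  # noqa: F401  (registers torch.xpu)
    if not hasattr(torch, "xpu") or not torch.xpu.is_available():
        return False, "PyTorch sees no XPU device"
    return True, torch.xpu.get_device_name(0)

ok, detail = xpu_ready()
print(ok, detail)
```

If this prints `False` with "no XPU device", the failure is in the driver/runtime stack rather than in the tensor code above.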
I also tried training a dummy linear model and got the same error:
```python
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device with PyTorch
import torch.nn as nn
import torch.optim as optim

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(100, 500)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel().to("xpu:0")
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

input_data = torch.randn(64, 100).to("xpu:0")
target_data = torch.randn(64, 500).to("xpu:0")

for epoch in range(10):
    model.train()
    outputs = model(input_data)
    loss = criterion(outputs, target_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f'Epoch [{epoch+1}/10], Loss: {loss.item():.4f}')
```
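For what it's worth, the IPEX documentation recommends passing the model and optimizer through `ipex.optimize` before training on XPU. Below is a hedged sketch of that pattern with a CPU fallback so it runs even where no XPU is present; the helpers `pick_device` and `train_once` are mine, not part of any API:

```python
import importlib.util

def pick_device():
    """Fall back to CPU when the IPEX/XPU stack is not usable (helper name is mine)."""
    if importlib.util.find_spec("intel_extension_for_pytorch") is None:
        return "cpu"
    import torch
    import intel_extension_for_pytorch  # noqa: F401  (registers torch.xpu)
    return "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu"

def train_once():
    """Run one training step of the dummy linear model; returns the loss or None."""
    if importlib.util.find_spec("torch") is None:
        return None  # torch not installed; nothing to demonstrate
    import torch
    import torch.nn as nn
    import torch.optim as optim

    device = pick_device()
    model = nn.Linear(100, 500).to(device)
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    if device == "xpu":
        import intel_extension_for_pytorch as ipex
        # Per the IPEX docs, ipex.optimize returns a tuned (model, optimizer) pair
        model, optimizer = ipex.optimize(model, optimizer=optimizer)

    loss = nn.MSELoss()(model(torch.randn(64, 100, device=device)),
                        torch.randn(64, 500, device=device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(train_once())
```

This doesn't change the error you're seeing, but it rules out the plain-eager path being the only thing exercised.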
**Describe the bug**

Why is it returning no XPU devices? I have Iris Xe Graphics with an i5-1135G7 CPU.
I have followed all the installation steps here: https://intel.github.io/intel-extension-for-pytorch/#installation?platform=gpu&version=v2.1.40%2bxpu&os=linux%2fwsl2&package=pip
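Since the environment dump below shows `Is XPU available: False` under WSL2, one thing worth checking (a sketch of a diagnostic, not a guaranteed root cause) is whether the WSL2 GPU paravirtualization node and the SYCL device list are visible from inside the guest:

```shell
# /dev/dxg is the standard WSL2 GPU paravirtualization node;
# if it is missing, no GPU reaches the Linux guest at all.
if [ -e /dev/dxg ]; then
    echo "/dev/dxg present: WSL2 GPU passthrough is active"
else
    echo "/dev/dxg missing: update Windows/WSL and the Windows GPU driver"
fi

# sycl-ls ships with the oneAPI runtime and lists devices the SYCL stack can see.
command -v sycl-ls >/dev/null && sycl-ls || echo "sycl-ls not found (oneAPI env not sourced?)"
```

If `sycl-ls` shows no Level Zero GPU entry, IPEX will report `Is XPU available: False` regardless of how the Python packages were installed.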
**Versions**
```
(ml) techpowerb@ruby:~$ python collect_env.py
Collecting environment information...
PyTorch version: 2.1.0.post3+cxx11.abi
PyTorch CXX11 ABI: Yes
IPEX version: 2.1.40+xpu
IPEX commit: 80ed47655
Build type: Release

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: N/A
IGC version: 2024.2.1 (2024.2.1.20240711)
CMake version: N/A
Libc version: glibc-2.35

Python version: 3.9.19 (main, May 6 2024, 19:43:03) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is XPU available: False
DPCPP runtime version: 2024.2
MKL version: 2024.2
GPU models and configuration:
Intel OpenCL ICD version: 23.17.26241.33-647~22.04
Level Zero version: 1.3.26241.33-647~22.04

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
CPU family: 6
Model: 140
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
BogoMIPS: 4838.39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect md_clear flush_l1d arch_capabilities
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 192 KiB (4 instances)
L1i cache: 128 KiB (4 instances)
L2 cache: 5 MiB (4 instances)
L3 cache: 8 MiB (1 instance)
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Mitigation; Enhanced IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.1.40+xpu
[pip3] numpy==1.26.4
[pip3] torch==2.1.0.post3+cxx11.abi
[pip3] torchaudio==2.1.0.post3+cxx11.abi
[pip3] torchvision==0.16.0.post3+cxx11.abi
[conda] intel-extension-for-pytorch 2.1.40+xpu pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchaudio 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchvision 0.16.0.post3+cxx11.abi pypi_0 pypi
```