intel / intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
Apache License 2.0

Slow initialization of ipex v2.3.110 on Windows 11 #725

Open 0Pinky0 opened 2 days ago

0Pinky0 commented 2 days ago

Describe the bug

I am trying to use ipex on Windows 11. The first time I run a Python script, each operation (creating tensors on xpu, indexing tensors, computing gradients) takes a long time; running the same script again is much faster. Recently, however, the slow start has come back whenever I have not used ipex for a few days, and I now have to wait through this slow initialization almost every day. I plan to install ipex on a remote Windows device and train a model there within a limited number of hours, so it would be fatal if the same problem occurred on the remote device. I found a similar issue, https://github.com/intel/intel-extension-for-pytorch/issues/721, which was solved by building ipex from source, but in my situation ipex can only be installed through pip. Is there a way to fix this for a pip-installed ipex on Windows?
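
The symptom described above (slow first run, fast repeats) can be quantified with a small timing script. This is only a sketch: `time_calls` and `make_xpu_workload` are hypothetical helper names, the tensor size is arbitrary, and the workload assumes `torch` and `intel_extension_for_pytorch` are installed with a working xpu device.

```python
import time


def time_calls(fn, repeats=3):
    """Call fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def make_xpu_workload(size=256):
    """Build a workload covering the three operations named in the report.

    Imports happen here, outside the timed region, so import time is not
    counted as part of the first call.
    """
    import torch
    import intel_extension_for_pytorch  # noqa: F401  (registers the xpu device)

    def workload():
        # Create a tensor on xpu, index into it, and compute a gradient.
        x = torch.randn(size, size, device="xpu", requires_grad=True)
        y = (x[0] * 2).sum()
        y.backward()
        # Wait for queued device work so the timing covers the real cost.
        torch.xpu.synchronize()

    return workload


# Usage, in a fresh Python process on the affected machine:
#   first, *later = time_calls(make_xpu_workload())
#   print(f"first call: {first:.2f}s, later calls: {later}")
```

A first call that is far slower than the later ones, and that becomes slow again after a period of not using ipex, matches the behavior reported here.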

Versions

PyTorch version: 2.3.1+cxx11.abi
PyTorch CXX11 ABI: No
IPEX version: 2.3.110+xpu
IPEX commit: 95c945927
Build type: Release

OS: Microsoft Windows 11 Home Chinese Edition (家庭中文版)
GCC version: N/A
Clang version: N/A
IGC version: N/A
CMake version: N/A
Libc version: N/A

Python version: 3.8.20 (default, Oct 3 2024, 15:19:54) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is XPU available: True
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration:
[0] _XpuDeviceProperties(name='Intel(R) Arc(TM) Graphics', platform_name='Intel(R) Level-Zero', type='gpu', driver_version='1.3.28328', total_memory=30115MB, max_compute_units=128, gpu_eu_count=128, gpu_subslice_count=16, max_work_group_size=1024, max_num_sub_groups=128, sub_group_sizes=[8 16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
Architecture=9
CurrentClockSpeed=1400
DeviceID=CPU0
Family=774
L2CacheSize=18432
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=1400
Name=Intel(R) Core(TM) Ultra 7 155H
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.3.110+xpu
[pip3] numpy==1.24.4
[pip3] torch==2.3.1+cxx11.abi
[conda] intel-extension-for-pytorch 2.3.110+xpu pypi_0 pypi
[conda] mkl 2024.2.1 pypi_0 pypi
[conda] mkl-dpcpp 2024.2.1 pypi_0 pypi
[conda] numpy 1.24.4 pypi_0 pypi
[conda] onemkl-sycl-blas 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-datafitting 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-dft 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-lapack 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-rng 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-sparse 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-stats 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-vm 2024.2.1 pypi_0 pypi
[conda] torch 2.3.1+cxx11.abi pypi_0 pypi

iori2333 commented 2 days ago

Maybe you can try PyTorch 2.5, which supports the Ultra 7 155H; nightly build binaries for Windows are provided as well.

0Pinky0 commented 2 days ago

Thanks for the advice! Though PyTorch 2.5 won't support Python 3.8 anymore lol

wangkl2 commented 1 day ago

@0Pinky0 JIT compilation takes a long time when the installed binary does not match the AOT backend target of your XPU device. For IPEX 2.3.110 on Windows, we provide separate PyPI wheels for different devices.

From your output, you are using the Arc iGPU integrated in MTL-H. Please use the following installation commands instead:

# For Intel® Core™ Ultra Processors with Intel® Arc™ Graphics (MTL-H), use the commands below:
conda install libuv
python -m pip install torch==2.3.1+cxx11.abi torchvision==0.18.1+cxx11.abi torchaudio==2.3.1+cxx11.abi intel-extension-for-pytorch==2.3.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/mtl/us/

0Pinky0 commented 22 hours ago

Thanks for the help. However, there is still a troublesome problem preventing me from using ipex, tracked in these two issues: https://github.com/intel/intel-extension-for-pytorch/issues/710 https://github.com/intel/intel-extension-for-pytorch/issues/717

wangkl2 commented 21 hours ago

@0Pinky0 The dev team is looking into the import issues on Python 3.8, 3.9, and 3.11. If possible, would you mind using Python 3.10 on Windows in the meantime? Please stay tuned.
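
Before reinstalling, it may help to confirm which Python version an environment is actually running. The sketch below encodes only what the comment above states (import issues on 3.8, 3.9, and 3.11; 3.10 suggested on Windows); `check_python` and the two constants are hypothetical names, and the status of any other Python version is not covered in this thread.

```python
import sys

# Versions named in the thread as having import issues under investigation.
AFFECTED = {(3, 8), (3, 9), (3, 11)}
# Version suggested in the thread as a workaround on Windows.
SUGGESTED = (3, 10)


def check_python(version_info=sys.version_info):
    """Classify a Python version against the guidance given in this thread."""
    major_minor = (version_info[0], version_info[1])
    if major_minor == SUGGESTED:
        return "suggested"
    if major_minor in AFFECTED:
        return "affected"
    # The thread says nothing about other versions.
    return "unknown"


# Usage: print(check_python()) inside the target conda environment.
```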