intel / intel-extension-for-tensorflow

Intel® Extension for TensorFlow*

intel-extension-for-tensorflow cannot utilize Intel GPU #59

Closed SamanwaySadhu closed 6 months ago

SamanwaySadhu commented 7 months ago

I'm using an Intel(R) Data Center GPU Max 1550 to run some workloads, and I installed intel-extension-for-tensorflow with the following command: pip install tensorflow==2.14 intel-extension-for-tensorflow[xpu]==2.14 intel-optimization-for-horovod==0.28.1.1

I have already verified that all the GPU drivers are installed properly. However, the GPU is not utilized, and running the following command: python -c "import intel_extension_for_tensorflow as itex; print(itex.__version__)" produces the output below:

2024-01-26 11:08:48.005514: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-01-26 11:08:48.007204: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-26 11:08:48.032748: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-26 11:08:48.032769: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-26 11:08:48.032793: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-26 11:08:48.037812: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-26 11:08:48.037954: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-26 11:08:48.562654: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-26 11:08:48.829211: W itex/core/wrapper/itex_gpu_wrapper.cc:32] Could not load dynamic library: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or directory
2024-01-26 11:08:48.863089: I itex/core/wrapper/itex_cpu_wrapper.cc:42] Intel Extension for Tensorflow AVX512 CPU backend is loaded.
2024-01-26 11:08:48.895381: W itex/core/ops/op_init.cc:58] Op: _QuantizedMaxPool3D is already registered in Tensorflow
2024-01-26 11:08:48.904827: E itex/core/wrapper/itex_gpu_wrapper.cc:49] Could not load Intel Extension for Tensorflow GPU backend, GPU will not be used. If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
2024-01-26 11:08:48.904961: E itex/core/wrapper/itex_gpu_wrapper.cc:49] Could not load Intel Extension for Tensorflow* GPU backend, GPU will not be used. If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
2.14.0.0
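
Most of the lines in a log like this come from files under tensorflow/, while the ones that decide whether the Intel GPU is used come from itex/core/wrapper/itex_gpu_wrapper.cc. As a rough aid for reading such output, the sketch below scans a log for those wrapper lines only. It assumes one log entry per line and the message wording seen in this thread; the function name `itex_gpu_status` is hypothetical and not part of the extension.

```python
import re

# TensorFlow-style log line: "<date> <time>: <LEVEL> <file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^\S+ \S+: (?P<level>[IWEF]) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


def itex_gpu_status(log_text):
    """Return 'loaded', 'failed', or 'unknown' from itex_gpu_wrapper.cc lines.

    Hypothetical helper: the matched phrases are copied from the logs in this
    thread and may differ in other versions of the extension.
    """
    status = "unknown"
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Skip anything that is not a log entry from the ITEX GPU wrapper.
        if not match or "itex_gpu_wrapper.cc" not in match["source"]:
            continue
        message = match["message"]
        if "GPU backend is loaded" in message:
            status = "loaded"
        elif message.startswith("Could not load"):
            # A missing library or backend means the GPU will not be used.
            return "failed"
    return status
```

Feeding it the captured stderr of the import command should return "failed" whenever the wrapper reports a library or backend it could not load, regardless of the CUDA-related messages around it.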

guizili0 commented 7 months ago

The issue is that oneMKL is not activated; you need to set up your environment for Intel oneAPI. For more details, see: https://github.com/intel/intel-extension-for-tensorflow/blob/main/docs/install/install_for_xpu.md#install-oneapi-base-toolkit-packages

2024-01-26 11:08:48.829211: W itex/core/wrapper/itex_gpu_wrapper.cc:32] Could not load dynamic library: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or directory
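
One way to check whether the oneAPI environment is active in the current shell is to ask the dynamic loader for the library named in that warning. The following is a minimal sketch using Python's standard `ctypes` module; the helper name `can_load_shared_library` is made up for illustration, and a successful load only shows that the library is on the loader's search path, not that the GPU backend as a whole will work.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open `name` in this process."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the file is not found on the loader's search path
        # or one of its own dependencies cannot be resolved.
        return False
    return True


if __name__ == "__main__":
    # Library name copied from the warning in the log line quoted above.
    library = "libmkl_sycl_blas.so.4"
    if can_load_shared_library(library):
        print(f"{library} can be loaded")
    else:
        print(f"{library} cannot be loaded; is the oneAPI environment set up?")
```

Run it from the same shell that will run the TensorFlow workload, because the loader's search path is inherited from that shell's environment.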

djsv23 commented 7 months ago

I am experiencing the same issue, running an Arc A750 on Ubuntu 22.04. I've followed the instructions to ensure oneMKL is activated, but running env_check.sh still gives the error about not finding CUDA drivers:

` Check Environment for Intel(R) Extension for TensorFlow*...

======================== Check Python ========================

python3.9 is installed.

==================== Check Python Passed =====================

========================== Check OS ==========================

OS ubuntu:22.04 is Supported.

====================== Check OS Passed =======================

====================== Check Tensorflow ======================

Tensorflow2.14 is installed.

================== Check Tensorflow Passed ===================

=================== Check Intel GPU Driver ===================

Intel(R) graphics runtime intel-level-zero-gpu-1.3.27191.42-775 is installed, but is not recommended .
Intel(R) graphics runtime intel-opencl-icd-23.35.27191.42-775 is installed, but is not recommended .
Intel(R) graphics runtime level-zero-1.14.0-744 is installed, but is not recommended .
Intel(R) graphics runtime libigc1-1.0.15136.24-775 is installed, but is not recommended .
Intel(R) graphics runtime libigdfcl1-1.0.15136.24-775 is installed, but is not recommended .
Intel(R) graphics runtime libigdgmm12-22.3.12-742 is installed, but is not recommended .

=============== Check Intel GPU Driver Finshed ================

===================== Check Intel oneAPI =====================

Intel(R) oneAPI DPC++/C++ Compiler is installed.
Intel(R) oneAPI Math Kernel Library is installed.

================= Check Intel oneAPI Passed ==================

========================== Check Devices Availability ==========================

2024-02-01 10:55:06.186663: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-02-01 10:55:06.188042: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-01 10:55:06.206690: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-01 10:55:06.206709: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-01 10:55:06.206733: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-01 10:55:06.211135: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-01 10:55:06.211256: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-01 10:55:06.688400: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-01 10:55:06.965341: I itex/core/wrapper/itex_cpu_wrapper.cc:60] Intel Extension for Tensorflow AVX512 CPU backend is loaded.
2024-02-01 10:55:08.016040: I itex/core/wrapper/itex_gpu_wrapper.cc:35] Intel Extension for Tensorflow GPU backend is loaded.
2024-02-01 10:55:08.090198: I itex/core/devices/gpu/itex_gpu_runtime.cc:129] Selected platform: Intel(R) Level-Zero
2024-02-01 10:55:08.090492: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
2024-02-01 10:55:08.563296: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

====================== Check Devices Availability Passed ======================= `
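
In this second log, the itex/ lines report that the GPU backend was loaded and that the Level-Zero platform was selected, while the CUDA messages all come from source files under tensorflow/.../cuda/, so they appear to concern NVIDIA support in stock TensorFlow, not the Intel GPU. A more direct check is to ask TensorFlow which devices it sees. The sketch below assumes the extension registers Intel GPUs under the `XPU` device type, as its documentation describes; `list_xpu_devices` is a hypothetical helper name.

```python
def list_xpu_devices():
    """Return names of visible XPU devices, or None if TensorFlow is missing.

    Assumption: Intel Extension for TensorFlow exposes Intel GPUs to
    TensorFlow with the device type "XPU".
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # tf.config.list_physical_devices accepts an optional device type filter.
    return [device.name for device in tf.config.list_physical_devices("XPU")]


if __name__ == "__main__":
    devices = list_xpu_devices()
    if devices is None:
        print("TensorFlow is not installed in this environment")
    elif devices:
        print("XPU devices:", devices)
    else:
        print("No XPU devices are visible to TensorFlow")
```

An empty list with the extension installed would point back at the driver or oneAPI setup discussed earlier in the thread.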

srinarayan-srikanthan commented 7 months ago

Hi @SamanwaySadhu, can you please try the driver suggested here: https://github.com/intel/intel-extension-for-tensorflow/blob/main/docs/install/install_for_xpu.md#install-gpu-drivers

srinarayan-srikanthan commented 6 months ago

Were you able to get it to work? @SamanwaySadhu

SamanwaySadhu commented 6 months ago

> Were you able to get it to work? @SamanwaySadhu

@srinarayan-srikanthan it is working now. Thank you very much.