Closed: psmgeelen closed this issue 1 year ago
Maybe a relevant side-question could be: how can I validate that I am actually offloading to the GPU?
You can use verbose mode to see which device was used: https://intel.github.io/scikit-learn-intelex/verbose.html
@Alexsandruss, awesome! So I can confirm that it is running on CPU, regardless of whether I call set_config(target_offload="gpu:0") or not. The logging returns:
SKLEARNEX INFO: sklearn.utils.validation._assert_all_finite: running accelerated version on CPU
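For longer runs it can help to summarize the verbose output per function rather than reading it line by line. The following is a minimal sketch, assuming the log lines look like the two formats quoted in this thread ("SKLEARNEX INFO: ..." and "INFO:sklearnex: ..."); the helper name `devices_by_function` is hypothetical and not part of scikit-learn-intelex.

```python
import re
from collections import defaultdict

# Matches the two line formats quoted in this thread (an assumption about
# the log layout, not an official specification):
#   "SKLEARNEX INFO: <function>: running accelerated version on CPU"
#   "INFO:sklearnex: <function>: running accelerated version on GPU"
LOG_PATTERN = re.compile(
    r"^(?:SKLEARNEX INFO|INFO:sklearnex):\s*(?P<func>[\w.]+):\s*"
    r"running accelerated version on (?P<device>CPU|GPU)\s*$"
)


def devices_by_function(log_text):
    """Map each logged function name to the set of devices it ran on."""
    result = defaultdict(set)
    for line in log_text.splitlines():
        match = LOG_PATTERN.match(line.strip())
        if match:
            result[match.group("func")].add(match.group("device"))
    return dict(result)


# Sample lines copied from this thread.
sample = """\
SKLEARNEX INFO: sklearn.utils.validation._assert_all_finite: running accelerated version on CPU
INFO:sklearnex: sklearn.cluster.DBSCAN.fit: running accelerated version on GPU
"""
summary = devices_by_function(sample)
for func, devices in sorted(summary.items()):
    print(func, sorted(devices))
```

A summary like this makes it easier to see whether the estimator call itself (for example `DBSCAN.fit`) reported GPU, separately from helper functions such as input validation.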
I am thinking aloud here: could this be a precision issue with the data itself? Perhaps I am using a precision that is not compatible with the GPU, so it falls back to the CPU?
@psmgeelen you can try other algorithms in the meantime. DBSCAN has some specifics that set it apart.
@psmgeelen could you please list your conda env as well? It would be very useful for reproducing
@napetrov, thanks for responding. I am trying to do some benchmarking with an Intel GPU and followed the compatibility list in the documentation (https://intel.github.io/scikit-learn-intelex/algorithms.html). The algorithms that I have tried running are:
@samir-nasibli , I listed my environment using conda list
and it returned:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
bzip2 1.0.8 hb9a14ef_9 intel
ca-certificates 2023.01.10 h06a4308_0 intel
certifi 2022.12.7 py39h06a4308_0 intel
daal4py 2023.1.1 py39_intel_48679 intel
dal 2023.1.1 intel_48679 intel
dpcpp-cpp-rt 2023.1.0 intel_46305 intel
dpcpp_cpp_rt 2023.1.0 intel_46305 intel
dpctl 0.14.2 py39ha23a21d_9 intel
fortran_rt 2023.1.0 intel_46305 intel
glob2 0.7 py_0 conda-forge
icc_rt 2023.1.0 intel_46305 intel
impi_rt 2021.9.0 intel_43482 intel
intel-cmplr-lib-rt 2023.1.0 intel_46305 intel
intel-cmplr-lic-rt 2023.1.0 intel_46305 intel
intel-fortran-rt 2023.1.0 intel_46305 intel
intel-opencl-rt 2023.1.0 intel_46305 intel
intel-openmp 2023.1.0 intel_46305 intel
intelpython 2023.1.0 1 intel
joblib 1.2.0 pyh3f38642_0 intel
libffi 3.3 14 intel
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
mkl 2023.1.0 intel_46342 intel
mkl-service 2.4.0 py39h75d02e3_15 intel
mkl_fft 1.3.5 py39h28f0b46_11 intel
mkl_random 1.2.2 py39h0b06908_51 intel
mkl_umath 0.1.1 py39h450dca2_61 intel
ncurses 6.4 h6a678d5_0 intel
numpy 1.23.5 py39h52df89b_7 intel
numpy-base 1.23.5 py39ha03f565_7 intel
openssl 1.1.1t h7f8727e_0 intel
pandas 1.5.3 py39h6cd0baa_0 intel
pip 23.0.1 py39h06a4308_0 intel
python 3.9.16 h2722d68_1 intel
python-dateutil 2.8.2 py39_2 intel
pytz 2022.7 py39h06a4308_0 intel
readline 8.2 h5eee18b_0 intel
scikit-learn 1.2.1 py39h6a678d5_0 intel
scikit-learn-intelex 2023.1.1 py39_intel_48679 intel
scipy 1.7.3 py39h4ca98da_8 intel
setuptools 65.6.3 py39h06a4308_0 intel
six 1.16.0 pyhd3eb1b0_1 intel
sqlite 3.41.1 h5eee18b_0 intel
tbb 2021.9.0 intel_43484 intel
tbb4py 2021.9.0 py39_intel_43484 intel
threadpoolctl 2.2.0 pyh0d69192_0 intel
tk 8.6.12 h1ccaba5_0 intel
tqdm 4.64.0 py39h06a4308_0 intel
wheel 0.38.4 py39h06a4308_0 intel
xz 5.2.8 h5eee18b_0 intel
zlib 1.2.13 h5eee18b_0 intel
Thank you @psmgeelen! Could you please also share which system platforms dpctl reports (python -m dpctl -f) and the output of ls -al $OCL_ICD_VENDORS?
DBSCAN has some implementation specifics: it requires an OpenCL loader to be installed in the environment.
I see that you already have intel-opencl-rt, but there is a known issue (https://github.com/IntelPython/dpctl/issues/1006), and OCL_ICD_VENDORS should point to the GPU.
Also, please update your conda env via: conda update -c intel -c conda-forge --all. daal4py and scikit-learn-intelex 2023.2 are already available.
Hi @samir-nasibli, python -m dpctl -f
returned:
Platform 0 ::
Name Intel(R) OpenCL
Version OpenCL 3.0 LINUX
Vendor Intel(R) Corporation
Backend opencl
Num Devices 1
# 0
Name AMD Ryzen 9 5950X 16-Core Processor
Version 2023.16.6.0.22_223734
Filter string opencl:cpu:0
Platform 1 ::
Name Intel(R) FPGA Emulation Platform for OpenCL(TM)
Version OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
Vendor Intel(R) Corporation
Backend opencl
Num Devices 1
# 0
Name Intel(R) FPGA Emulation Device
Version 2023.16.6.0.22_223734
Filter string opencl:accelerator:0
Platform 2 ::
Name Intel(R) OpenCL Graphics
Version OpenCL 3.0
Vendor Intel(R) Corporation
Backend opencl
Num Devices 1
# 0
Name Intel(R) Arc(TM) A770 Graphics
Version 23.17.26241.33
Filter string opencl:gpu:0
Platform 3 ::
Name Intel(R) FPGA Emulation Platform for OpenCL(TM)
Version OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
Vendor Intel(R) Corporation
Backend opencl
Num Devices 1
# 0
Name Intel(R) FPGA Emulation Device
Version 2023.16.6.0.22_223734
Filter string opencl:accelerator:1
Platform 4 ::
Name Intel(R) Level-Zero
Version 1.3
Vendor Intel(R) Corporation
Backend ext_oneapi_level_zero
Num Devices 1
# 0
Name Intel(R) Arc(TM) A770 Graphics
Version 1.3.26241
Filter string level_zero:gpu:0
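To pick out the GPU entries from a report like the one above without reading it by eye, the text can be parsed directly. This is a minimal sketch that assumes the layout shown in this thread ("Platform N ::" headers, "# N" device headers, and "Name" / "Filter string" rows); the function name `parse_dpctl_devices` is hypothetical, and the sample is a shortened excerpt of the output above.

```python
def parse_dpctl_devices(report):
    """Return a list of {'name': ..., 'filter': ...} dicts, one per device."""
    devices = []
    in_device = False  # True once a "# <index>" device header has been seen
    current = {}
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("Platform"):
            in_device = False  # platform-level Name/Version rows are skipped
        elif line.startswith("#"):
            in_device = True
            current = {}
        elif in_device and line.startswith("Name"):
            current["name"] = line[len("Name"):].strip()
        elif in_device and line.startswith("Filter string"):
            current["filter"] = line[len("Filter string"):].strip()
            devices.append(current)
    return devices


# Shortened excerpt of the report posted in this thread.
sample_report = """\
Platform 0 ::
Name Intel(R) OpenCL
Backend opencl
Num Devices 1
# 0
Name AMD Ryzen 9 5950X 16-Core Processor
Filter string opencl:cpu:0
Platform 2 ::
Name Intel(R) OpenCL Graphics
Backend opencl
Num Devices 1
# 0
Name Intel(R) Arc(TM) A770 Graphics
Filter string opencl:gpu:0
Platform 4 ::
Name Intel(R) Level-Zero
Backend ext_oneapi_level_zero
Num Devices 1
# 0
Name Intel(R) Arc(TM) A770 Graphics
Filter string level_zero:gpu:0
"""

devices = parse_dpctl_devices(sample_report)
gpu_filters = [d["filter"] for d in devices if ":gpu:" in d["filter"]]
print(gpu_filters)
```

In this report the Arc A770 appears under both the OpenCL and the Level-Zero backends, so the GPU itself is visible to dpctl.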
I think I might be doing something wrong when running ls -al $OCL_ICD_VENDORS, as it just lists the current directory.
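A likely explanation for that behaviour is that the variable is unset: the shell then expands `$OCL_ICD_VENDORS` to nothing, and the command becomes a plain `ls -al`. The sketch below checks the variable explicitly. It assumes the variable, when set, names a directory containing `.icd` files; the helper name `describe_icd_vendors` and the file name `example.icd` are hypothetical.

```python
import os
import tempfile


def describe_icd_vendors(environ=None):
    """Report how OCL_ICD_VENDORS is set, without assuming it exists."""
    environ = os.environ if environ is None else environ
    value = environ.get("OCL_ICD_VENDORS", "")
    if not value:
        # An unset or empty variable expands to nothing in the shell, so
        # `ls -al $OCL_ICD_VENDORS` silently becomes plain `ls -al`.
        return "unset"
    if not os.path.isdir(value):
        return "path is not a directory: " + value
    icd_files = sorted(f for f in os.listdir(value) if f.endswith(".icd"))
    if not icd_files:
        return "directory without .icd files: " + value
    return "directory with .icd files: " + ", ".join(icd_files)


# State of the real environment this script runs in.
print(describe_icd_vendors())

# Demonstration with a throwaway directory holding a hypothetical ICD file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.icd"), "w").close()
    print(describe_icd_vendors({"OCL_ICD_VENDORS": tmp}))
```

If the first call prints "unset", the `ls` output from the current directory says nothing about the OpenCL vendor files themselves.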
I have updated the conda environment by running conda update -c intel -c conda-forge --all
and the environment is now:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge intel
_openmp_mutex 4.5 2_gnu intel
bzip2 1.0.8 hb9a14ef_9 intel
ca-certificates 2023.5.7 hbcca054_0 intel
certifi 2022.12.7 py39h06a4308_0 intel
colorama 0.4.6 pyhd8ed1ab_0 intel
daal4py 2023.2.0 py39_intel_49572 intel
dal 2023.2.0 intel_49572 intel
dpcpp-cpp-rt 2023.2.0 intel_49495 intel
dpcpp_cpp_rt 2023.2.0 intel_49495 intel
dpctl 0.14.5 py39he78b74f_24 intel
fortran_rt 2023.1.0 intel_46305 intel
glob2 0.7 py_0 conda-forge
icc_rt 2023.2.0 intel_49495 intel
impi_rt 2021.9.0 intel_43482 intel
intel-cmplr-lib-rt 2023.2.0 intel_49495 intel
intel-cmplr-lic-rt 2023.2.0 intel_49495 intel
intel-fortran-rt 2023.2.0 intel_49495 intel
intel-opencl-rt 2023.2.0 intel_49495 intel
intel-openmp 2023.2.0 intel_49495 intel
intelpython 2023.2.0 0 intel
joblib 1.2.0 pyh3f38642_0 intel
level-zero 1.11.0 h00ab1b0_0 intel
libffi 3.4.2 h7f98852_5 intel
libgcc-ng 12.2.0 h65d4601_19 intel
libgomp 12.2.0 h65d4601_19 intel
libnsl 2.0.0 h7f98852_0 intel
libsqlite 3.42.0 h2797004_0 intel
libstdcxx-ng 12.2.0 h46fd767_19 intel
libuuid 2.38.1 h0b41bf4_0 intel
libzlib 1.2.13 hd590300_5 intel
mkl 2023.2.0 intel_49495 intel
mkl-service 2.4.0 py39h75d02e3_15 intel
mkl_fft 1.3.6 py39h173b8ae_56 intel
mkl_random 1.2.2 py39h1595b48_76 intel
mkl_umath 0.1.1 py39hd987cd3_86 intel
ncurses 6.4 hcb278e6_0 intel
numpy 1.24.3 py39hed7eef7_0 intel
numpy-base 1.24.3 py39he88ecf9_0 intel
openssl 3.1.1 hd590300_1 intel
pandas 1.5.3 py39h6cd0baa_0 intel
pip 23.1.2 pyhd8ed1ab_0 intel
python 3.9.16 hef7c979_23 intel
python-dateutil 2.8.2 py39_2 intel
pytz 2022.7 py39h06a4308_0 intel
readline 8.2 h8228510_1 intel
scikit-learn 1.2.1 py39h6a678d5_0 intel
scikit-learn-intelex 2023.2.0 py39_intel_49572 intel
scipy 1.7.3 py39h4ca98da_8 intel
setuptools 67.7.2 pyhd8ed1ab_0 intel
six 1.16.0 pyhd3eb1b0_1 intel
sqlite 3.41.1 h5eee18b_0 intel
tbb 2021.9.0 intel_43484 intel
tbb4py 2021.9.0 py39_intel_43484 intel
threadpoolctl 2.2.0 pyh0d69192_0 intel
tk 8.6.12 h1ccaba5_0 intel
tqdm 4.65.0 pyhd8ed1ab_1 intel
tzdata 2023c h71feb2d_0 intel
wheel 0.40.0 pyhd8ed1ab_0 intel
xz 5.2.8 h5eee18b_0 intel
zlib 1.2.13 hd590300_5 intel
I think the update broke the environment, as I now get this error:
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
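The versions in this thread line up with that message: the `np.float` alias was deprecated in NumPy 1.20 and removed in NumPy 1.24, so the environment with numpy 1.23.5 still worked while 1.24.3 raised the error. A minimal sketch of that version check follows; the helper name `numpy_removed_float_alias` is hypothetical, and the version strings are the ones listed in the environments above.

```python
def numpy_removed_float_alias(version):
    """True if this NumPy version no longer provides the `np.float` alias."""
    major, minor = (int(part) for part in version.split(".")[:2])
    # The alias was removed in the 1.24 release series.
    return (major, minor) >= (1, 24)


# NumPy versions that appear in the conda listings in this thread.
for v in ("1.22.3", "1.23.5", "1.24.3"):
    print(v, numpy_removed_float_alias(v))
```

In code that you control, the fix suggested by the error message itself is to replace `np.float` with the builtin `float` (or `np.float64` where the NumPy scalar type is specifically wanted); pinning an older NumPy is the workaround when the call sits inside a dependency.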
I resolved this by running: conda install -c intel -c conda-forge numpy=1.22
and now have an environment like this:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge intel
_openmp_mutex 4.5 2_gnu intel
bzip2 1.0.8 hb9a14ef_9 intel
ca-certificates 2023.5.7 hbcca054_0 intel
certifi 2023.5.7 pyhd8ed1ab_0 intel
colorama 0.4.6 pyhd8ed1ab_0 intel
daal4py 2023.2.0 py39_intel_49572 intel
dal 2023.2.0 intel_49572 intel
dpcpp-cpp-rt 2023.2.0 intel_49495 intel
dpcpp_cpp_rt 2023.2.0 intel_49495 intel
dpctl 0.14.5 py39he78b74f_24 intel
fortran_rt 2023.1.0 intel_46305 intel
glob2 0.7 py_0 conda-forge
icc_rt 2023.2.0 intel_49495 intel
impi_rt 2021.9.0 intel_43482 intel
intel-cmplr-lib-rt 2023.2.0 intel_49495 intel
intel-cmplr-lic-rt 2023.2.0 intel_49495 intel
intel-fortran-rt 2023.2.0 intel_49495 intel
intel-opencl-rt 2023.2.0 intel_49495 intel
intel-openmp 2023.2.0 intel_49495 intel
intelpython 2023.2.0 0 intel
joblib 1.2.0 pyh3f38642_0 intel
level-zero 1.11.0 h00ab1b0_0 intel
libffi 3.4.2 h7f98852_5 intel
libgcc-ng 12.2.0 h65d4601_19 intel
libgomp 12.2.0 h65d4601_19 intel
libnsl 2.0.0 h7f98852_0 intel
libsqlite 3.42.0 h2797004_0 intel
libstdcxx-ng 12.2.0 h46fd767_19 intel
libuuid 2.38.1 h0b41bf4_0 intel
libzlib 1.2.13 hd590300_5 intel
mkl 2023.2.0 intel_49495 intel
mkl-service 2.4.0 py39h75d02e3_15 intel
mkl_fft 1.3.1 py39hcab1719_22 intel
mkl_random 1.2.2 py39hbf47bc3_22 intel
mkl_umath 0.1.1 py39hf66a691_32 intel
ncurses 6.4 hcb278e6_0 intel
numpy 1.22.3 py39hf0956d0_5 intel
numpy-base 1.22.3 py39h45c9ace_5 intel
openssl 3.1.1 hd590300_1 intel
pandas 1.5.3 py39h6cd0baa_0 intel
pip 23.1.2 pyhd8ed1ab_0 intel
python 3.9.16 hef7c979_23 intel
python-dateutil 2.8.2 py39_2 intel
pytz 2022.7 py39h06a4308_0 intel
readline 8.2 h8228510_1 intel
scikit-learn 1.2.1 py39h6a678d5_0 intel
scikit-learn-intelex 2023.2.0 py39_intel_49572 intel
scipy 1.7.3 py39h4ca98da_8 intel
setuptools 67.7.2 pyhd8ed1ab_0 intel
six 1.16.0 pyhd3eb1b0_1 intel
sqlite 3.41.1 h5eee18b_0 intel
tbb 2021.9.0 intel_43484 intel
tbb4py 2021.9.0 py39_intel_43484 intel
threadpoolctl 2.2.0 pyh0d69192_0 intel
tk 8.6.12 h1ccaba5_0 intel
tqdm 4.65.0 pyhd8ed1ab_1 intel
tzdata 2023c h71feb2d_0 intel
wheel 0.40.0 pyhd8ed1ab_0 intel
xz 5.2.8 h5eee18b_0 intel
zlib 1.2.13 hd590300_5 intel
When I run my script I still get: INFO:sklearnex: sklearn.utils.validation._assert_all_finite: running accelerated version on CPU
even though I offloaded to the GPU.
Is there anything else I can do to support the process?
Hi @psmgeelen! Unfortunately, I could not reproduce your issue: I am getting GPU offloading. Let me investigate further, and I will let you know.
@samir-nasibli , I might be doing something stupid. I ran this:
from sklearnex import patch_sklearn, config_context
import numpy as np
import logging
logger = logging.getLogger('sklearnex')
logger.setLevel(logging.INFO)
patch_sklearn()
from sklearn.cluster import DBSCAN
X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
with config_context(target_offload="gpu:0"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
in my new environment and got this printout:
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
INFO:sklearnex: sklearn.cluster.DBSCAN.fit: running accelerated version on GPU
So it seems to be working. For some reason, this doesn't work on my larger benchmark test yet. Please give me 24 hours to debug this myself before closing this issue. I'll get back to you soon!
Maybe a small in-between question: what does the log:
INFO:sklearnex: sklearn.utils.validation._assert_all_finite: running accelerated version on CPU
exactly mean?
After reinstalling the environment again, the issue is not reproducible anymore. I guess the error was transient. I noticed that the GPU support for Intel is not accurately described in the documentation. I found that:
are supported on an ARC 770, while the so-called supported algorithms for GPUs in the documentation:
Furthermore, I had some issues with the methods associated with the models. For example, the fit_predict method for DBSCAN threw an error:
RuntimeError: Cannot use target offload option inside daal4py.oneapi.sycl_context
Using the fit method, by contrast, works just fine.
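Since `fit` is reported to work where `fit_predict` fails, one possible workaround is to call `fit` and read the `labels_` attribute, which in stock scikit-learn yields the same labels as `fit_predict` for clusterers such as DBSCAN. The sketch below only illustrates the call pattern: `FitOnlyClusterer` is a hypothetical stand-in so that the example runs without scikit-learn installed, and whether this avoids the RuntimeError under target offload is an assumption based on the report above, not something verified here.

```python
class FitOnlyClusterer:
    """Hypothetical stand-in with the same fit/labels_ shape as a clusterer."""

    def fit(self, X):
        # Toy rule for illustration only: label each row by index parity.
        self.labels_ = [i % 2 for i, _ in enumerate(X)]
        return self


def labels_via_fit(estimator, X):
    """Get cluster labels using fit() and labels_ instead of fit_predict()."""
    return estimator.fit(X).labels_


labels = labels_via_fit(FitOnlyClusterer(), [[1.0, 2.0], [2.0, 2.0], [8.0, 7.0]])
print(labels)
```

With the real estimator, the same helper would be called as `labels_via_fit(DBSCAN(eps=3, min_samples=2), X)` inside the `config_context` block.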
Regardless, closing the issue. Thanks for the support!
Describe the bug
Following the example in the documentation about GPU offloading, I noticed that it did run, but there was CPU load and it didn't seem to be using the GPU (I didn't hear any fans ramp up or anything). The example is:
I have also tried to more explicitly offload by using the general context
But that didn't change much. I also considered the opportunity to offload the object:
Which prints out nicely:
The best argument I could find that GPU offloading is not working as it should is that I used timeit to compare 10 runs each time, and the timings are the same regardless of whether I offload to CPU or GPU.
To Reproduce
Already provided above
Expected behavior
Execution time should change when you run it on different hardware.
Environment: Ubuntu 23.04
CPU: 16-core AMD Ryzen 9 5950X (-MT MCP-) speed/min/max: 2258/2200/5083 MHz
Kernel: 6.2.0-24-generic x86_64
Up: 1h 27m
Mem: 7957.1/128724.3 MiB (6.2%)
Storage: 931.51 GiB (37.3% used)
Procs: 576
Shell: Bash
inxi: 3.3.25
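The timeit comparison described in the report can be sketched as a small harness that times each candidate the same number of times and reports the best and mean run. This is a minimal sketch using only the standard library; `compare_runs` and the two placeholder workloads are hypothetical. In the scenario from this issue, the two callables would instead wrap the same estimator fit, once under CPU and once under GPU offload.

```python
import timeit


def compare_runs(candidates, runs=10):
    """Time each zero-argument callable `runs` times; report best and mean seconds."""
    results = {}
    for label, func in candidates.items():
        # number=1 so every entry in `timings` is one complete call.
        timings = timeit.repeat(func, repeat=runs, number=1)
        results[label] = {
            "best": min(timings),
            "mean": sum(timings) / len(timings),
        }
    return results


# Hypothetical placeholder workloads standing in for the CPU and GPU fits.
def workload_small():
    return sum(i * i for i in range(1_000))


def workload_large():
    return sum(i * i for i in range(200_000))


report = compare_runs({"small": workload_small, "large": workload_large})
for label, stats in report.items():
    print(f"{label}: best={stats['best']:.6f}s mean={stats['mean']:.6f}s")
```

Identical timings for both candidates are a hint, not proof, that the same device did the work; the verbose log remains the more direct check of where each call ran.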