wswsmao opened 8 months ago
This is my environment:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
BIOS Vendor ID: Smdbmds
Model name: Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz
BIOS Model name: 3.0 CPU @ 2.0GHz
BIOS CPU family: 1
CPU family: 6
Model: 94
Thread(s) per core: 1
Core(s) per socket: 16
Socket(s): 1
Stepping: 3
BogoMIPS: 4988.26
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat
I switched to a new environment:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
BIOS Vendor ID: Smdbmds
Model name: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BIOS Model name: 3.0
CPU family: 6
Model: 85
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
Stepping: 5
BogoMIPS: 4988.28
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat avx512_vnni
and I get the same error:
# pip list | grep tensorflow
intel-extension-for-tensorflow 1.2.0
intel-extension-for-tensorflow-lib 1.2.0.0
tensorflow 2.12.0
tensorflow-estimator 2.12.0
tensorflow-io-gcs-filesystem 0.36.0
# python -c "import intel_extension_for_tensorflow as itex; print(itex.__version__)"
2024-02-27 16:38:19.340349: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-02-27 16:38:19.478901: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-27 16:38:20.154571: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-27 16:38:20.155380: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-27 16:38:21.379435: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-27 16:38:22.219570: E itex/core/kernels/xpu_kernel.cc:38] XPU-GPU kernel not supported.
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
2024-02-27 16:38:22.301632: E itex/core/kernels/xpu_kernel.cc:38] XPU-GPU kernel not supported.
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
2024-02-27 16:38:22.302928: F itex/core/utils/op_kernel.cc:54] Check failed: false Multiple KernelCreateFunc registration
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
Aborted (core dumped)
Thanks for reporting this. Let me try to reproduce on my end and get back to you.
Hello, could you please try the latest 2.14 release of Intel Extension for TensorFlow (https://github.com/intel/intel-extension-for-tensorflow/releases/tag/v2.14.0.1)? This release is verified on 2nd Gen Xeon Scalable processors.
Many thanks!
Hi @YuningQiu, same error:
# python -c "import intel_extension_for_tensorflow as itex; print(itex.__version__)"
2024-02-28 01:55:04.519696: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-28 01:55:04.561273: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-28 01:55:04.561313: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-28 01:55:04.561358: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-28 01:55:04.569526: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-28 01:55:04.569760: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-28 01:55:05.598361: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Illegal instruction (core dumped)
Already updated:
# pip list | grep tensorflow
intel-extension-for-tensorflow 2.14.0.1
intel-extension-for-tensorflow-lib 2.14.0.1.0
tensorflow 2.14.1
tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.36.0
[notice] A new release of pip is available: 23.3.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
Hello, I am not able to reproduce the issue on my side.
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-95
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
Stepping: 7
CPU MHz: 1036.127
CPU max MHz: 3700.0000
CPU min MHz: 1000.0000
BogoMIPS: 4200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 36608K
NUMA node0 CPU(s): 0-23,48-71
NUMA node1 CPU(s): 24-47,72-95
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
$ pip list | grep tensorflow
intel-extension-for-tensorflow 2.13.0.1
intel-extension-for-tensorflow-lib 2.13.0.1.0
tensorflow 2.13.0
tensorflow-estimator 2.13.0
tensorflow-io-gcs-filesystem 0.34.0
$ pip list
Package Version
absl-py 2.1.0
astunparse 1.6.3
cachetools 5.3.3
certifi 2024.2.2
charset-normalizer 3.3.2
flatbuffers 23.5.26
gast 0.4.0
google-auth 2.28.1
google-auth-oauthlib 1.0.0
google-pasta 0.2.0
grpcio 1.62.0
h5py 3.10.0
idna 3.6
importlib-metadata 7.0.1
intel-extension-for-tensorflow 2.13.0.1
intel-extension-for-tensorflow-lib 2.13.0.1.0
keras 2.13.1
libclang 16.0.6
Markdown 3.5.2
MarkupSafe 2.1.5
numpy 1.23.5
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 23.2
pip 24.0
pkg_resources 0.0.0
protobuf 4.25.3
pyasn1 0.5.1
pyasn1-modules 0.3.0
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
setuptools 69.1.1
six 1.16.0
tensorboard 2.13.0
tensorboard-data-server 0.7.2
tensorflow 2.13.0
tensorflow-estimator 2.13.0
tensorflow-io-gcs-filesystem 0.34.0
termcolor 2.4.0
typing_extensions 4.5.0
urllib3 2.2.1
Werkzeug 3.0.1
wheel 0.42.0
wrapt 1.16.0
zipp 3.17.0
$ python -c "import intel_extension_for_tensorflow as itex; print(itex.__version__)"
2024-02-29 09:39:12.960106: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-02-29 09:39:13.001971: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-29 09:39:13.660827: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-29 09:39:14.011651: I itex/core/wrapper/itex_cpu_wrapper.cc:42] Intel Extension for Tensorflow* AVX512 CPU backend is loaded.
2024-02-29 09:39:14.049942: W itex/core/ops/op_init.cc:58] Op: _QuantizedMaxPool3D is already registered in Tensorflow
2.13.0.1
Could you please upgrade your pip version and try `pip install --upgrade intel-extension-for-tensorflow[cpu]` again?
Hi @YuningQiu, this is my list:
# pip list
Package Version
---------------------------------- ----------
absl-py 2.1.0
astunparse 1.6.3
cachetools 5.3.3
certifi 2024.2.2
charset-normalizer 3.3.2
flatbuffers 23.5.26
gast 0.5.4
google-auth 2.28.1
google-auth-oauthlib 1.0.0
google-pasta 0.2.0
grpcio 1.62.0
h5py 3.10.0
idna 3.6
importlib-metadata 7.0.1
intel-extension-for-tensorflow 2.14.0.2
intel-extension-for-tensorflow-lib 2.14.0.2.0
keras 2.14.0
libclang 16.0.6
Markdown 3.5.2
MarkupSafe 2.1.5
ml-dtypes 0.2.0
numpy 1.24.4
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 23.2
pip 24.0
protobuf 4.23.4
pyasn1 0.5.1
pyasn1-modules 0.3.0
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
setuptools 68.0.0
six 1.16.0
tensorboard 2.14.1
tensorboard-data-server 0.7.2
tensorflow 2.14.1
tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.36.0
termcolor 2.4.0
typing_extensions 4.10.0
urllib3 2.2.1
Werkzeug 3.0.1
wheel 0.42.0
wrapt 1.14.1
zipp 3.17.0
Maybe an instruction set is not supported?
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
BIOS Vendor ID: Smdbmds
Model name: Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz
BIOS Model name: 3.0 CPU @ 2.0GHz
BIOS CPU family: 1
CPU family: 6
Model: 94
Thread(s) per core: 1
Core(s) per socket: 16
Socket(s): 1
Stepping: 3
BogoMIPS: 4988.26
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat
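A quick way to compare the flags above against what the logs ask for. This is a minimal sketch, assuming the feature set named in the cpu_feature_guard message (AVX2, AVX512F, AVX512_VNNI, FMA) is what the ITEX AVX-512 backend expects; `REQUIRED` and `missing_features` are hypothetical names, not part of ITEX:

```python
# Minimal sketch: check an lscpu "Flags:" string for the ISA features the
# cpu_feature_guard log mentions (AVX2, AVX512F, AVX512_VNNI, FMA).
# The REQUIRED set is an assumption, not the authoritative ITEX requirement.
REQUIRED = {"avx2", "fma", "avx512f", "avx512_vnni"}

def missing_features(flags_line):
    """Return the required features absent from a space-separated flags string."""
    present = set(flags_line.lower().split())
    return REQUIRED - present

# A few flags from the Gold 6133 guest above (the hypervisor exposes no AVX-512):
gold_6133_flags = "fma avx2 bmi2 rtm mpx rdseed adx smap clflushopt"
print(sorted(missing_features(gold_6133_flags)))  # avx512f / avx512_vnni missing
```

On the Gold 6133 guest this reports `avx512f` and `avx512_vnni` missing, while the Platinum 8255C flags list contains all four features.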
@YuningQiu, actually, I get the same error in my new env:
absl-py 2.1.0
astunparse 1.6.3
cachetools 5.3.3
certifi 2024.2.2
charset-normalizer 3.3.2
flatbuffers 23.5.26
gast 0.4.0
google-auth 2.28.1
google-auth-oauthlib 1.0.0
google-pasta 0.2.0
grpcio 1.62.0
h5py 3.10.0
idna 3.6
importlib-metadata 7.0.1
intel-extension-for-tensorflow 2.13.0.1
intel-extension-for-tensorflow-lib 2.13.0.1.0
keras 2.13.1
libclang 16.0.6
Markdown 3.5.2
MarkupSafe 2.1.5
numpy 1.23.5
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 23.2
pip 24.0
protobuf 4.25.3
pyasn1 0.5.1
pyasn1-modules 0.3.0
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
setuptools 53.0.0
six 1.16.0
tensorboard 2.13.0
tensorboard-data-server 0.7.2
tensorflow 2.13.0
tensorflow-estimator 2.13.0
tensorflow-io-gcs-filesystem 0.36.0
termcolor 2.4.0
typing_extensions 4.5.0
urllib3 2.2.1
Werkzeug 3.0.1
wheel 0.42.0
wrapt 1.16.0
zipp 3.17.0
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
BIOS Vendor ID: Smdbmds
Model name: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BIOS Model name: 3.0
CPU family: 6
Model: 85
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
Stepping: 5
BogoMIPS: 4988.28
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat avx512_vnni
This is the error:
# python -c "import intel_extension_for_tensorflow as itex; print(itex.__version__)"
2024-02-29 10:19:32.352107: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-02-29 10:19:32.457549: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-29 10:19:33.124946: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-29 10:19:33.125560: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-29 10:19:34.277521: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-29 10:19:35.751851: I itex/core/wrapper/itex_cpu_wrapper.cc:42] Intel Extension for Tensorflow* AVX512 CPU backend is loaded.
2024-02-29 10:19:35.809138: W itex/core/ops/op_init.cc:58] Op: _QuantizedMaxPool3D is already registered in Tensorflow
2024-02-29 10:19:35.834854: F itex/core/utils/op_kernel.cc:54] Check failed: false Multiple KernelCreateFunc registration
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
Aborted (core dumped)
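For context, here is a purely illustrative model (not ITEX source) of what a fatal duplicate-registration check like this does: the registry refuses a second create function for an op name that is already present, which is consistent with the `_QuantizedMaxPool3D is already registered` warning printed just before the crash. All names below are hypothetical:

```python
# Purely illustrative: a kernel registry that fails hard when the same op name
# is registered twice, mimicking the "Multiple KernelCreateFunc registration"
# fatal check. In ITEX this is a CHECK failure that aborts the process.
class KernelRegistry:
    def __init__(self):
        self._create_funcs = {}

    def register(self, op_name, create_func):
        if op_name in self._create_funcs:
            raise RuntimeError(
                f"Multiple KernelCreateFunc registration: {op_name}")
        self._create_funcs[op_name] = create_func

registry = KernelRegistry()
registry.register("_QuantizedMaxPool3D", lambda ctx: None)  # first load: fine
try:
    registry.register("_QuantizedMaxPool3D", lambda ctx: None)  # duplicate
except RuntimeError as err:
    print(err)  # the real check aborts here instead of raising
```

In the crashing run, the op is registered once by the stock TensorFlow build and again when the ITEX CPU plugin loads, so the check fires during plugin initialization.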
Hello, I think SKU 6133 CPUs are not supported; the latest ITEX should support 2nd Gen Xeon Scalable processors and later.
Could you please collect the gdb backtrace using this command, so that we can get more information?
$ gdb --args python -c "import intel_extension_for_tensorflow as itex; print(itex.__version__)"
This is the backtrace:
Thread 1 "python" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-4.ocs23.x86_64 libb2-0.98.1-2.ocs23.x86_64 libffi-3.4.4-2.ocs23.x86_64 libgcc-12.3.1-2.ocs23.x86_64 libgomp-12.3.1-2.ocs23.x86_64 libstdc++-12.3.1-2.ocs23.x86_64 mpdecimal-2.5.1-4.ocs23.x86_64 openssl-libs-3.0.12-2.ocs23.x86_64 python3-libs-3.11.6-2.ocs23.x86_64 xz-libs-5.4.4-1.ocs23.x86_64 zlib-1.2.13-4.ocs23.x86_64
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff708cff3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ffff703da26 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff702687c in __GI_abort () at abort.c:79
#4 0x00007fff89b11567 in itex::internal::LogMessageFatal::~LogMessageFatal() ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/../intel_extension_for_tensorflow/libitex_cpu_internal_avx2.so
#5 0x00007fff89b28a77 in itex::OpTypeFactory::RegisterOpType(void* (*)(TF_OpKernelConstruction*), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .cold] ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/../intel_extension_for_tensorflow/libitex_cpu_internal_avx2.so
#6 0x00007fff89b2d280 in itex::Name::Build(char const*, char const*) ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/../intel_extension_for_tensorflow/libitex_cpu_internal_avx2.so
#7 0x00007fff889309aa in itex::Register1(char const*, char const*) ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/../intel_extension_for_tensorflow/libitex_cpu_internal_avx2.so
#8 0x00007fff89b25819 in itex::register_kernel::RegisterCPUKernels(char const*) ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/../intel_extension_for_tensorflow/libitex_cpu_internal_avx2.so
#9 0x00007fff8892ffbb in TF_InitKernel_Internal ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/../intel_extension_for_tensorflow/libitex_cpu_internal_avx2.so
#10 0x00007fff9274875d in TF_InitKernel ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow-plugins/libitex_cpu.so
#11 0x00007ffff24cc016 in tensorflow::RegisterPluggableDevicePlugin(void*) ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2
#12 0x00007ffff24c5fbc in TF_LoadPluggableDeviceLibrary ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2
#13 0x00007fff9959617d in pybind11::cpp_function::initialize<pybind11_init__pywrap_tf_session(pybind11::module_&)::$_66, TF_Library*, char const*, pybind11::name, pybind11::scope, pybind11::sibling, pybind11::return_value_policy>(pybind11_init__pywrap_tf_session(pybind11::module_&)::$_66&&, TF_Library* (*)(char const*), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow/python/client/_pywrap_tf_session.so
#14 0x00007fff9955ad58 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) ()
from /home/tensorflow/v_tensorflow/lib64/python3.11/site-packages/tensorflow/python/client/_pywrap_tf_session.so
#15 0x00007ffff75d51f1 in cfunction_call () from /lib64/libpython3.11.so.1.0
#16 0x00007ffff75b7713 in _PyObject_MakeTpCall () from /lib64/libpython3.11.so.1.0
#17 0x00007ffff75c0217 in _PyEval_EvalFrameDefault () from /lib64/libpython3.11.so.1.0
#18 0x00007ffff75bc1aa in _PyEval_Vector () from /lib64/libpython3.11.so.1.0
#19 0x00007ffff7645e16 in PyEval_EvalCode () from /lib64/libpython3.11.so.1.0
#20 0x00007ffff765cdd2 in builtin_exec () from /lib64/libpython3.11.so.1.0
#21 0x00007ffff75cdffa in cfunction_vectorcall_FASTCALL_KEYWORDS () from /lib64/libpython3.11.so.1.0
#22 0x00007ffff75c4747 in _PyEval_EvalFrameDefault () from /lib64/libpython3.11.so.1.0
#23 0x00007ffff75bc1aa in _PyEval_Vector () from /lib64/libpython3.11.so.1.0
#24 0x00007ffff75d49b6 in object_vacall () from /lib64/libpython3.11.so.1.0
#25 0x00007ffff75f9a44 in PyObject_CallMethodObjArgs () from /lib64/libpython3.11.so.1.0
#26 0x00007ffff75f81dc in PyImport_ImportModuleLevelObject () from /lib64/libpython3.11.so.1.0
--Type <RET> for more, q to quit, c to continue without paging--c
#27 0x00007ffff75c59de in _PyEval_EvalFrameDefault () from /lib64/libpython3.11.so.1.0
#28 0x00007ffff75bc1aa in _PyEval_Vector () from /lib64/libpython3.11.so.1.0
#29 0x00007ffff7645e16 in PyEval_EvalCode () from /lib64/libpython3.11.so.1.0
#30 0x00007ffff765cdd2 in builtin_exec () from /lib64/libpython3.11.so.1.0
#31 0x00007ffff75cdffa in cfunction_vectorcall_FASTCALL_KEYWORDS () from /lib64/libpython3.11.so.1.0
#32 0x00007ffff75c4747 in _PyEval_EvalFrameDefault () from /lib64/libpython3.11.so.1.0
#33 0x00007ffff75bc1aa in _PyEval_Vector () from /lib64/libpython3.11.so.1.0
#34 0x00007ffff75d49b6 in object_vacall () from /lib64/libpython3.11.so.1.0
#35 0x00007ffff75f9a44 in PyObject_CallMethodObjArgs () from /lib64/libpython3.11.so.1.0
#36 0x00007ffff75f81dc in PyImport_ImportModuleLevelObject () from /lib64/libpython3.11.so.1.0
#37 0x00007ffff75c59de in _PyEval_EvalFrameDefault () from /lib64/libpython3.11.so.1.0
#38 0x00007ffff75bc1aa in _PyEval_Vector () from /lib64/libpython3.11.so.1.0
#39 0x00007ffff7645e16 in PyEval_EvalCode () from /lib64/libpython3.11.so.1.0
#40 0x00007ffff7663d33 in run_eval_code_obj () from /lib64/libpython3.11.so.1.0
#41 0x00007ffff76602ba in run_mod () from /lib64/libpython3.11.so.1.0
#42 0x00007ffff7654bcd in PyRun_StringFlags () from /lib64/libpython3.11.so.1.0
#43 0x00007ffff7654920 in PyRun_SimpleStringFlags () from /lib64/libpython3.11.so.1.0
#44 0x00007ffff766f145 in Py_RunMain () from /lib64/libpython3.11.so.1.0
#45 0x00007ffff7635f9b in Py_BytesMain () from /lib64/libpython3.11.so.1.0
#46 0x00007ffff7027f50 in __libc_start_call_main (main=main@entry=0x555555555160 <main>, argc=argc@entry=3,
argv=argv@entry=0x7fffffffe108) at ../sysdeps/nptl/libc_start_call_main.h:58
#47 0x00007ffff7028009 in __libc_start_main_impl (main=0x555555555160 <main>, argc=3, argv=0x7fffffffe108, init=<optimized out>,
fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe0f8) at ../csu/libc-start.c:360
#48 0x0000555555555095 in _start ()
@YuningQiu
@wswsmao if you are using venv, please help to switch to conda.
@guizili0 ok, i will try it
Hi @guizili0, it seems the doc only covers pip install? https://github.com/intel/intel-extension-for-tensorflow/blob/main/docs/install/install_for_cpu.md
conda command output:
# conda search *intel-extension-for-tensorflow*
Loading channels: done
PackagesNotFoundError: The following packages are not available from current channels:
- *intel-extension-for-tensorflow*
Current channels:
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
@wswsmao Sorry for the confusion, I meant you can create the env via conda and then use pip to install.
Hi @guizili0, it works. It may be time to update the docs.
(itex_build) # python quick_example.py
2024-03-04 16:20:47.462221: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-04 16:20:47.464529: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-04 16:20:47.505677: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-04 16:20:47.505728: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-04 16:20:47.505787: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-04 16:20:47.513488: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-04 16:20:47.513739: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-04 16:20:48.300304: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-04 16:20:48.661969: I itex/core/wrapper/itex_cpu_wrapper.cc:60] Intel Extension for Tensorflow* AVX512 CPU backend is loaded.
tf.Tensor( [[[[2.9566352 2.5960028 2.6510603] [2.5197754 2.3684092 2.145223 ] [3.125528 2.7386546 3.5465343] [3.4885104 3.9240358 4.0364857] [1.476208 1.5018826 1.921063 ]]
[[2.8954098 2.9481623 3.6797826] [3.3527603 2.776656 3.0839703] [3.4763882 2.7880876 2.5138347] [3.317111 3.3032947 2.9439278] [2.3710513 2.5041685 2.1902466]]
[[3.716969 3.6650152 2.9369717] [2.72032 2.8194175 2.781646 ] [2.4257572 2.911467 2.8563507] [2.8951228 2.3830342 3.1627011] [2.3537884 3.017113 2.3408718]]
[[3.6247163 4.0131707 4.250465 ] [3.353158 2.9245052 3.3258662] [3.866764 3.136556 3.1926696] [3.713012 3.3164258 2.899124 ] [1.745955 2.5850582 2.0824847]]
[[2.2759743 2.7298818 1.8404391] [1.7627361 2.185912 1.516212 ] [1.5935009 2.1497884 1.6038413] [1.689346 1.5855561 1.5915074] [1.0986044 1.2527758 1.0862353]]]], shape=(1, 5, 5, 3), dtype=float32) Finished
@wswsmao In our validation we also tested venv, but did not reproduce this issue. Below is the Dockerfile I used to try to reproduce it, without success. Can you help share your reproduction steps? Thanks.
FROM ubuntu:22.04
ARG DEBIAN_FRONTEND=noninteractive
HEALTHCHECK NONE
RUN ln -sf bash /bin/sh
RUN apt-get update && \
apt-get install -y --no-install-recommends --fix-missing \
wget \
apt-utils \
ca-certificates \
git \
vim \
apt-transport-https curl gnupg \
python-is-python3 \
python3.10-venv \
pip \
gdb \
strace \
gpg && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN mkdir -p /test
RUN python -m venv /test/venv_test
# NOTE: `RUN source .../activate` would not persist into later layers,
# so the venv is put on PATH instead to make pip install into it.
ENV PATH="/test/venv_test/bin:$PATH"
RUN pip install --upgrade intel-extension-for-tensorflow[cpu]
output is:
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2024-03-05 03:21:25.951828: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environ
ment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-05 03:21:25.954911: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-05 03:21:26.000753: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-05 03:21:26.000794: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-05 03:21:26.000832: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-05 03:21:26.010442: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-05 03:21:26.010780: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-05 03:21:27.224707: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-05 03:21:27.592314: I itex/core/wrapper/itex_cpu_wrapper.cc:60] Intel Extension for Tensorflow* AVX512 CPU backend is loaded.
>>>
@guizili0 OK, these are my steps. conda works in the same env:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
BIOS Vendor ID: Smdbmds
Model name: Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz
BIOS Model name: 3.0 CPU @ 2.0GHz
BIOS CPU family: 1
CPU family: 6
Model: 94
Thread(s) per core: 1
Core(s) per socket: 16
Socket(s): 1
Stepping: 3
BogoMIPS: 4988.26
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 arat
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 64 MiB (16 instances)
L3: 27.5 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX unsupported
L1tf: Mitigation; PTE Inversion
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Meltdown: Mitigation; PTI
Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Retbleed: Vulnerable
Spec store bypass: Vulnerable
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling
Srbds: Unknown: Dependent on hypervisor status
Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
# uname -a
Linux VM-33-248-tlinux 5.18.15-2207.2.0.ocks #1 SMP PREEMPT_DYNAMIC Wed Nov 9 11:41:31 CST 2022 x86_64 GNU/Linux
# cat /etc/os-release
NAME="OpenCloudOS Stream"
VERSION="2301"
ID="opencloudos"
ID_LIKE="opencloudos"
VERSION_ID="2301"
PLATFORM_ID="platform:ocs2301"
PRETTY_NAME="OpenCloudOS Stream 2301"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:opencloudos:opencloudos:2301"
HOME_URL="https://www.opencloudos.org/"
BUG_REPORT_URL="https://bugs.opencloudos.tech/"
(itex) # pip install --upgrade intel-extension-for-tensorflow[cpu]
(itex) # pip list | grep tensorflow
intel-extension-for-tensorflow 2.14.0.2
intel-extension-for-tensorflow-lib 2.14.0.2.0
tensorflow 2.14.1
tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.36.0
(itex) # python -c "import intel_extension_for_tensorflow as itex; print(itex.version)"
2024-03-05 06:28:43.384035: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-05 06:28:43.424872: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-05 06:28:43.424910: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-05 06:28:43.424949: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-05 06:28:43.432727: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-05 06:28:43.432951: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-05 06:28:44.449451: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-05 06:28:44.891640: I itex/core/wrapper/itex_cpu_wrapper.cc:70] Intel Extension for Tensorflow* AVX2 CPU backend is loaded.
2024-03-05 06:28:44.954756: F itex/core/utils/op_kernel.cc:54] Check failed: false Multiple KernelCreateFunc registration
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
Aborted (core dumped)
(itex)
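The fatal line above complains that the same kernel-creation function is registered twice, which suggests the ITEX plugin shared object is being discovered more than once. As a rough diagnostic, a sketch that scans `sys.path` for `tensorflow-plugins` directories containing the plugin (the directory name matches the paths seen later in this thread; treating `sys.path` as the discovery surface is an assumption, not the extension's documented behavior):

```python
import os
import sys

def find_plugin_dirs(plugin_so="libitex_cpu.so"):
    """Return the distinct real paths of every tensorflow-plugins
    directory on sys.path that contains the given plugin .so file.
    More than one distinct entry means the plugin may load twice."""
    found = set()
    for entry in sys.path:
        candidate = os.path.join(entry, "tensorflow-plugins")
        if os.path.isdir(candidate) and plugin_so in os.listdir(candidate):
            found.add(os.path.realpath(candidate))
    return sorted(found)

if __name__ == "__main__":
    dirs = find_plugin_dirs()
    print(dirs)
    if len(dirs) > 1:
        print("WARNING: plugin visible in multiple locations; it may register twice")
```

If this prints more than one directory in the failing venv but only one in the working conda env, that would line up with the "Multiple KernelCreateFunc registration" abort.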
Hi @guizili0, I have run this demo in conda https://github.com/intel/intel-extension-for-tensorflow/blob/main/examples/train_bert/README.md
There are a few things that confuse me:
2024-03-04 16:20:48.661969: I itex/core/wrapper/itex_cpu_wrapper.cc:60] Intel Extension for Tensorflow* AVX512 CPU backend is loaded.
@wswsmao
This demo can run without changing the original code, so how can I prove that ITEX is effective?
You can check the log; there is an ITEX log line when it is enabled.
2024-03-08 11:44:24.621184: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-08 11:44:25.439443: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-08 11:44:25.830716: I itex/core/wrapper/itex_cpu_wrapper.cc:60] Intel Extension for Tensorflow AVX512 CPU backend is loaded.
2024-03-08 11:44:26.371964: I itex/core/wrapper/itex_gpu_wrapper.cc:35] Intel Extension for Tensorflow GPU backend is loaded.
2024-03-08 11:44:26.492596: I itex/core/devices/gpu/itex_gpu_runtime.cc:129] Selected platform: Intel(R) Level-Zero
2024-03-08 11:44:26.492973: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
2024-03-08 11:44:26.492984: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
Tensorflow version 2.14.1
2024-03-08 11:44:27.254820: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform XPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-03-08 11:44:27.254869: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform XPU ID 1, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-03-08 11:44:27.254900: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: XPU, pci bus id: )
2024-03-08 11:44:27.255202: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:XPU:1 with 0 MB memory) -> physical PluggableDevice (device: 1, name: XPU, pci bus id: )
2024-03-08 11:44:27.809896: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type XPU is enabled.
Load cat.jpg to inference
- how can I get performance-improvement data compared with no ITEX? Uninstall ITEX and re-run?
I recommend creating two conda environments, one with ITEX and one without. You can then compare the performance with/without ITEX.
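One way to do that comparison is to run the identical workload script in each environment with a small timing harness and compare medians. This is a generic sketch; the placeholder workload is not from this thread and should be replaced with your real training or inference step:

```python
import statistics
import time

def bench(fn, warmup=3, repeats=10):
    """Call fn a few times untimed to warm caches/JIT, then return the
    median wall-clock seconds over `repeats` timed calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

if __name__ == "__main__":
    # Placeholder workload; substitute e.g. one model.predict() step.
    median_s = bench(lambda: sum(i * i for i in range(100_000)))
    print(f"median step time: {median_s:.6f}s")
```

Run this once in the ITEX conda env and once in the plain-TensorFlow env; the ratio of the two medians is the speedup attributable to ITEX for that workload.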
@xiguiw ok, I will try it
Hi @guizili0, there are some reasons why I have to use a python venv instead of conda. I was wondering if there is a workaround, such as picking out the key files from conda and copying them into the python venv.
@wswsmao In my understanding, a soft-link in the venv can cause the ITEX .so file to be loaded twice, which would cause this crash. You can try removing the soft-link and check whether the issue is gone.
To find the soft-link, you can use
strace -e trace=open,openat python
to dump the .so load log and check the exact file locations.
You would get logs like below:
openat(AT_FDCWD, "/opt/app-root/lib/python3.9/site-packages/tensorflow-plugins/libitex_cpu.so", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/opt/app-root/lib64/python3.9/site-packages/tensorflow-plugins/libitex_cpu.so", O_RDONLY|O_CLOEXEC) = 3
@guizili0 It works. I get these:
./itex_venv/lib64/python3.11/site-packages/tensorflow-plugins/libitex_cpu.so
./itex_venv/lib/python3.11/site-packages/tensorflow-plugins/libitex_cpu.so
I removed the .so under lib, though it turned out to be a duplicate copy of the file rather than a soft-link.
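Whether the two paths are one file on disk (symlink or hard link) or two independent copies decides whether deleting one is safe. A small sketch to classify them; the venv paths shown are the ones reported above and should be adjusted for your environment:

```python
import os

def classify(path_a, path_b):
    """Return 'link' if both paths resolve to the same file on disk
    (symlink or hard link), 'copy' if they are independent files,
    'missing' if either path does not exist."""
    if not (os.path.exists(path_a) and os.path.exists(path_b)):
        return "missing"
    return "link" if os.path.samefile(path_a, path_b) else "copy"

if __name__ == "__main__":
    # Paths taken from the strace output earlier in this thread.
    a = "itex_venv/lib/python3.11/site-packages/tensorflow-plugins/libitex_cpu.so"
    b = "itex_venv/lib64/python3.11/site-packages/tensorflow-plugins/libitex_cpu.so"
    print(classify(a, b))
```

`os.path.samefile` compares device and inode numbers, so it reports 'link' for both symlinks and hard links; only a 'copy' result means the loader is really seeing two distinct files.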
Hello, I am hitting similar problems to the issue below: https://github.com/intel/intel-extension-for-tensorflow/issues/51
This is the report:
These are my versions:
Also, I cannot install ITEX 1.2.0, so I can only install ITEX 2.13.0.