intel / intel-extension-for-tensorflow

Intel® Extension for TensorFlow*

Failure of ITEX compilation from source code #43

Closed YanfeiXu closed 1 year ago

YanfeiXu commented 1 year ago

Hi experts, as you can see below, the build stage reports that the URLs for LLVM are not available. I'm using the main branch of this repo. Is there already a fix for this?

(itex_build) [root@9c938b446fc0 intel-extension-for-tensorflow]# bazel build -c opt --config=gpu //itex/tools/pip_package:build_pip_package
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/ac1ec9e2904a696e360b40572c3b3c29d67981ef.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/7cbf1a2591520c2491aa35339f227775f4d3adf6.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
INFO: Repository local_config_cc instantiated at:
  /DEFAULT.WORKSPACE.SUFFIX:519:13: in
  /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/cc_configure.bzl:184:16: in cc_configure
Repository rule cc_autoconf defined at:
  /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/cc_configure.bzl:145:30: in
ERROR: An error occurred during the fetch of repository 'local_config_cc':
   Traceback (most recent call last):

guizili0 commented 1 year ago

@YanfeiXu for each resource we provide two links, as in https://github.com/intel/intel-extension-for-tensorflow/blob/main/third_party/llvm_project/workspace.bzl#L23. The GitHub one should work for you. The failed LLVM mirror download is only a warning, and no error is reported for the GitHub link.

From the log, it seems you have an issue getting the Bazel tool. Can you share /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/command.log for this build issue?
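As a rough illustration of the two-link arrangement described above (a mirror URL tried first, then the GitHub URL), the fallback behaviour can be sketched in shell. This is not how Bazel implements it; `fetch_first` and the `FETCH` variable are hypothetical names introduced only for this sketch, and `curl` is assumed to be installed.

```shell
# Hypothetical downloader command; override FETCH to swap in another tool.
FETCH=${FETCH:-"curl -fsSL -o"}

# Try each URL in order and stop at the first one that downloads.
# Usage: fetch_first <destination-file> <url> [<url> ...]
fetch_first() {
  dest=$1
  shift
  for url in "$@"; do
    if $FETCH "$dest" "$url"; then
      echo "fetched: $url"
      return 0
    fi
    # A failed mirror is only a warning as long as a later URL works.
    echo "WARNING: download from $url failed" >&2
  done
  echo "ERROR: all URLs failed" >&2
  return 1
}
```

With this shape, a 404 from the first URL produces only a warning, which matches what the log above shows; the build stops for a different reason.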

YanfeiXu commented 1 year ago

Ha, got it. Thanks for the quick response. Below is the top part of the command.log you requested. It seems the failure is caused by my Docker container not setting the CC environment variable? So I need to install gcc first? Everything I did follows this page: https://intel.github.io/intel-extension-for-tensorflow/latest/docs/install/how_to_build.html

Loading:
Loading: 0 packages loaded
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/ac1ec9e2904a696e360b40572c3b3c29d67981ef.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
Analyzing: target //itex/tools/pip_package:build_pip_package (0 packages loaded)
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/7cbf1a2591520c2491aa35339f227775f4d3adf6.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
Analyzing: target //itex/tools/pip_package:build_pip_package (0 packages loaded)
Analyzing: target //itex/tools/pip_package:build_pip_package (0 packages loaded, 0 targets configured)
INFO: Repository local_config_cc instantiated at:
  /DEFAULT.WORKSPACE.SUFFIX:519:13: in
  /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/cc_configure.bzl:184:16: in cc_configure
Repository rule cc_autoconf defined at:
  /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/cc_configure.bzl:145:30: in
Analyzing: target //itex/tools/pip_package:build_pip_package (0 packages loaded, 0 targets configured)
    currently loading: @com_google_protobuf// ... (2 packages)
    Fetching @local_config_python; fetching
    Fetching @pybind11; fetching
    Fetching @local_config_tf; fetching
    Fetching @com_google_absl; fetching
    Fetching @local_config_cc; fetching
    Fetching @rules_pkg; fetching
ERROR: An error occurred during the fetch of repository 'local_config_cc':
   Traceback (most recent call last):
        File "/root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/cc_configure.bzl", line 127, column 33, in cc_autoconf_impl
                configure_unix_toolchain(repository_ctx, cpu_value, overriden_tools)
        File "/root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/unix_cc_configure.bzl", line 345, column 17, in configure_unix_toolchain
                cc = find_cc(repository_ctx, overriden_tools)
        File "/root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/unix_cc_configure.bzl", line 310, column 23, in find_cc
                cc = _find_generic(repository_ctx, "gcc", "CC", overriden_tools)
        File "/root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/unix_cc_configure.bzl", line 306, column 32, in _find_generic
                auto_configure_fail(msg)
        File "/root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/cpp/lib_cc_configure.bzl", line 112, column 9, in auto_configure_fail
                fail("\n%sAuto-Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail:
Auto-Configuration Error: Cannot find gcc or CC; either correct your path or set the CC environment variable
Analyzing: target //itex/tools/pip_package:build_pip_package (0 packages loaded, 0 targets configured)
    currently loading: @com_google_protobuf// ... (2 packages)
    Fetching @local_config_python; fetching
    Fetching @pybind11; fetching
    Fetching @local_config_tf; fetching
    Fetching @com_google_absl; fetching
    Fetching @rules_pkg; fetching
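Bazel's command.log records the terminal control sequences used for colours and progress redraws, which makes a pasted log hard to read. A minimal sketch for stripping them before sharing follows; `strip_ansi` is a hypothetical helper name, and the pattern only covers the common `ESC [ ... letter` sequences plus carriage returns, so other escape forms would pass through unchanged.

```shell
# Remove carriage returns and ANSI CSI sequences (ESC [ params letter)
# from text read on standard input.
strip_ansi() {
  esc=$(printf '\033')   # literal ESC character, built portably
  tr -d '\r' | sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}
```

For example, `strip_ansi < command.log > command.clean.log` would write a cleaned copy (the output file name is only an illustration).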

guizili0 commented 1 year ago

Yes, please install gcc first. It seems we wrongly assumed that users already have gcc installed.
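A quick pre-flight check along these lines can catch the missing compiler before Bazel starts. This is only a sketch that loosely mirrors the error message above (look at the CC environment variable, then for gcc on the PATH); `check_cc` is a hypothetical helper, not part of ITEX or Bazel.

```shell
# Report which C compiler would be picked up, or fail if none is found.
check_cc() {
  if [ -n "${CC:-}" ] && command -v "$CC" >/dev/null 2>&1; then
    echo "using CC=$CC"
  elif command -v gcc >/dev/null 2>&1; then
    echo "using gcc at $(command -v gcc)"
  else
    echo "no C compiler found: install gcc or set CC" >&2
    return 1
  fi
}
```

Running `check_cc` in a fresh container before `bazel build` shows immediately whether a compiler is visible. On CentOS the compilers are usually packaged as `gcc` and `gcc-c++`.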

YanfeiXu commented 1 year ago

After installing gcc and g++, a new failure appears during compilation. Any suggestions?

Compiling itex/core/utils/device_types.cc; 0s local
Compiling absl/base/internal/raw_logging.cc; 0s local
Compiling absl/numeric/int128.cc; 0s local
Compiling itex/core/utils/device_types.cc; 0s local
Compiling absl/base/log_severity.cc; 0s local
Compiling src/google/protobuf/any_lite.cc; 0s local
Compiling src/google/protobuf/compiler/cpp/message.cc; 0s local
Compiling src/google/protobuf/arena.cc; 0s local ...

ERROR: /intel-extension-for-tensorflow/itex/core/utils/BUILD:145:11: Compiling itex/core/utils/device_types.cc failed: (Exit 1): crosstool_wrapper_driver failed: error executing command external/local_config_dpcpp/crosstool_dpcpp/bin/crosstool_wrapper_driver -MD -MF bazel-out/k8-opt/bin/itex/core/utils/_objs/device_types/device_types.pic.d ... (remaining 52 arguments skipped)
gcc: warning: : linker input file unused because linking not done
gcc: error: : linker input file not found: No such file or directory
[13 / 1,010] 143 actions, 139 running
    Compiling absl/base/internal/raw_logging.cc; 0s local
    Compiling absl/numeric/int128.cc; 0s local
    Compiling itex/core/utils/device_types.cc; 0s local
    Compiling absl/base/log_severity.cc; 0s local
    Compiling src/google/protobuf/any_lite.cc; 0s local
    Compiling src/google/protobuf/compiler/cpp/message.cc; 0s local
    Compiling src/google/protobuf/arena.cc; 0s local
    Compiling src/google/protobuf/compiler/objectivec/objectivec_field.cc; 0s local ...
Target //itex/tools/pip_package:build_pip_package failed to build
[156 / 1,010] checking cached actions
Use --verbose_failures to see the command lines of failed build steps.
[156 / 1,010] checking cached actions
INFO: Elapsed time: 1.514s, Critical Path: 0.44s
[156 / 1,010] checking cached actions
INFO: 156 processes: 156 internal.
[156 / 1,010] checking cached actions
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

guizili0 commented 1 year ago

Can you share the output of: bazel build -c opt -s --config=gpu //itex/tools/pip_package:build_pip_package
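Because `-s` prints every subcommand, the output of this build is long. One way to keep it manageable is to save it to a file and pull out only the failing actions. The sketch below assumes the output was saved to a file (the name `build.log` is hypothetical) and uses a hypothetical helper, `show_errors`, that prints each `ERROR:` line together with the two lines that follow it.

```shell
# Print every line starting with "ERROR:" plus the two lines after it,
# which in a Bazel log usually carry the compiler's own messages.
# Example capture (not run here):
#   bazel build -c opt -s --config=gpu //itex/tools/pip_package:build_pip_package 2>&1 | tee build.log
show_errors() {
  grep -A2 '^ERROR:' "$1"
}
```

After capturing the log, `show_errors build.log` gives a short excerpt that is easier to paste into an issue than the full output.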

YanfeiXu commented 1 year ago

Sure! But the output is very long, so I'm only pasting the head and the tail of the log. BTW, the errors are at the bottom.

(itex_build) [root@9c938b446fc0 intel-extension-for-tensorflow]# bazel build -c opt -s --config=gpu  //itex/tools/pip_package:build_pip_package
WARNING: /intel-extension-for-tensorflow/itex/core/utils/protobuf/BUILD:33:17: in cc_library rule //itex/core/utils/protobuf:for_core_protos_cc_impl: target '//itex/core/utils/protobuf:for_core_protos_cc_impl' depends on deprecated target '@com_google_protobuf//:cc_wkt_protos': Only for backward compatibility. Do not use.
WARNING: /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/local_config_tf/BUILD:6835:8: target 'xla_extension.so' is both a rule and a file; please choose another name for the rule
DEBUG: /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/build_defs/repo/git_worker.bzl:84:14: git.bzl: Cloning or updating  (--depth=1) repository onednn_gpu using strip_prefix of [None]
DEBUG: Rule 'onednn_gpu' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1689891356 -0700"
DEBUG: Repository onednn_gpu instantiated at:
  /intel-extension-for-tensorflow/WORKSPACE:21:15: in 
  /intel-extension-for-tensorflow/itex/workspace.bzl:197:23: in itex_workspace
Repository rule new_git_repository defined at:
  /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/build_defs/repo/git.bzl:186:37: in 
DEBUG: /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/build_defs/repo/git_worker.bzl:84:14: git.bzl: Cloning or updating  (--depth=1) repository xetla using strip_prefix of [None]
DEBUG: Rule 'xetla' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1684496521 +0800"
DEBUG: Repository xetla instantiated at:
  /intel-extension-for-tensorflow/WORKSPACE:21:15: in 
  /intel-extension-for-tensorflow/itex/workspace.bzl:297:23: in itex_workspace
Repository rule new_git_repository defined at:
  /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/bazel_tools/tools/build_defs/repo/git.bzl:186:37: in 
INFO: Analyzed target //itex/tools/pip_package:build_pip_package (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
SUBCOMMAND: # @com_google_protobuf//:protobuf_native [action 'Compiling src/google/protobuf/empty.pb.cc', configuration: 9fd48c4261ff2774f1d84497e449b9e6c2c46bc6763c3d1ee85312ce1432d3e3, execution platform: @local_config_platform//:host]
(cd /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/execroot/intel_extension_for_tensorflow && \
  exec env - \
    LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/2023.1.0/lib/intel64:/opt/intel/oneapi/compiler/2023.1.0/linux/lib:/opt/intel/oneapi/compiler/2023.1.0/linux/lib/x64:/opt/intel/oneapi/compiler/2023.1.0/linux/lib/oclfpga/host/linux64/lib:/opt/intel/oneapi/compiler/2023.1.0/linux/compiler/lib/intel64_lin \
    PATH=/opt/intel/oneapi/mkl/2023.1.0/bin/intel64:/opt/intel/oneapi/compiler/2023.1.0/linux/lib/oclfpga/bin:/opt/intel/oneapi/compiler/2023.1.0/linux/bin/intel64:/opt/intel/oneapi/compiler/2023.1.0/linux/bin:/root/miniconda3/envs/itex_build/bin:/root/miniconda3/condabin:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    PWD=/proc/self/cwd \

----------omit omit omit------------

    PWD=/proc/self/cwd \
  external/local_config_dpcpp/crosstool_dpcpp/bin/crosstool_wrapper_driver -MD -MF bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm-project/llvm/_objs/Support/AArch64TargetParser.d '-frandom-seed=bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm-project/llvm/_objs/Support/AArch64TargetParser.o' '-DLLVM_ON_UNIX=1' '-DHAVE_BACKTRACE=1' '-DBACKTRACE_HEADER=' '-DLTDL_SHLIB_EXT=".so"' '-DLLVM_PLUGIN_EXT=".so"' '-DLLVM_ENABLE_THREADS=1' '-DHAVE_DEREGISTER_FRAME=1' '-DHAVE_LIBPTHREAD=1' '-DHAVE_PTHREAD_GETNAME_NP=1' '-DHAVE_PTHREAD_H=1' '-DHAVE_PTHREAD_SETNAME_NP=1' '-DHAVE_REGISTER_FRAME=1' '-DHAVE_SETENV_R=1' '-DHAVE_STRERROR_R=1' '-DHAVE_SYSEXITS_H=1' '-DHAVE_UNISTD_H=1' -D_GNU_SOURCE '-DHAVE_LINK_H=1' '-DHAVE_LSEEK64=1' '-DHAVE_MALLINFO=1' '-DHAVE_SBRK=1' '-DHAVE_STRUCT_STAT_ST_MTIM_TV_NSEC=1' '-DLLVM_NATIVE_ARCH="X86"' '-DLLVM_NATIVE_ASMPARSER=LLVMInitializeX86AsmParser' '-DLLVM_NATIVE_ASMPRINTER=LLVMInitializeX86AsmPrinter' '-DLLVM_NATIVE_DISASSEMBLER=LLVMInitializeX86Disassembler' '-DLLVM_NATIVE_TARGET=LLVMInitializeX86Target' '-DLLVM_NATIVE_TARGETINFO=LLVMInitializeX86TargetInfo' '-DLLVM_NATIVE_TARGETMC=LLVMInitializeX86TargetMC' '-DLLVM_NATIVE_TARGETMCA=LLVMInitializeX86TargetMCA' '-DLLVM_HOST_TRIPLE="x86_64-unknown-linux-gnu"' '-DLLVM_DEFAULT_TARGET_TRIPLE="x86_64-unknown-linux-gnu"' -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS '-DBLAKE3_USE_NEON=0' -iquote external/llvm-project -iquote bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm-project -iquote external/llvm_terminfo -iquote bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm_terminfo -iquote external/llvm_zlib -iquote bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm_zlib -isystem external/llvm-project/llvm/include -isystem bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm-project/llvm/include '-std=c++17' -isystem /opt/intel/oneapi/compiler/2023.1.0/linux/include/sycl -isystem /opt/intel/oneapi/compiler/2023.1.0/linux/include -iquote /intel-extension-for-tensorflow/third_party/build_option/dpcpp/runtime/ '' -fPIC '-DITEX_USE_MKL=1' '-DITEX_ENABLE_DOUBLE=1' '-DEIGEN_USE_DPCPP=1' '-DEIGEN_USE_GPU=1' '-DEIGEN_USE_DPCPP_BUILD=1' '-DEIGEN_USE_DPCPP_USM=1' '-DDNNL_USE_DPCPP_USM=1' '-DDNNL_WITH_LEVEL_ZERO=1' '-DNGEN_NO_OP_NAMES=1' '-DNGEN_CPP11=1' '-DNGEN_SAFE=1' '-DNGEN_NEO_INTERFACE=1' '-DDNNL_X64=1' '-DEIGEN_HAS_C99_MATH=1' '-DEIGEN_HAS_CXX11_MATH=1' -Wno-unused-variable -Wno-unused-const-variable -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -no-canonical-prefixes -fPIE -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -DNDEBUG -O3 -ffunction-sections -fdata-sections -g0 -g0 '-fvisibility=hidden' -DINTEL_GPU_ONLY -c external/llvm-project/llvm/lib/Support/AArch64TargetParser.cpp -o bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/llvm-project/llvm/_objs/Support/AArch64TargetParser.o)
# Configuration: e25b7d089aa6000e20085b13011363d8cbeb12add20399c58182e175c3477072
# Execution platform: @local_config_platform//:host
ERROR: /root/.cache/bazel/_bazel_root/d63feaeda59c5b25ba5654fb89a01310/external/zlib/BUILD.bazel:45:18: Compiling adler32.c failed: (Exit 1): crosstool_wrapper_driver failed: error executing command external/local_config_dpcpp/crosstool_dpcpp/bin/crosstool_wrapper_driver -MD -MF bazel-out/k8-opt-exec-2B5CBBC6-ST-f8fb2d02ddb2/bin/external/zlib/_objs/zlib_native/adler32.d ... (remaining 40 arguments skipped)
gcc: warning: : linker input file unused because linking not done
gcc: error: : linker input file not found: No such file or directory
ERROR: /intel-extension-for-tensorflow/itex/core/utils/BUILD:145:11: Compiling itex/core/utils/device_types.cc failed: (Exit 1): crosstool_wrapper_driver failed: error executing command external/local_config_dpcpp/crosstool_dpcpp/bin/crosstool_wrapper_driver -MD -MF bazel-out/k8-opt/bin/itex/core/utils/_objs/device_types/device_types.pic.d ... (remaining 52 arguments skipped)
gcc: warning: : linker input file unused because linking not done
gcc: error: : linker input file not found: No such file or directory
Target //itex/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 1.517s, Critical Path: 0.94s
INFO: 144 processes: 144 internal.
FAILED: Build did NOT complete successfully
(itex_build) [root@9c938b446fc0 intel-extension-for-tensorflow]#

yinghu5 commented 1 year ago

Hi Yanfei, @YanfeiXu

Just curious, what kind of dGPU are you using? If possible, could you please try installing ITEX directly and see if it works?

And if you have to build from source, do you have all the requirements, such as the GPU driver and the oneAPI Base Toolkit, installed as described at

https://intel.github.io/intel-extension-for-tensorflow/latest/docs/install/how_to_build.html ?

YanfeiXu commented 1 year ago

I finally built it successfully in the Docker container provided by Wenjun. It looks like basic libraries relevant to the dGPU were missing in my CentOS environment.