intel / intel-graphics-compiler

compute-runtime builds fail with "ocloc" LLVM (v10) CLI options usage error #202

Closed eero-t closed 3 years ago

eero-t commented 3 years ago

The compute-runtime "21.26.20194" release build fails due to incorrect LLVM (v10) CLI options usage:

RUN git clone --branch ${TAG_COMPUTE} --depth 1 https://github.com/intel/compute-runtime.git && \
    cd compute-runtime && mkdir build && cd build && \
    cmake -LH -Wno-dev -G Ninja \
      -DCMAKE_INSTALL_PREFIX=${INSTALL_DIR} -DCMAKE_BUILD_TYPE=Release \
      -DSUPPORT_GEN8=0 -DSUPPORT_GEN9=1 -DSUPPORT_GEN11=1 \
      -DSUPPORT_GEN12LP=1 -DSUPPORT_DG1=1 \
      -DDO_NOT_RUN_AUB_TESTS=1 -DDONT_CARE_OF_VIRTUALS=1 \
      ../ && \
    ninja
...
[291/2286] Generating ../../../../bin/built_ins/x64/gen12lp/bindful_copy_buffer_rect_Gen12LPlp.spv
FAILED: bin/built_ins/x64/gen12lp/bindful_copy_buffer_rect_Gen12LPlp.spv 
cd /home/nobody/source/compute-runtime/shared/source/built_ins/kernels && LD_LIBRARY_PATH=/home/nobody/source/compute-runtime/build/bin /home/nobody/source/compute-runtime/build/bin/ocloc -q -file copy_buffer_rect.builtin_kernel -spv_only -device tgllp -64 -output bindful_copy_buffer_rect -out_dir /home/nobody/source/compute-runtime/build/bin/built_ins/x64/gen12lp -options -cl-kernel-arg-info
: CommandLine Error: Option 'mc-relax-all' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

In case it matters, this build happens in an Ubuntu 20.10 container, with the latest IGC release "igc-1.0.7862" built using LLVM v10 just before it (which requires a workaround for the https://github.com/intel/intel-graphics-compiler/issues/186 bug).

Here's more of the output before that error:

...
-- The C compiler identification is GNU 10.3.0
-- The CXX compiler identification is GNU 10.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done

-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- branch dir list: /
-- WDK Directory: 
-- WDK Version is 
-- Driver model : drm
-- Release build configuration
-- Computed OpenCL version major is: 21
-- Computed OpenCL version minor is: 26
-- GTest repeat count set to 1
-- GTest shuffle set to --gtest_shuffle;--gtest_random_seed=0
-- Source Level Debugger headers dir: /home/nobody/source/compute-runtime/third_party/source_level_debugger
-- Aub Stream Headers dir: /home/nobody/source/compute-runtime/third_party/aub_stream/headers
-- Metrics Library dir: /home/nobody/source/compute-runtime/third_party/metrics_library
-- Metrics Discovery dir: /home/nobody/source/compute-runtime/third_party/metrics_discovery
-- i915 includes dir: /home/nobody/source/compute-runtime/third_party/uapi/dg1
-- Khronos OpenCL headers dir: /home/nobody/source/compute-runtime/third_party/opencl_headers
-- Khronos OpenGL headers dir: /home/nobody/source/compute-runtime/third_party/opengl_headers
-- Third party dir: /home/nobody/source/third_party
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2") 
-- Checking for module 'igc-opencl'
--   Found igc-opencl, version 1.0.1
-- IGC include dirs: /usr/local/include/igc;/usr/local/include/igc/cif;/usr/local/include/igc/ocl_igc_shared/executable_format;/usr/local/include/igc/ocl_igc_shared/device_enqueue
-- VISA Dir: /usr/local/include/visa
-- IGA Includes dir: /usr/local/include/iga
-- Checking for module 'igdgmm'
--   Found igdgmm, version 11.3.0
-- GmmLib include dirs: /usr/local/include/igdgmm;/usr/local/include/igdgmm/GmmLib;/usr/local/include/igdgmm/GmmLib/inc;/usr/local/include/igdgmm/inc;/usr/local/include/igdgmm/inc/common;/usr/local/include/igdgmm/util
-- Checking for module 'libva>=1.0.0'
--   Found libva, version 1.12.0
-- Looking for vaGetLibFunc in va
-- Looking for vaGetLibFunc in va - found
-- Using libva 
-- LibVA include dirs: /usr/local/include
-- AUB_STREAM_DIR = 
-- Engine node dir: /home/nobody/source/compute-runtime/third_party/aub_stream/headers
-- All supported platforms:  TGLLP DG1 RKL ADLS ICLLP LKF EHL SKL KBL GLK CFL BXT
-- All tested platforms:  TGLLP RKL ADLS ICLLP LKF EHL SKL KBL GLK CFL BXT
-- Default supported platform: SKL
-- Default tested platform: SKL
-- All supported core families: GEN9;GEN11;GEN12LP
-- All tested core families: GEN9;GEN11;GEN12LP
-- Default tested family name: SKLFamily
-- Performing Test COMPILER_SUPPORTS_INDIRECT_BRANCH_THUNK
-- Performing Test COMPILER_SUPPORTS_INDIRECT_BRANCH_THUNK - Failed
CMake Warning at CMakeLists.txt:824 (message):
  Spectre mitigation -mindirect-branch=thunk flag is not supported by the
  compiler

-- Performing Test COMPILER_SUPPORTS_FUNCTION_RETURN_THUNK
-- Performing Test COMPILER_SUPPORTS_FUNCTION_RETURN_THUNK - Failed
CMake Warning at CMakeLists.txt:830 (message):
  Spectre mitigation -mfunction-return=thunk flag is not supported by the
  compiler

-- Performing Test COMPILER_SUPPORTS_INDIRECT_BRANCH_REGISTER
-- Performing Test COMPILER_SUPPORTS_INDIRECT_BRANCH_REGISTER - Success
-- All targets will use virtuals

-- GTest exception options set to --gtest_catch_exceptions=1
-- Level Zero driver version: 1.1.0
-- Found LevelZero: /usr/local/include  
-- LevelZero_INCLUDE_DIRS: /usr/local/include
-- Could NOT find LibXml2 (missing: LIBXML2_LIBRARY LIBXML2_INCLUDE_DIR) 
-- LibXml2 Library headers not available. Building without.
-- LibGenl headers not available. Building without
-- Could NOT find libigsc (missing: libigsc_LIBRARIES libigsc_INCLUDE_DIR) 
-- libigsc Library headers not available. Building without
Prebuilt kernels are linked to Level Zero.
-- Configuring done

-- Generating done
-- Build files have been written to: /home/nobody/source/compute-runtime/build
-- Cache values
// allow use of AppVerifier
APPVERIFIER_ALLOWED:BOOL=OFF

// allow use of ccache
CCACHE_ALLOWED:BOOL=ON

// Path to a program.
CCACHE_EXE_FOUND:FILEPATH=CCACHE_EXE_FOUND-NOTFOUND

// OpenCL program binary cache location
CL_CACHE_LOCATION:STRING=cl_cache

// Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel ...
CMAKE_BUILD_TYPE:STRING=Release

// Install path prefix, prepended onto install directories.
CMAKE_INSTALL_PREFIX:PATH=/usr/local

// Enable built-in kernels compilation
COMPILE_BUILT_INS:BOOL=TRUE

// Path to a program.
GIT:FILEPATH=/usr/bin/git

// generate gcov report
IGDRCL_GCOV:BOOL=OFF

// Install udev rules. An attempt to automatically determine the proper location will be made if UDEV_RULES_DIR is not set.
L0_INSTALL_UDEV_RULES:BOOL=OFF

// Use the default/verbose test output
L0_ULT_VERBOSE:BOOL=OFF

// Path to a file.
LIBGENL_INCLUDE_DIR:PATH=LIBGENL_INCLUDE_DIR-NOTFOUND

// Path to a file.
LevelZero_INCLUDE_DIR:PATH=/usr/local/include

// Path to a program.
PYTHON_EXECUTABLE:FILEPATH=/usr/bin/python3

// Use the default/verbose test output
SHOW_VERBOSE_UTESTS_RESULTS:BOOL=OFF

// Support ADLS
SUPPORT_ADLS:BOOL=TRUE

// Support BXT
SUPPORT_BXT:BOOL=TRUE

// Support CFL
SUPPORT_CFL:BOOL=TRUE

// Support GEN11 for device side enqueue
SUPPORT_DEVICE_ENQUEUE_GEN11:BOOL=TRUE

// Support GEN12LP for device side enqueue
SUPPORT_DEVICE_ENQUEUE_GEN12LP:BOOL=TRUE

// Support GEN8 for device side enqueue
SUPPORT_DEVICE_ENQUEUE_GEN8:BOOL=TRUE

// Support GEN9 for device side enqueue
SUPPORT_DEVICE_ENQUEUE_GEN9:BOOL=TRUE

// Support EHL
SUPPORT_EHL:BOOL=TRUE

// Support GEN11 devices
SUPPORT_GEN11:BOOL=1

// Support GEN12LP devices
SUPPORT_GEN12LP:BOOL=1

// Support GEN8 devices
SUPPORT_GEN8:BOOL=0

// Support GEN9 devices
SUPPORT_GEN9:BOOL=1

// default value for SUPPORT_GENx
SUPPORT_GEN_DEFAULT:BOOL=TRUE

// Support GLK
SUPPORT_GLK:BOOL=TRUE

// Support ICLLP
SUPPORT_ICLLP:BOOL=TRUE

// Support KBL
SUPPORT_KBL:BOOL=TRUE

// Support LKF
SUPPORT_LKF:BOOL=TRUE

// default value for support platform
SUPPORT_PLATFORM_DEFAULT:BOOL=TRUE

// Support RKL
SUPPORT_RKL:BOOL=TRUE

// Support SKL
SUPPORT_SKL:BOOL=TRUE

// Support TGLLP
SUPPORT_TGLLP:BOOL=TRUE

// Build ULTs for ADLS
TESTS_ADLS:BOOL=TRUE

// Build ULTs for BXT
TESTS_BXT:BOOL=TRUE

// Build ULTs for CFL
TESTS_CFL:BOOL=TRUE

// Build ULTs for EHL
TESTS_EHL:BOOL=TRUE

// Build ULTs for GEN11 devices
TESTS_GEN11:BOOL=1

// Build ULTs for GEN12LP devices
TESTS_GEN12LP:BOOL=1

// Build ULTs for GEN8 devices
TESTS_GEN8:BOOL=0

// Build ULTs for GEN9 devices
TESTS_GEN9:BOOL=1

// Build ULTs for GLK
TESTS_GLK:BOOL=TRUE

// Build ULTs for ICLLP
TESTS_ICLLP:BOOL=TRUE

// Build ULTs for KBL
TESTS_KBL:BOOL=TRUE

// Build ULTs for LKF
TESTS_LKF:BOOL=TRUE

// Build ULTs for RKL
TESTS_RKL:BOOL=TRUE

// Build ULTs for SKL
TESTS_SKL:BOOL=TRUE

// Build ULTs for TGLLP
TESTS_TGLLP:BOOL=TRUE

// Link with address sanitization support
USE_ASAN:BOOL=OFF

// Use OpenCL program binary cache
USE_CL_CACHE:BOOL=ON

// Build unit tests.
level-zero-gpu_BUILD_TESTS:BOOL=ON

// Path to a file.
libigsc_INCLUDE_DIR:PATH=libigsc_INCLUDE_DIR-NOTFOUND

// Path to a library.
libigsc_LIBRARIES:FILEPATH=libigsc_LIBRARIES-NOTFOUND

[1/2286] Building CXX object offline_compiler/source/CMakeFiles/ocloc.dir/main.cpp.o
...
[290/2286] Generating ../../../../bin/built_ins/x64/gen12lp/bindful_aux_translation_Gen12LPlp.spv
FAILED: bin/built_ins/x64/gen12lp/bindful_aux_translation_Gen12LPlp.spv 
cd /home/nobody/source/compute-runtime/shared/source/built_ins/kernels && LD_LIBRARY_PATH=/home/nobody/source/compute-runtime/build/bin /home/nobody/source/compute-runtime/build/bin/ocloc -q -file aux_translation.builtin_kernel -spv_only -device tgllp -64 -output bindful_aux_translation -out_dir /home/nobody/source/compute-runtime/build/bin/built_ins/x64/gen12lp -options -cl-kernel-arg-info
: CommandLine Error: Option 'mc-relax-all' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
[291/2286] Generating ../../../../bin/built_ins/x64/gen12lp/bindful_copy_buffer_rect_Gen12LPlp.spv
FAILED: bin/built_ins/x64/gen12lp/bindful_copy_buffer_rect_Gen12LPlp.spv 
cd /home/nobody/source/compute-runtime/shared/source/built_ins/kernels && LD_LIBRARY_PATH=/home/nobody/source/compute-runtime/build/bin /home/nobody/source/compute-runtime/build/bin/ocloc -q -file copy_buffer_rect.builtin_kernel -spv_only -device tgllp -64 -output bindful_copy_buffer_rect -out_dir /home/nobody/source/compute-runtime/build/bin/built_ins/x64/gen12lp -options -cl-kernel-arg-info
: CommandLine Error: Option 'mc-relax-all' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

eero-t commented 3 years ago

Same problem with the new "21.27.20266" release.

jbeich commented 3 years ago

Ditto on FreeBSD with 21.32.20609 (igc-1.0.8365) using llvm10 but llvm11 is fine.

JacekDanecki commented 3 years ago

Have you built IGC with the system LLVM, or with LLVM sources?

eero-t commented 3 years ago

It's system LLVM from Ubuntu.

JacekDanecki commented 3 years ago

What about opencl-clang and SPIRV-LLVM-Translator? Did you build them yourself or use the system packages? If you built them yourself, which method did you use (in-tree or out-of-tree build)?

eero-t commented 3 years ago

Those are also from the system.

eero-t commented 3 years ago

Same problem also with the latest IGC (igc-1.0.8517) and compute-runtime (21.34.20767).

Compute-runtime (still) does not build with LLVM v10 (in Ubuntu 20.10) any more:

apt install clang-10 libopencl-clang-dev opencl-headers llvm-10-dev liblld-10-dev llvm-spirv libllvmspirvlib-dev

but it does build with LLVM v11 (in Ubuntu 21.04):

apt install clang-11 libopencl-clang-dev opencl-headers llvm-11-dev liblld-11-dev llvm-spirv libllvmspirvlib-dev

(Using -DIGC_OPTION__LLVM_PREFERRED_VERSION=11 instead of -DIGC_OPTION__LLVM_PREFERRED_VERSION=10, with the -DINSTALL_SPIRVDLL=0 option naturally still remaining for IGC.)

JacekDanecki commented 3 years ago

This is not a Neo problem, but a result of how the opencl-clang library was compiled on Ubuntu, and I suppose the same applies on FreeBSD. To work around this issue, you need to recompile opencl-clang and llvm-spirv-translator with LLVM sources using the in-tree method. You don't have to recompile IGC after the opencl-clang/llvm-spirv-translator build. This is not specific to llvm10, but was also observed in previous llvm versions. Similar issues were observed in other projects using llvm static libraries on Ubuntu.

eero-t commented 3 years ago

> This is not specific to llvm10, but was also observed in previous llvm versions. Similar issues were observed in other projects using llvm static libraries on Ubuntu.

Any idea why it worked earlier with the same Ubuntu (20.10) version of LLVM v10, and still works with Ubuntu (21.04) LLVM v11?

JacekDanecki commented 3 years ago

As the issue is on the compiler side, I'm transferring it to the IGC project.

eero-t commented 3 years ago

> This is not a Neo problem, but a result of how the opencl-clang library was compiled on Ubuntu, and I suppose the same applies on FreeBSD. To work around this issue, you need to recompile opencl-clang and llvm-spirv-translator with LLVM sources using the in-tree method.

With LLVM sources? But the whole point of this bug is using system libraries, to avoid compiling the world (faster builds, smaller binaries and memory usage when multiple things share the same libs, easier security updates etc).

Bugs for other projects with similar issues seem to occur because they link multiple copies of LLVM, either different versions, or both dynamic & static versions. In my case there's only a single LLVM version installed in the build container, but the LLVM dev packages bring in both dynamic & static libraries.

@JacekDanecki Are you saying that, when requested to use system versions of libraries, IGC / compute-runtime choose to use static libraries, whereas Debian/Ubuntu packages use dynamic libraries?

If yes, how do things then work with Ubuntu (21.04) LLVM v11? Nothing in that respect should have changed; the new Debian/Ubuntu packages have just dropped their last patches for these projects:

Or if you just mean that newer versions of opencl-clang & the llvm spirv lib are needed, then that's a compute-runtime bug of not checking the versions of its dependencies...

JacekDanecki commented 3 years ago

Compute runtime doesn't use LLVM directly; it loads the IGC libraries. This is why I've moved this issue to the IGC project. IGC and its dependencies can be compiled using different methods (with LLVM sources, dynamic or static libraries), but for some reason there are problems on Ubuntu with opencl-clang when it is compiled separately against system libraries. I've not observed such issues on other distributions. I understand you want to use system libraries for many reasons, so I've only provided a workaround: compiling opencl-clang with LLVM sources. A real fix should be provided by the opencl-clang or IGC team.

Compute runtime was tested with a specific IGC version and its dependencies, and this list is provided in the Neo release notes. The deb packages provided on the Neo GitHub page have specific gmmlib and IGC versions set, as these are the main Neo dependencies. The opencl-clang and llvm-spirv versions used to build IGC with LLVM sources are provided in the IGC release notes. As for llvm11, let's wait for an answer from the compiler teams.

wenju-he commented 3 years ago

This is an issue in LLVM 10. It is fixed in LLVM 11 by https://reviews.llvm.org/D75579. Please check the commit message for details.

The opencl-clang and IGC libraries link to both libclang-cpp.so.10 and libLLVM-10.so.1, so "mc-relax-all" gets registered twice.

Cherry-picking it to llvm-10 requires cherry-picking https://reviews.llvm.org/D68063 as well and solving a couple of merge conflicts. After these changes, the error is gone and compute-runtime/build/bin/built_ins/x64/gen12lp/bindful_copy_buffer_rect_Gen12LPlp.spv is successfully generated.
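
The linkage described above can be checked on a given library by looking at its direct dependencies. A minimal sketch, assuming `readelf -d` output is piped in on stdin; the function name `needs_both_llvm_and_clang_cpp` is made up for illustration, and a match only confirms that both libraries are linked, not that the double registration actually occurs:

```shell
# Hypothetical helper (not part of IGC or compute-runtime).
# Reads `readelf -d <library>` output on stdin and succeeds only if the
# NEEDED entries name both libLLVM-<N>.so and libclang-cpp.so.
needs_both_llvm_and_clang_cpp() {
    awk '
        /\(NEEDED\)/ && /libLLVM-[0-9]+\.so/ { llvm = 1 }
        /\(NEEDED\)/ && /libclang-cpp\.so/   { clang = 1 }
        END { exit !(llvm && clang) }
    '
}
```

For example, `readelf -d /path/to/libopencl-clang.so | needs_both_llvm_and_clang_cpp && echo "links both"`, where the path is a placeholder for wherever the distribution installs the library.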

pszymich commented 3 years ago

I'm closing this one as it was identified as an external issue. We now offer full LLVM 11 support, so you can use such a configuration to avoid the problem.

eero-t commented 3 years ago

> I'm closing this one as it was identified as an external issue. We now offer full LLVM 11 support, so you can use such a configuration to avoid the problem.

IMHO CMake could check that LLVM is a supported version, i.e. explicitly reject too old (or too new) versions.
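
Until such a check exists in CMake, a build script could do something similar before configuring. A minimal sketch in shell, not the project's CMake logic; the helper `llvm_major_supported` and the version bounds passed to it are assumptions, based only on the reports in this thread that LLVM 11 works and LLVM 10 fails:

```shell
# Hypothetical helper (not part of IGC or compute-runtime).
# Usage: llvm_major_supported <version-string> <min-major> <max-major>
# Succeeds only if the major version lies within [min-major, max-major].
# Note: plain POSIX sh has no local variables, so these are globals.
llvm_major_supported() {
    version=$1
    min=$2
    max=$3
    major=${version%%.*}          # "10.0.1" -> "10"
    case $major in
        ''|*[!0-9]*) return 2 ;;  # major part is empty or not a number
    esac
    [ "$major" -ge "$min" ] && [ "$major" -le "$max" ]
}
```

For example, `llvm_major_supported "$(llvm-config --version)" 11 11 || echo "unsupported LLVM version" >&2`. The `llvm-config` binary may carry a version suffix on some distributions, and the bounds here are placeholders rather than the project's documented support range.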

And the top-level README.md could point out the LLVM support info, e.g. in the Dependencies section, like this: