llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Inconsistency in commandline options with multiple OpenCL vendor libraries installed #29935

Open llvmbot opened 7 years ago

llvmbot commented 7 years ago
Bugzilla Link 30587
Version trunk
OS Linux
Reporter LLVM Bugzilla Contributor
CC @AnastasiaStulova,@foutrelis,@Oblomov,@jdm,@pjaaskel,@v-fox

Extended Description

I used llvm+clang 3.9 to build several OpenCL vendor libraries, using ocl-icd as loader. The libs I tested were mesa, beignet and pocl.

Any combination of more than one of those libs installed results in the error below when trying to use an OpenCL application (tested with clinfo, https://github.com/Oblomov/clinfo).

=============
: CommandLine Error: Option 'enable-value-profiling' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
=============

When building the same libs with llvm+clang 3.8, everything works as expected.

v-fox commented 6 years ago

This seems to be related to llvm/llvm-project#23326; possible blocker?

In #22952, an option to use LLVM_DYLIB_COMPONENTS is mentioned. Is there a known way to work around this issue with it? It is quite ridiculous that, because of this, no GNU/Linux distribution can provide universal hardware support on x86 (via the aforementioned Mesa/Clover for AMD GPUs, Beignet for Intel GPUs and POCL for CPUs). That said, shouldn't staff from AMD and Intel be heavily involved in fixing this?

People at
https://cgit.freedesktop.org/mesa/mesa/log/src/gallium/state_trackers/clover/llvm
https://cgit.freedesktop.org/beignet/log/backend/src/llvm
https://github.com/pocl/pocl/commits/master/lib/llvmopencl
Mainly Marek Olšák <marek.olsak@amd.com>, Nicolai Hähnle <nicolai.haehnle@amd.com>, Jan Vesely <jan.vesely@rutgers.edu>, Yang Rong <rong.r.yang@intel.com>, and Pekka Jääskeläinen here.

AnastasiaStulova commented 6 years ago

I will try to re-classify this bug hoping it will get the attention of the right people.

It feels like something fundamental in LLVM's design. Perhaps it would make sense to start a thread on llvm-dev. Please CC me if you do!

fe221a9f-9db9-44e9-9909-246d58be1e1f commented 6 years ago

An additional piece of information: the situation now seems to be the reverse of what it used to be up to LLVM 3.8.

It used to be that for multiple OpenCL ICDs depending on LLVM to work, they had to dynamically link to the same version of the library, or a number of context-related functions would fail.

Now, if they all depend on the same LLVM version, the mentioned multiple-registration error aborts any program trying to use OpenCL, but if each ICD links to a different version, it works.

I currently have Debian sid's Mesa ICD, which uses version 5. If I build POCL using version 4 (for example) and Beignet with version 3.8 (for example), it works. If they are all built with version 5, the infamous error about multiple registered options appears.

fe221a9f-9db9-44e9-9909-246d58be1e1f commented 6 years ago

This seems to be related to llvm/llvm-project#23326; possible blocker?

fe221a9f-9db9-44e9-9909-246d58be1e1f commented 7 years ago

I'm not sure this is only an issue of mixed dynamic and static linking. I am seeing this issue even with OpenCL ICDs that link dynamically to the same LLVM version. I'm on Debian unstable and I'm using the distribution's LLVM 4.0 development packages. I compile both pocl and beignet specifying LLVM 4.0, and verify with ldd that they are using the same library (it shows libLLVM-4.0.so.1 for both libgbe and pocl). Trying to run clinfo fails with the error:

: CommandLine Error: Option 'enable-value-profiling' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

This is a pretty severe issue for any distribution shipping multiple OpenCL ICDs, as most FLOSS OpenCL platforms depend on LLVM and Clang for their OpenCL support. For example, I have Mesa (distro), Beignet and POCL (plus a few non-free ones).

Up until LLVM 3.7, and possibly 3.8, it used to be that as long as all platforms linked (dynamically) to the same LLVM version, they could co-exist. This is not the case anymore. Hence, this is a regression that effectively prevents dynamic linking for ICDs (or any similar plugin system).

Debian maintainers plan on statically linking LLVM to avoid this issue, but at this point there's no guarantee that this will be sufficient, and honestly, having three statically linked copies of LLVM just to work around this issue is a bit excessive.

pjaaskel commented 7 years ago

The command line option error is the classic problem caused by LLVM's global-object-based command line switch registration. If the same global object (the one that registers the command line option) ends up in a process twice or more, for example because the LLVM library is loaded more than once via dynamic loading, its initializer registers the same command line option again, resulting in this error.

I think here it happens via the ICD loader: it loads library a), which gets LLVM's command line switch object registered first, then it loads library b), which also uses the same LLVM and has the same command line global object linked in statically.

I have typically preferred linking to the LLVM lib dynamically due to this issue, among others. Then the switch object is linked in only once per process, thanks to the dynamic loader detecting that the LLVM lib has already been loaded by a) when b) requests it.

This is not a bug in LLVM as such, unless one considers the global-object-based command line switch registration itself a bug. This error would probably just go away if the command line handler silently ignored multiple identical command line switch registrations.
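
The mechanism described above can be sketched in a few lines. This is a minimal model, not LLVM's actual code: `OptionRegistry`, `load_copy_of_codegen` and the `tolerate_duplicates` switch are hypothetical names invented for illustration. It models one process-wide table receiving the same option name from two copies of a library's static initializers, and how the "silently ignore identical registrations" idea would change the outcome.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a process-wide table of command line options.
class OptionRegistry {
public:
    // tolerate_duplicates == false models the strict behaviour reported in
    // this thread; true models the "ignore identical registrations" idea.
    explicit OptionRegistry(bool tolerate_duplicates)
        : tolerate_duplicates_(tolerate_duplicates) {}

    void add(const std::string &name) {
        assert(!name.empty() && "an option needs a name");
        const bool inserted = names_.insert(name).second;
        if (!inserted && !tolerate_duplicates_) {
            // The sketch throws so the failure can be observed; the error
            // quoted in this thread terminates the process instead.
            throw std::runtime_error("Option '" + name +
                                     "' registered more than once!");
        }
    }

    std::size_t size() const { return names_.size(); }

private:
    bool tolerate_duplicates_;
    std::set<std::string> names_;
};

// Stands in for the static initializers that run when one statically
// linked copy of the option-defining library is loaded into the process.
void load_copy_of_codegen(OptionRegistry &registry) {
    registry.add("enable-value-profiling");
}
```

Calling `load_copy_of_codegen` twice on a strict registry corresponds to the ICD loader pulling in two vendor libraries that each carry their own copy of the option; with `tolerate_duplicates` set, the second call is a no-op.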

AnastasiaStulova commented 7 years ago

Since the failing option comes from the CodeGen lib I classified the component accordingly.

As far as I can see, that option was added more than a year ago. It uses the standard mechanism for internal options.

I am still not convinced that the issue is not in the wrong use of Clang libraries. I believe that there are multiple instances of the same libraries that register the option in the same address space, and therefore the error is reported. I don't think this is something that should happen though. In particular, when you talk about loading multiple OpenCL implementations, does this imply multiple instances of the same Clang libraries linked together? It might be that it worked before just by chance and the issue started to be exposed after that CodeGen option was added. But I would wait for the final assessment from whoever knows more about that flag or the use of multiple Clang/LLVM libs together.

llvmbot commented 7 years ago

Trying my best here to provide as much information as possible, but I'm not really sure what I'm looking for, so if you need anything specific, please let me know!

ocl-icd acts as a sort of wrapper around different OpenCL implementations, so you can have multiple implementations installed and the one best matching your system will be picked at runtime. This is an important concept for Linux distros that want to ship for a broad range of hardware.

Now, I'll try to explain what happens on the example of beignet, the opencl implementation for Intel GPUs. beignet ships a library called libgbe.so. The link command of this library looks like this (sorry, long):

===================================
/usr/libexec/icecc/bin/c++ -fPIC -O2 -fPIC -funroll-loops -fstrict-aliasing -msse2 -msse3 -mssse3 -msse4.1 -fPIC -Wall -mfpmath=sse -Wcast-align -Wl,-E -std=c++0x -Wno-invalid-offsetof -fno-rtti -I/usr/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DGBE_DEBUG_MEMORY=0 -DGBE_COMPILER_AVAILABLE=1 -fvisibility=hidden -O2 -DNDEBUG -DGBE_DEBUG=0 -Wl,-Bsymbolic -Wl,--no-undefined -L/usr/lib64 -shared -Wl,-soname,libgbe.so -o libgbe.so CMakeFiles/gbe.dir/sys/intrusive_list.cpp.o CMakeFiles/gbe.dir/sys/assert.cpp.o CMakeFiles/gbe.dir/sys/alloc.cpp.o CMakeFiles/gbe.dir/sys/mutex.cpp.o CMakeFiles/gbe.dir/sys/platform.cpp.o CMakeFiles/gbe.dir/sys/cvar.cpp.o CMakeFiles/gbe.dir/ir/context.cpp.o CMakeFiles/gbe.dir/ir/profile.cpp.o CMakeFiles/gbe.dir/ir/type.cpp.o CMakeFiles/gbe.dir/ir/unit.cpp.o CMakeFiles/gbe.dir/ir/constant.cpp.o CMakeFiles/gbe.dir/ir/sampler.cpp.o CMakeFiles/gbe.dir/ir/image.cpp.o CMakeFiles/gbe.dir/ir/half.cpp.o CMakeFiles/gbe.dir/ir/instruction.cpp.o CMakeFiles/gbe.dir/ir/liveness.cpp.o CMakeFiles/gbe.dir/ir/register.cpp.o CMakeFiles/gbe.dir/ir/function.cpp.o CMakeFiles/gbe.dir/ir/value.cpp.o CMakeFiles/gbe.dir/ir/lowering.cpp.o CMakeFiles/gbe.dir/ir/profiling.cpp.o CMakeFiles/gbe.dir/ir/printf.cpp.o CMakeFiles/gbe.dir/ir/immediate.cpp.o CMakeFiles/gbe.dir/ir/structurizer.cpp.o CMakeFiles/gbe.dir/ir/reloc.cpp.o CMakeFiles/gbe.dir/backend/context.cpp.o CMakeFiles/gbe.dir/backend/program.cpp.o CMakeFiles/gbe.dir/llvm/llvm_sampler_fix.cpp.o CMakeFiles/gbe.dir/llvm/llvm_bitcode_link.cpp.o CMakeFiles/gbe.dir/llvm/llvm_gen_backend.cpp.o CMakeFiles/gbe.dir/llvm/llvm_passes.cpp.o CMakeFiles/gbe.dir/llvm/llvm_scalarize.cpp.o CMakeFiles/gbe.dir/llvm/llvm_intrinsic_lowering.cpp.o CMakeFiles/gbe.dir/llvm/llvm_barrier_nodup.cpp.o CMakeFiles/gbe.dir/llvm/llvm_printf_parser.cpp.o CMakeFiles/gbe.dir/llvm/llvm_profiling.cpp.o CMakeFiles/gbe.dir/llvm/ExpandConstantExpr.cpp.o CMakeFiles/gbe.dir/llvm/ExpandUtils.cpp.o CMakeFiles/gbe.dir/llvm/PromoteIntegers.cpp.o CMakeFiles/gbe.dir/llvm/ExpandLargeIntegers.cpp.o CMakeFiles/gbe.dir/llvm/llvm_device_enqueue.cpp.o CMakeFiles/gbe.dir/llvm/StripAttributes.cpp.o CMakeFiles/gbe.dir/llvm/llvm_to_gen.cpp.o CMakeFiles/gbe.dir/llvm/llvm_loadstore_optimization.cpp.o CMakeFiles/gbe.dir/llvm/llvm_unroll.cpp.o CMakeFiles/gbe.dir/backend/gen/gen_mesa_disasm.c.o CMakeFiles/gbe.dir/backend/gen_insn_selection.cpp.o CMakeFiles/gbe.dir/backend/gen_insn_selection_optimize.cpp.o CMakeFiles/gbe.dir/backend/gen_insn_scheduling.cpp.o CMakeFiles/gbe.dir/backend/gen_insn_selection_output.cpp.o CMakeFiles/gbe.dir/backend/gen_reg_allocation.cpp.o CMakeFiles/gbe.dir/backend/gen_context.cpp.o CMakeFiles/gbe.dir/backend/gen75_context.cpp.o CMakeFiles/gbe.dir/backend/gen8_context.cpp.o CMakeFiles/gbe.dir/backend/gen9_context.cpp.o CMakeFiles/gbe.dir/backend/gen_program.cpp.o CMakeFiles/gbe.dir/backend/gen_insn_compact.cpp.o CMakeFiles/gbe.dir/backend/gen_encoder.cpp.o CMakeFiles/gbe.dir/backend/gen7_encoder.cpp.o CMakeFiles/gbe.dir/backend/gen75_encoder.cpp.o CMakeFiles/gbe.dir/backend/gen8_encoder.cpp.o CMakeFiles/gbe.dir/backend/gen9_encoder.cpp.o -ldrm_intel -ldrm -ldrm -Wl,-Bstatic -lclangFrontend -lclangSerialization -lclangDriver -lclangCodeGen -lclangSema -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore -lclangAnalysis -lclangEdit -lclangAST -lclangParse -lclangSema -lclangLex -lclangBasic -Wl,-Bdynamic -lLLVM-3.9 -lrt -ldl -ltinfo -lpthread -lz -lm -lpthread -ldl -Wl,-Bstatic -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore -lclangAnalysis -lclangEdit -lclangAST -lclangParse -lclangLex -lclangBasic -Wl,-Bdynamic -lLLVM-3.9 -lrt -ldl -ltinfo -lpthread -lz -lm -lpthread -ldl

The important part is the list of clang libraries linked, which are all static libs. Checking those libs with "strings" for "enable-value-profiling" reveals that the option is defined in libclangCodeGen.a.
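
The same kind of check can be done programmatically. The following is a rough sketch, assuming the option name is stored as a plain byte string inside the archive; `file_contains` is a hypothetical helper written for this illustration, and unlike `strings` it searches the raw bytes rather than filtering for printable runs.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Returns true if the file at `path` contains the byte sequence `needle`.
// Returns false if the file cannot be opened.
bool file_contains(const std::string &path, const std::string &needle) {
    assert(!needle.empty() && "an empty needle would match every file");
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    // Read the whole file into memory; fine for a one-off check, but a
    // streaming search would be kinder to very large archives.
    const std::string data((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    return data.find(needle) != std::string::npos;
}
```

A call such as `file_contains("/usr/lib64/libclangCodeGen.a", "enable-value-profiling")` would mirror the manual check; the path is an assumption and will differ between distributions.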

Now, if you load two OpenCL implementations (ocl-icd loads all available ones) which both linked libclangCodeGen.a, you get the error I described in my initial post.

I really don't see how the vendors could fix this, other than not linking libclangCodeGen.a. But if you have suggestions I'd be happy to open bugs with them.

AnastasiaStulova commented 7 years ago

At the moment there is not enough information to understand the problem. It is also not clear whether it is a problem with the compiler or with how it is used. Could you provide more details about what you do? It might be easier, though, to address it with the vendors/toolchains you use instead.

llvmbot commented 7 years ago

Indeed, there is no actual clang command line invocation here. Afaics all opencl libs I tested link a static clang library and that is where the problem comes from. Not sure which one it is though.

AnastasiaStulova commented 7 years ago

I am guessing you don't actually invoke clang from the command line here? Or if yes, it might be useful to see the command line that triggers this error.

I am wondering whether you should address this request with the vendors first.
