Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Target x86_64 RTL --> Target library loading error: undefined symbol with global constexpr T[] #45498

Open Quuxplusone opened 4 years ago

Quuxplusone commented 4 years ago
Bugzilla Link PR46530
Status NEW
Importance P normal
Reported by Jeffrey Kelling (gcc.j.kelling@hzdr.de)
Reported on 2020-07-01 09:05:53 -0700
Last modified on 2021-03-29 08:26:58 -0700
Version unspecified
Hardware PC Linux
CC a.bataev@hotmail.com, jdoerfert@anl.gov, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments helloWorld.ii.gz (703008 bytes, application/gzip)
Blocks
Blocked by
See also

Created attachment 23668 Preprocessed code, gz

Problem description:

When a global constexpr array is accessed through member operator/function under some condition which are not yet clear to me, a symbol for the global array is reported missing although no such symbol should generated.

Example

In the attached code the following key point happen:

Expected result

Running the code, the last line on stdout should be: -1.000000 -1.000000 (there will be one of these per thread and some junk before)

Actual result

exit(1) is called with the error message:

Libomptarget fatal error 1: failure of target construct while offloading is mandatory

using libomptarget build with debug enabled and setting LIBOMPTARGET_DEBUG=1 yields the following error message:

Target x86_64 RTL --> Target library loading error: /tmp/tmpfile_SsIv8d: undefined symbol: _ZN27pmacc_static_const_storage031pmacc_static_const_vector_host0L29DriftParamIons_direction_dataE

The missing symbol reads

pmacc_static_const_storage0::pmacc_static_const_vector_host0::DriftParamIons_direction_data

which is the constexpr defined at line 101181 .

Example Code

Attached is the preprocessed code described above, I could not find a small reproducer by only implementing the properties outlined above. Running clang++ with -E appears to have duplicated code, which I guess is for offload? The code in lines 101181ff reappears at 202434ff.

The attached code can be compiled with:

clang++  -fopenmp -fopenmp=libomp -fopenmp-targets=x86_64-pc-linux-gnu    -fopenmp=libomp helloWorld.ii -lomp -o helloWorld

and ran using

./helloWorld

.

Additional Observations

Applying the following diff:

101201c101201
<             , stor[2]
---
>             , -1.
202454c202454
<             , stor[2]
---
>             , -1.

removes the error and produces the expected result. Note, that this only removes the indirect access to DriftParamIons_direction_data but keeps the direct access, so the global constexpr array is not the issue.

Quuxplusone commented 4 years ago

Attached helloWorld.ii.gz (703008 bytes, application/gzip): Preprocessed code, gz

Quuxplusone commented 4 years ago

Could you provide the non-preprocessed code too? Even if it is not all of it, the declaration and the illigel use of the constexpr global would be sufficient.

Quuxplusone commented 4 years ago

The full code is here: https://github.com/jkelling/alpaka/tree/clang46530

(if you clone this note, that you need to checkout the branch clang46530)

You can configure the build using cmake like this:

mkdir build
cd build
cmake ..  -DOpenMP_CXX_VERSION=5 \
        -DALPAKA_ACC_ANY_BT_OMP5_ENABLE=on \
        -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLE=off \
        -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLE=off \
        -DALPAKA_ACC_CPU_B_SEQ_T_FIBERS_ENABLE=off \
        -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLE=on \
        -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLE=off \
        -DALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLE=off \
        -DALPAKA_OFFLOAD_MAX_BLOCK_SIZE=1 \
        -DALPAKA_ACC_GPU_CUDA_ENABLE=off \
        -DALPAKA_ACC_GPU_HIP_ENABLE=off \
        -DBUILD_TESTING=off -Dalpaka_BUILD_EXAMPLES=on \
        -DCMAKE_CXX_FLAGS="-fopenmp -fopenmp=libomp -fopenmp-targets=x86_64-pc-linux-gnu" \
        -DCMAKE_EXE_LINKER_FLAGS="-fopenmp"

On my system, I also have to add a -L[path to by cland install]/lib to -DCMAKE_EXE_LINKER_FLAGS otherwise it will link the wrong libomp which makes the binary crash before main(), I guess the cmake code needs to be fixed to do the OMP5 build correctly.

The target with the problem is "helloWorld". https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp

The constexpr is "declared" here: https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L35 but this is behind a few layers of macros.

Quuxplusone commented 4 years ago
The access to the global consexpr which works is just a bit further down, here:
https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L54
The next line is the access to the same element which triggers the error:
https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L55
Quuxplusone commented 4 years ago

I don't get this.

Where is the stor data defined? The way I see this, constexpr ConstArrayStorage<double, 3> stor; only defines a "view" and not an array. The operator[] of that view accesses some array that has to be defined elsewhere (via return PMACC_JOIN(Name,_data)[idx];).

CONST_VECTOR( float_X, 3, DriftParamIons_direction, 0.0, 0.0, -1.0 ); defines the vector for DriftParamIons_direction, which doesn't cause a problem apparently. Are you missing something like CONST_VECTOR( double, 3, stor, 0.0, 0.0, -1.0 );

?

Quuxplusone commented 3 years ago

(In reply to Johannes Doerfert from comment #4)

Where is the stor data defined? It is DriftParamIons_direction_data itself, as used in https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L55 . This is created by the macro invocation in https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L35 .

The way I see this, constexpr ConstArrayStorage<double, 3> stor; only defines a "view" and not an array. The operator[] of that view accesses some array that has to be defined elsewhere (via return PMACC_JOIN(Name,_data)[idx];).

CONST_VECTOR( float_X, 3, DriftParamIons_direction, 0.0, 0.0, -1.0 ); defines the vector for DriftParamIons_direction, which doesn't cause a problem apparently. Are you missing something like CONST_VECTOR( double, 3, stor, 0.0, 0.0, -1.0 ); No. stor is just a local name, but it is correct, that this is just a view. The type of stor is ConstArrayStorage<double, 3>, which is actually pmacc_static_const_storage0::ConstArrayStorage<double, 3>. The the macro invoked in helloWorld.cpp:35 creates this namespace and the view type as well as the constexpr _data array inside it.

Note, that there is a using namespace ... in https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/include/pmacc/math/ConstVector.hpp#L88 which makes it so that helloWorld.cpp:51 (and helloWorld.cpp:54) even compile. If there were more than one invocation of CONST_VECTOR, the declaration constexpr ConstArrayStorage<double, 3> stor; would be ambiguous.

I pushed a change to give the fully qualified namespaces in https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L51 and https://github.com/jkelling/alpaka/blob/clang46530/example/helloWorld/src/helloWorld.cpp#L54 to hopefully remove the confusion.

Both line 54 and 55 access the same constant. Only, line 55 is going through an inline constexpr operator[] .

I hope this helps.

Quuxplusone commented 3 years ago

Try to mark this constexpr variable as declare target. Compiler not always optimizes such variables in the frontend and it may try to emit it. But since this variable is not marked as declare target, it won't be emitted for the device and you might end up with the unresolved link.

Quuxplusone commented 3 years ago

Declaring it as declare target works.

Not sure whether this explicitness is mandated by the standard or if this is still a bug.

Quuxplusone commented 3 years ago

(In reply to Jeffrey Kelling from comment #7)

Declaring it as declare target works.

Not sure whether this explicitness is mandated by the standard or if this is still a bug.

I don't remember a special requirement about this in the standard. Of course, we can handle it in a different way but currently the standard does not distinguish constexpr/regular global vars, AFAIK

Quuxplusone commented 3 years ago

Arguably our diagnostic is not great. We should either warn/error or do the right thing.