ROCm / ROCm-OpenCL-Driver

ROCm OpenCL Compiler Tool Driver
MIT License

OpenCL compilation failed with kernels that call get_global_id(). #58

Open behindthepixels opened 6 years ago

behindthepixels commented 6 years ago

Hi,

I synced this repo along with RadeonOpenCompute/llvm, RadeonOpenCompute/clang and RadeonOpenCompute/lld, and built clang as well as this compiler, hoping to compile OpenCL source code to a binary that I can load directly into my program using clCreateProgramWithBinary.

I need to do this because the compiler that comes with the AMD APP SDK doesn't provide a way to use cross-lane instructions like swizzle, and I found that the LLVM-based compiler here supports them through intrinsics.

However, this compiler simply isn't working for me. I manually added the following options to the roc-cl project:

    options.emplace_back("-target amdgcn-amd-amdhsa-opencl");
    options.emplace_back("-cl-std=CL2.0");
    options.emplace_back("-Xclang");
    options.emplace_back("-finclude-default-header");

Then I tried compiling a simple kernel like the following:

    __kernel void test(__global uint* output)
    {
        uint thread = get_global_id(0);
        output[thread] = 0;
    }

And the compilation failed with the following error message:

    D:\Coding\CoolProjects\llvm_rocm\build\RelWithDebInfo\bin\ld.lld: error: can't create dynamic relocation R_AMDGPU_REL32_LO against symbol: get_global_id(unsigned int) in readonly segment; recompile object files with -fPIC
    defined in C:\Users\edliu\AppData\Local\Temp\CLMiner_kernel-a5bb02.o
    referenced by C:\Users\edliu\AppData\Local\Temp\CLMiner_kernel-a5bb02.o:(test)

    D:\Coding\CoolProjects\llvm_rocm\build\RelWithDebInfo\bin\ld.lld: error: can't create dynamic relocation R_AMDGPU_REL32_HI against symbol: get_global_id(unsigned int) in readonly segment; recompile object files with -fPIC
    defined in C:\Users\edliu\AppData\Local\Temp\CLMiner_kernel-a5bb02.o
    referenced by C:\Users\edliu\AppData\Local\Temp\CLMiner_kernel-a5bb02.o:(test)
    clang.exe: error: ld.lld command failed with exit code 1 (use -v to see invocation)

The error seems to be caused by get_global_id(). Without that call, the kernel compiles with no error.

Is this a known issue? Thanks.

behindthepixels commented 6 years ago

Directly calling the clang built from RadeonOpenCompute/clang with the same command options also has the same error.

So are these compilers in a usable state? I have tried using the amd-common branch as well as the roc-1.7x branch.

yxsamliu commented 6 years ago

You need to link with ROCm-Device-Libs, which implements the OpenCL builtin functions, e.g. get_global_id(). You can clone clang-ocl and use it for compiling and linking. It is a script.

behindthepixels commented 6 years ago

I see, thanks! If linking with ROCm-Device-Libs is required, why should I be able to build it without linking it? And for clang-ocl, what tool should I use to run the script? @yxsamliu

And which branch should I use in general? I saw the default is amd-common, but is it always guaranteed to be runnable given its high update frequency?

behindthepixels commented 6 years ago

I am having some trouble building ROCm-Device-Libs, filed an issue here: https://github.com/RadeonOpenCompute/ROCm-Device-Libs/issues/54

emankov commented 6 years ago

> If linking with ROCm-Device-Libs is required, why should I be able to build it without linking it?

It is not required.

yxsamliu commented 6 years ago

If the kernel does not call builtin functions implemented by the device libs, it can be compiled to ISA without linking them. Otherwise, the build ends with missing symbols or undefined functions. clang-ocl is a bash script. amd-common should usually work, since it goes through comprehensive regression tests.
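
For readers who drive clang from their own C++ code, as in the roc-cl snippet earlier in the thread, the following is a minimal sketch of an argument list that also links device-library bitcode so that builtins such as get_global_id() can resolve. It is not taken from roc-cl or clang-ocl. The helper name buildClangArgs, the file names, and the GPU name are hypothetical placeholders, and the sketch assumes that the clang build in use accepts the cc1 option -mlink-bitcode-file and that the device libs were built into bitcode files you can point at.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: assembles clang arguments for an OpenCL kernel that
// calls builtins implemented in the device libs. Each element is one argv
// token. Paths, bitcode file names, and the mcpu value are placeholders
// supplied by the caller; none of them come from the original thread.
std::vector<std::string> buildClangArgs(const std::string& source,
                                        const std::string& output,
                                        const std::string& mcpu,
                                        const std::vector<std::string>& deviceLibs) {
    std::vector<std::string> args = {
        "-x", "cl",                              // treat the input as OpenCL C
        "-target", "amdgcn-amd-amdhsa-opencl",   // triple used in the thread
        "-mcpu=" + mcpu,                         // target GPU, e.g. a gfx name
        "-cl-std=CL2.0",
        "-Xclang", "-finclude-default-header",   // declares the builtins
    };
    // Assumption: each device-lib bitcode file is linked in through the
    // cc1 option -mlink-bitcode-file, forwarded with -Xclang.
    for (const auto& lib : deviceLibs) {
        args.push_back("-Xclang");
        args.push_back("-mlink-bitcode-file");
        args.push_back("-Xclang");
        args.push_back(lib);
    }
    args.push_back(source);
    args.push_back("-o");
    args.push_back(output);
    return args;
}
```

A call such as buildClangArgs("kernel.cl", "kernel.co", "gfx803", {"opencl.amdgcn.bc", "ockl.amdgcn.bc"}) yields one string per token. Which bitcode files are actually needed depends on the ROCm-Device-Libs build, so the list here is only illustrative.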