Open AuroraPerego opened 5 months ago
unresolved external symbol puts at offset 356 in instructions segment #0
This error message is quite clear: apparently the kernel calls into puts() from the C standard library, which is not supported on device. I can't say why puts ends up in kernel code without seeing your code, but my guess is that you are using cout or similar.
EDIT: Missed your code snippet. Using printf inside kernels is illegal in SYCL.
@fodinabor - probably host JIT does not take into account the --gcc-install-dir given at compile time, right? Do we even still need -lstdc++? libc should be enough at JIT time, right?
Thanks, without the printf everything works apart from the host device, which continues to fail.
There is another, related reason why I opened this issue: kernels that use any_of_group or shift_group_[left/right] fail to compile with a "JIT session error: Symbols not found".
As an example, this:
#include <cstdlib>
#include <iostream>
#include <vector>

#include <sycl/sycl.hpp>

// Loops on a group-wide vote; every work-item clears its flag after the first pass.
void kernel(sycl::nd_item<1> item) {
  bool more = true;
  while ((item.barrier(), sycl::any_of_group(item.get_group(), more))) {
    more = false;
  }
  item.barrier();
}

int main(int argc, char** argv) {
  int threadsPerBlock = 64;
  int blocks = 1;
  static const std::vector<sycl::device> devices =
      sycl::device::get_devices(sycl::info::device_type::all);
  if (argc < 2) {
    std::cout << "Provide the device number N: ./any_of_group N" << std::endl;
    return 1;
  }
  // The device is selected by its index in the list returned by get_devices().
  auto stream = sycl::queue{devices[atoi(argv[1])]};
  std::cerr << "stream offload to "
            << stream.get_device().get_info<sycl::info::device::name>() << " ["
            << stream.get_device().get_info<sycl::info::device::driver_version>()
            << "]" << std::endl;
  stream.submit([&](sycl::handler& cgh) {
    cgh.parallel_for(
        sycl::nd_range<1>(blocks * threadsPerBlock, threadsPerBlock),
        [=](sycl::nd_item<1> item) {
          kernel(item);
        });
  });
}
fails with the error (Intel GPU - OpenCL, the other backends just say that the kernel could not be constructed):
[AdaptiveCpp Error] from /data/user/aperego/AdaptiveCpp/src/runtime/ocl/ocl_code_object.cpp:90 @ ocl_executable_object(): ocl_code_object: Building CL program failed. Build log: error : unresolved external symbol _ZN7hipsycl4sycl6detail13sscp_builtins22__hipsycl_any_of_groupILi1EEEbNS0_5groupIXT_EEEb at offset 436 in instructions segment #0 (aka kernel : _Z21__hipsycl_sscp_kernelIN7hipsycl4glue15__sscp_dispatch20ndrange_parallel_forIZZ4mainENK3$_0clERNS0_4sycl7handlerEEUlNS5_7nd_itemILi1EEEE_Li1EEEEvRKT_)
error : unresolved external symbol _ZN7hipsycl4sycl6detail13sscp_builtins22__hipsycl_any_of_groupILi1EEEbNS0_5groupIXT_EEEb at offset 516 in instructions segment #0 (aka kernel : _Z21__hipsycl_sscp_kernelIN7hipsycl4glue15__sscp_dispatch20ndrange_parallel_forIZZ4mainENK3$_0clERNS0_4sycl7handlerEEUlNS5_7nd_itemILi1EEEE_Li1EEEEvRKT_)
(error code = CL:-42)
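The unresolved symbol in the build log is an Itanium-ABI mangled name, so demangling it makes it easier to see which builtin is missing. The following is a minimal sketch, assuming a GCC or Clang toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>`; the helper name `demangle` is hypothetical and not part of AdaptiveCpp.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)

#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`,
// or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
  int status = 0;
  char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || raw == nullptr) {
    return mangled;  // not a valid mangled name (or allocation failure)
  }
  std::string readable(raw);
  std::free(raw);  // the buffer returned by __cxa_demangle is malloc-allocated
  return readable;
}
```

Passing the symbol from the log above to `demangle` (or piping it through `c++filt`) should show a function named `__hipsycl_any_of_group` inside `hipsycl::sycl::detail::sscp_builtins`, i.e. the device-side builtin behind `sycl::any_of_group`.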
This is expected; group algorithms other than group_barrier are not yet implemented for the generic JIT compiler.
AdaptiveCpp assumes a correctly configured clang, e.g. by using clang configuration files and building AdaptiveCpp against a wrapper clang that automatically sets the necessary flags, as described here: https://github.com/AdaptiveCpp/AdaptiveCpp/blob/develop/doc/install-llvm.md#gcc-toolchainc-standard-library-is-in-a-non-standard-location
Passing --gcc-install-dir to acpp is probably not the correct way to handle this, because --gcc-install-dir will not be available at JIT time.
I strongly suspect that a correct clang installation with clang configuration files will resolve the remaining issue.
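As a rough illustration of what such a clang configuration file could look like: this is a sketch only, assuming a clang version that reads a `clang++.cfg` file located next to the `clang++` binary and that supports `--gcc-install-dir`. The path below is a placeholder, not the reporter's actual GCC location.

```
# clang++.cfg (placed in the same directory as the clang++ executable)
# Placeholder path: point this at the directory that contains crtbeginS.o.
--gcc-install-dir=/path/to/gcc/lib/gcc/x86_64-pc-linux-gnu/11.4.1
```

With a file like this in place, the flag would no longer have to be passed to acpp on the command line, which is the situation the linked install-llvm.md section describes.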
Bug summary
A simple application compiles but fails to run with the JIT compiler. AOT compilation with CUDA results in a working executable. I'm afraid I've missed something when building acpp, but I can't figure out what's wrong. The CPU through OpenCL works, while the CUDA backend, the Intel GPU (with both OpenCL and Level Zero), and the host device fail.
CUDA output:
Intel GPU (OpenCL):
Host:
In this case, it cannot find crtbeginS.o and -lstdc++ (at least the first is in the path set with the --gcc-install-dir flag).
To Reproduce
test.cpp:
compile and run:
Expected behavior
It should run without errors.
Describe your setup
develop@57773099 + PR 1031
and built with: gcc (GCC) 11.4.1 20230601
Optional additional diagnostic information
syclcc --hipsycl-version
I see: