CHIP-SPV / chipStar

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

HIP-SYCL Interop failure using MKL 2024 #743

Closed pvelesko closed 7 months ago

pvelesko commented 9 months ago

2023.2.1 works, 2024 does not.

#0  __GI__dl_debug_state () at ./elf/dl-debug.c:116
116 ./elf/dl-debug.c: No such file or directory.
(gdb) bt
#0  __GI__dl_debug_state () at ./elf/dl-debug.c:116
#1  0x00007ffff7fccadd in _dl_map_object_from_fd (name=name@entry=0x7fffffffa760 "libigc.so.1", origname=origname@entry=0x0, fd=-1, fbp=<optimized out>, realname=<optimized out>, loader=loader@entry=0x0, l_type=<optimized out>,
    mode=<optimized out>, stack_endp=<optimized out>, nsid=<optimized out>) at ./elf/dl-load.c:1501
#2  0x00007ffff7fcd601 in _dl_map_object (loader=0x0, loader@entry=0x555556bc7aa0, name=name@entry=0x7fffffffa760 "libigc.so.1", type=type@entry=2, trace_mode=trace_mode@entry=0, mode=mode@entry=-1879048183, nsid=<optimized out>)
    at ./elf/dl-load.c:2327
#3  0x00007ffff7fd19a9 in dl_open_worker_begin (a=a@entry=0x7fffffffa400) at ./elf/dl-open.c:534
#4  0x00007fffc9d748a8 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:208
#5  0x00007ffff7fd0f9a in dl_open_worker (a=a@entry=0x7fffffffa400) at ./elf/dl-open.c:782
#6  0x00007fffc9d748a8 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:208
#7  0x00007ffff7fd134e in _dl_open (file=<optimized out>, mode=-2147483639,
    caller_dlopen=0x7fffc8678d75 <NEO::Linux::OsLibrary::OsLibrary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+133>,
    nsid=-2, argc=1, argv=<optimized out>, env=0x7fffffffc838) at ./elf/dl-open.c:883
#8  0x00007fffc9c9063c in dlopen_doit (a=a@entry=0x7fffffffa670) at ./dlfcn/dlopen.c:56
#9  0x00007fffc9d748a8 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffa5d0, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:208
#10 0x00007fffc9d74973 in __GI__dl_catch_error (objname=0x7fffffffa628, errstring=0x7fffffffa630, mallocedp=0x7fffffffa627, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:227
#11 0x00007fffc9c9012e in _dlerror_run (operate=operate@entry=0x7fffc9c905e0 <dlopen_doit>, args=args@entry=0x7fffffffa670) at ./dlfcn/dlerror.c:138
#12 0x00007fffc9c906c8 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at ./dlfcn/dlopen.c:71
#13 ___dlopen (file=<optimized out>, mode=<optimized out>) at ./dlfcn/dlopen.c:81
#14 0x00007fffc8678d75 in NEO::Linux::OsLibrary::OsLibrary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
   from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#15 0x00007fffc8678de5 in NEO::OsLibrary::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) ()
   from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#16 0x00007fffc84efb36 in bool NEO::loadCompiler<IGC::IgcOclDeviceCtx>(char const*, std::unique_ptr<NEO::OsLibrary, std::default_delete<NEO::OsLibrary> >&, std::unique_ptr<CIF::CIFMain, CIF::RAII::ReleaseHelper<CIF::CIFMain> >&) ()
   from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#17 0x00007fffc84ebfba in NEO::CompilerInterface::loadIgc() () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#18 0x00007fffc84ec154 in NEO::CompilerInterface::initialize(std::unique_ptr<NEO::CompilerCache, std::default_delete<NEO::CompilerCache> >&&, bool) ()
   from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#19 0x00007fffc85a34e7 in NEO::RootDeviceEnvironment::getCompilerInterface() () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#20 0x00007fffc859a531 in NEO::Device::getMaxParameterSizeFromIGC() const () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#21 0x00007fffc859ac51 in NEO::Device::initializeCaps() () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#22 0x00007fffc859831a in NEO::Device::createDeviceImpl() () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#23 0x00007fffc85caa1a in NEO::DeviceFactory::{lambda(NEO::ExecutionEnvironment&, unsigned int)#1}::_FUN(NEO::ExecutionEnvironment&, unsigned int) ()
   from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#24 0x00007fffc85cbc0e in NEO::DeviceFactory::createDevices(NEO::ExecutionEnvironment&) () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#25 0x00007fffc80f80e6 in L0::DriverImp::initialize(_ze_result_t*) () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#26 0x00007fffc80f7a0f in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<L0::DriverImp::driverInit(unsigned int)::{lambda()#1}>(std::once_flag&, L0::DriverImp::driverInit(unsigned int)::{lambda()#1}&&)::{lambda()#1}>(L0::DriverImp::driverInit(unsigned int)::{lambda()#1}&)::{lambda()#1}::_FUN() () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#27 0x00007fffc9c99ee8 in __pthread_once_slow (once_control=0x7fffc9b692c8 <L0::driverImp+8>, init_routine=0x7fffca0dad50 <__once_proxy>) at ./nptl/pthread_once.c:116
#28 0x00007fffc80f7c98 in L0::init(unsigned int) () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#29 0x00007fffd164fbc1 in _ze_result_t tracing_layer::APITracerWrapperImp<_ze_result_t (*)(unsigned int), _ze_init_params_t*, void (*)(_ze_init_params_t*, _ze_result_t, void*, void**), std::vector<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)>, std::allocator<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)> > >, std::vector<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)>, std::allocator<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)> > >, unsigned int&>(_ze_result_t (*)(unsigned int), _ze_init_params_t*, void (*)(_ze_init_params_t*, _ze_result_t, void*, void**), std::vector<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)>, std::allocator<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)> > >, std::vector<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)>, std::allocator<tracing_layer::APITracerCallbackStateImp<void (*)(_ze_init_params_t*, _ze_result_t, void*, void**)> > >, unsigned int&) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_tracing_layer.so.1
#30 0x00007fffd162287f in tracing_layer::zeInit(unsigned int) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_tracing_layer.so.1
#31 0x00007ffff5d217e5 in loader::context_t::init_driver(loader::driver_t, unsigned int) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#32 0x00007ffff5d20a57 in loader::context_t::check_drivers(unsigned int) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#33 0x00007ffff5d26afd in zelLoaderDriverCheck () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#34 0x00007ffff5d19e9f in ze_lib::context_t::Init(unsigned int, bool) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#35 0x00007ffff5d0b1ab in zeInit::{lambda()#1}::operator()() const () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#36 0x00007ffff5d11e7d in _ze_result_t std::__invoke_impl<_ze_result_t, zeInit::{lambda()#1}>(std::__invoke_other, zeInit::{lambda()#1}&&) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#37 0x00007ffff5d11d8c in std::__invoke_result<zeInit::{lambda()#1}>::type std::__invoke<zeInit::{lambda()#1}>(zeInit::{lambda()#1}&&) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#38 0x00007ffff5d11cad in std::call_once<zeInit::{lambda()#1}>(std::once_flag&, zeInit::{lambda()#1}&&)::{lambda()#1}::operator()() const () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#39 0x00007ffff5d11db5 in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<zeInit::{lambda()#1}>(std::once_flag&, zeInit::{lambda()#1}&&)::{lambda()#1}>(zeInit::{lambda()#1}&)::{lambda()#1}::operator()() const ()
   from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#40 0x00007ffff5d11dca in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<zeInit::{lambda()#1}>(std::once_flag&, zeInit::{lambda()#1}&&)::{lambda()#1}>(zeInit::{lambda()#1}&)::{lambda()#1}::_FUN() ()
   from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#41 0x00007fffc9c99ee8 in __pthread_once_slow (once_control=0x555556bc2480, init_routine=0x7fffca0dad50 <__once_proxy>) at ./nptl/pthread_once.c:116
#42 0x00007ffff5d0b176 in __gthread_once(int*, void (*)()) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#43 0x00007ffff5d11d05 in void std::call_once<zeInit::{lambda()#1}>(std::once_flag&, zeInit::{lambda()#1}&&) () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#44 0x00007ffff5d0b28e in zeInit () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#45 0x00007ffff6790cf5 in CHIPBackendLevel0::initializeImpl (this=0x555556bc6090) at /home/pvelesko/chipStar/src/backend/Level0/CHIPBackendLevel0.cc:1767
#46 0x00007ffff66e6f54 in chipstar::Backend::initialize (this=0x0) at /home/pvelesko/chipStar/src/CHIPBackend.cc:1239
#47 0x00007fffc9c99ee8 in __pthread_once_slow (once_control=0x7ffff67ff440 <Initialized>, init_routine=0x7fffca0dad50 <__once_proxy>) at ./nptl/pthread_once.c:116
#48 0x00007ffff66d770d in __gthread_once (__once=0x0, __func=0x1) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11/bits/gthr-default.h:700
#49 std::call_once<void (*)()> (__once=..., __f=@0x7fffffffc5d0: 0x7ffff66d73f0 <CHIPInitializeCallOnce()>) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/mutex:783
#50 CHIPInitialize () at /home/pvelesko/chipStar/src/CHIPDriver.cc:124
#51 0x00007ffff673a87e in __hipRegisterFatBinary (Data=0x55555555a0c8 <__hip_fatbin_wrapper>) at /home/pvelesko/chipStar/src/CHIPBindings.cc:4417
#52 0x00005555555557a7 in __hip_module_ctor ()
#53 0x00007fffc9c29ebb in call_init (env=<optimized out>, argv=0x7fffffffc828, argc=1) at ../csu/libc-start.c:145
#54 __libc_start_main_impl (main=0x5555555554c0 <main()>, argc=1, argv=0x7fffffffc828, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffc818) at ../csu/libc-start.c:379
#55 0x00005555555551d5 in _start ()
╭─pvelesko@meatloaf ~/chipStar/build ‹main●›
╰─$ ldd /home/pvelesko/chipStar/build/samples/hip_sycl_interop_no_buffers/hip_sycl_interop_no_buffers
    linux-vdso.so.1 (0x00007ffe8b7f2000)
    libsvml.so => /opt/intel/oneapi/compiler/2024.0/lib/libsvml.so (0x00007f560d000000)
    libintlc.so.5 => /opt/intel/oneapi/compiler/2024.0/lib/libintlc.so.5 (0x00007f560e63c000)
    libirng.so => /opt/intel/oneapi/compiler/2024.0/lib/libirng.so (0x00007f560cf06000)
    libimf.so => /opt/intel/oneapi/compiler/2024.0/lib/libimf.so (0x00007f560ca00000)
    libsycl.so.7 => /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7 (0x00007f560c600000)
    libonemkl_gemm_wrapper_no_buffers.so => /home/pvelesko/chipStar/build/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/libonemkl_gemm_wrapper_no_buffers.so (0x00007f560cefe000)
    libCHIP.so => /home/pvelesko/chipStar/build/libCHIP.so (0x00007f560c43f000)
    libze_loader.so.1 => /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1 (0x00007f560c322000)
    libOpenCL.so.1 => /space/pvelesko/install/ocl-icd-loder/lib/libOpenCL.so.1 (0x00007f560cee9000)
    libmkl_sycl_blas.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4 (0x00007f5606c00000)
    libmkl_sycl_lapack.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_lapack.so.4 (0x00007f5604400000)
    libmkl_sycl_sparse.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_sparse.so.4 (0x00007f55fde00000)
    libmkl_sycl_dft.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_dft.so.4 (0x00007f55fac00000)
    libmkl_sycl_vm.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_vm.so.4 (0x00007f55f0e00000)
    libmkl_sycl_rng.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_rng.so.4 (0x00007f55e9e00000)
    libmkl_sycl_stats.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_stats.so.4 (0x00007f55e7e00000)
    libmkl_sycl_data_fitting.so.4 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_data_fitting.so.4 (0x00007f55e7400000)
    libmkl_intel_ilp64.so.2 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_intel_ilp64.so.2 (0x00007f55e6200000)
    libmkl_sequential.so.2 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sequential.so.2 (0x00007f55e4c00000)
    libmkl_core.so.2 => /opt/intel/oneapi/mkl/2024.0/lib/libmkl_core.so.2 (0x00007f55e0a00000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f55e0600000)
    libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x00007f560c23b000)
    libgcc_s.so.1 => /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f560cec9000)
    libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x00007f55e0200000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f560e6a6000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f560ceb1000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f560ceac000)
╭─pvelesko@meatloaf ~/chipStar/build ‹main●›
╰─$  /home/pvelesko/chipStar/build/samples/hip_sycl_interop_no_buffers/hip_sycl_interop_no_buffers
[1]    985296 segmentation fault (core dumped)
╭─pvelesko@meatloaf ~/chipStar/build ‹main●›
╰─$                                                                                                                                                                                                                                            139 ↵
╭─pvelesko@meatloaf ~/chipStar/build ‹main●›
╰─$ ml                                                                                                                                                                                                                                         130 ↵
Currently Loaded Modulefiles:
 1) llvm/17.0-unpatched-spirv   3) intel-compute-runtime/igc/igc-1.0.15770.5   5) intel-compute-runtime/level-zero/23.43.27642.21   7) level-zero/dgpu   9) compiler-rt/2024.0.2  11) compiler/2024.0.2
 2) opencl/ocl-icd-loader       4) intel-compute-runtime/neo/23.43.27642.21    6) intel-compute-runtime/latest                      8) tbb/latest       10) oclfpga/2024.0.0      12) mkl/latest

Sarbojit2019 commented 8 months ago

@pvelesko, I don't see any interop-related calls in the call stack. Are you sure it is happening only with interop samples? I wonder whether this is an issue in another oneAPI component.

pvelesko commented 8 months ago

zeInit? @Sarbojit2019

Are you sure it is happening only with interop samples?

yes

Sarbojit2019 commented 8 months ago

zeInit is called for any sample running on the Level Zero backend; it is not interop-specific.

pvelesko commented 8 months ago

Yeah, that's true. I thought the call stack included hipInitFromNative. I'll double-check.
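One quick way to double-check a pasted backtrace is to count which shared libraries its frames come from and to search it for a given symbol such as hipInitFromNative. The sketch below is only an illustration: the helper names `libraries_in_backtrace` and `mentions_symbol` are hypothetical and not part of chipStar, and it assumes gdb's usual `from /path/to/lib.so` frame suffix.

```python
import re
from collections import Counter

# Matches the trailing "from /path/to/libfoo.so[.N]" part of a gdb frame.
# Assumption: frames follow gdb's default "bt" text format.
_FROM_RE = re.compile(r"\bfrom\s+(\S+\.so[\w.]*)")


def libraries_in_backtrace(bt_text):
    """Count frames per shared library named in a gdb backtrace (hypothetical helper)."""
    counts = Counter()
    for line in bt_text.splitlines():
        match = _FROM_RE.search(line)
        if match:
            # Keep only the file name, not the install prefix.
            counts[match.group(1).rsplit("/", 1)[-1]] += 1
    return counts


def mentions_symbol(bt_text, symbol):
    """Return True if the symbol name appears anywhere in the backtrace text."""
    return symbol in bt_text
```

Calling `mentions_symbol(bt, "hipInitFromNative")` on the first backtrace in this thread would show whether the interop entry point is on the stack, and `libraries_in_backtrace(bt)` would show whether any SYCL or MKL library is involved at all.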

pvelesko commented 8 months ago
Thread 1 "hip_sycl_intero" received signal SIGSEGV, Segmentation fault.
0x00007fffc884dda8 in typeinfo for L0::KernelImp () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
(gdb) bt
#0  0x00007fffc884dda8 in typeinfo for L0::KernelImp () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#1  0x00007fffc80cfc29 in L0::zeCommandListAppendMemoryCopy(_ze_command_list_handle_t*, void*, void const*, unsigned long, _ze_event_handle_t*, unsigned int, _ze_event_handle_t**) () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#2  0x00007ffff5d0cd03 in zeCommandListAppendMemoryCopy () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#3  0x00007fffc44b2bc4 in enqueueMemCopyHelper(ur_command_t, ur_queue_handle_t_*, void*, unsigned char, unsigned long, void const*, unsigned int, ur_event_handle_t_* const*, ur_event_handle_t_**, bool) () from /opt/intel/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#4  0x00007fffc44b3767 in urEnqueueMemBufferWrite () from /opt/intel/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#5  0x00007fffc44db09c in piEnqueueMemBufferWrite () from /opt/intel/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#6  0x00007ffff603c4e5 in sycl::_V1::detail::copyH2D(sycl::_V1::detail::SYCLMemObjI*, char*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, _pi_mem*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, std::vector<_pi_event*, std::allocator<_pi_event*> >, _pi_event*&, std::shared_ptr<sycl::_V1::detail::event_impl> const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#7  0x00007ffff603df82 in sycl::_V1::detail::MemoryManager::copy(sycl::_V1::detail::SYCLMemObjI*, void*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, void*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, std::vector<_pi_event*, std::allocator<_pi_event*> >, _pi_event*&, std::shared_ptr<sycl::_V1::detail::event_impl> const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#8  0x00007ffff60bbb91 in sycl::_V1::detail::MemCpyCommand::enqueueImp() () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#9  0x00007ffff60b2f00 in sycl::_V1::detail::Command::enqueue(sycl::_V1::detail::EnqueueResultT&, sycl::_V1::detail::BlockingT, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#10 0x00007ffff60dca14 in sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCommand(sycl::_V1::detail::Command*, std::shared_lock<std::shared_timed_mutex>&, sycl::_V1::detail::EnqueueResultT&, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::Command*, sycl::_V1::detail::BlockingT) ()
   from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#11 0x00007ffff60dc9f7 in sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCommand(sycl::_V1::detail::Command*, std::shared_lock<std::shared_timed_mutex>&, sycl::_V1::detail::EnqueueResultT&, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::Command*, sycl::_V1::detail::BlockingT) ()
   from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#12 0x00007ffff60d7c7c in sycl::_V1::detail::Scheduler::enqueueCommandForCG(std::shared_ptr<sycl::_V1::detail::event_impl>, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::BlockingT) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#13 0x00007ffff60d73f8 in sycl::_V1::detail::Scheduler::addCG(std::unique_ptr<sycl::_V1::detail::CG, std::default_delete<sycl::_V1::detail::CG> >, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, _pi_ext_command_buffer*, std::vector<unsigned int, std::allocator<unsigned int> > const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#14 0x00007ffff610fdb2 in sycl::_V1::handler::finalize() () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#15 0x00007ffff6099b51 in void sycl::_V1::detail::queue_impl::finalizeHandler<sycl::_V1::handler>(sycl::_V1::handler&, sycl::_V1::detail::CG::CGTYPE const&, sycl::_V1::event&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#16 0x00007ffff6099551 in sycl::_V1::detail::queue_impl::submit_impl(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#17 0x00007ffff613d106 in sycl::_V1::detail::queue_impl::submit(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#18 0x00007ffff613d0c5 in sycl::_V1::queue::submit_impl(std::function<void (sycl::_V1::handler&)>, sycl::_V1::detail::code_location const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#19 0x00007ffff0e42ee0 in oneapi::mkl::gpu::launch_kernel_3D(int*, sycl::_V1::queue*, mkl_gpu_kernel_struct_t*, mkl_gpu_argument_t*, unsigned long*, unsigned long*, mkl_gpu_event_list_t*) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#20 0x00007ffff0e38a11 in oneapi::mkl::gpu::have_binary_kernels(int*, sycl::_V1::queue*) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#21 0x00007ffff26f3aac in oneapi::mkl::gpu::mkl_blas_gpu_sgemm_driver_sycl(int*, sycl::_V1::queue*, oneapi::mkl::gpu::blas_arg_usm_t*, mkl_gpu_event_list_t*) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#22 0x00007ffff26de93b in oneapi::mkl::gpu::sgemm_sycl_internal(sycl::_V1::queue*, MKL_LAYOUT, MKL_TRANSPOSE, MKL_TRANSPOSE, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&, long, long, long) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#23 0x00007ffff26dcf5d in oneapi::mkl::gpu::sgemm_sycl(sycl::_V1::queue*, MKL_LAYOUT, MKL_TRANSPOSE, MKL_TRANSPOSE, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&, long, long, long) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#24 0x00007ffff311cd22 in oneapi::mkl::blas::sgemm(sycl::_V1::queue&, MKL_LAYOUT, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#25 0x00007ffff30a4f2f in oneapi::mkl::blas::column_major::gemm(sycl::_V1::queue&, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#26 0x00007ffff7e594b0 in oneapi::mkl::blas::column_major::gemm (queue=..., transa=oneapi::mkl::transpose::nontrans, transb=oneapi::mkl::transpose::nontrans, m=93825053422816, n=0, k=140736557538728, alpha=..., a=<optimized out>, lda=<optimized out>, b=<optimized out>, ldb=140736549682192, beta=..., c=<optimized out>, ldc=140736580394208,
    dependencies=std::vector of length 0, capacity 0) at /opt/intel/oneapi/mkl/2024.0/include/oneapi/mkl/blas/usm_decls.hpp:38
#27 onemkl_gemm (my_queue=..., A=<optimized out>, B=<optimized out>, C=<optimized out>, m=<optimized out>, n=<optimized out>, k=10, ldA=10, ldB=10, ldC=10, alpha=<optimized out>, beta=-1848220) at /home/pvelesko/chipStar/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/onemkl_gemm_wrapper.cpp:63
#28 0x00007ffff7e5a153 in oneMKLGemmTest (nativeHandlers=<optimized out>, hip_backend=<optimized out>, A=<optimized out>, B=<optimized out>, C=<optimized out>, M=<optimized out>, N=<optimized out>, K=<optimized out>, ldA=<optimized out>, ldB=<optimized out>, ldC=<optimized out>, alpha=<optimized out>, beta=<optimized out>)
    at /home/pvelesko/chipStar/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/onemkl_gemm_wrapper.cpp:122
#29 0x0000555555555716 in main () at /home/pvelesko/chipStar/samples/hip_sycl_interop_no_buffers/hip_sycl_interop.cpp:121

@Sarbojit2019

pvelesko commented 8 months ago
1) opencl/ocl-icd-loader   2) intel-compute-runtime/igc/igc-1.0.15770.5   3) intel-compute-runtime/neo/23.43.27642.21   4) intel-compute-runtime/level-zero/23.43.27642.21   5) intel-compute-runtime/latest   6) level-zero/dgpu   7) compiler-rt/2023.2.1   8) tbb/latest   9) mkl/2023.2.0  10) llvm/17.0-unpatched-spirv  11) oclfpga/2024.0.0  12) compiler/2024.0.2

Switching the compiler from 2023.2.1 to 2024.0.2 results in this segfault.
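To see exactly which modules differ between two environments, the two module lists can be compared mechanically. This is a minimal sketch with hypothetical helper names (`parse_module_list`, `diff_modules`). It assumes only the numbered entries are passed in, without the "Currently Loaded Modulefiles:" header, and it tolerates a missing leading index.

```python
import re


def parse_module_list(text):
    """Return the set of module names in `module list`-style text (hypothetical helper)."""
    # Drop the "N)" index tokens; every remaining token is treated as a module name.
    return {tok for tok in text.split() if not re.fullmatch(r"\d+\)", tok)}


def diff_modules(list_a, list_b):
    """Return (only_in_a, only_in_b) as sorted lists of module names."""
    a, b = parse_module_list(list_a), parse_module_list(list_b)
    return sorted(a - b), sorted(b - a)
```

Passing the two module lists posted in this thread to `diff_modules` would list only the entries whose versions differ, which narrows down the component to bisect.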

pvelesko commented 8 months ago

Here's the call stack with RCL:

Thread 1 "hip_sycl_intero" received signal SIGSEGV, Segmentation fault.
0x00007fffc884dda8 in typeinfo for L0::KernelImp () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
(gdb) bt
#0  0x00007fffc884dda8 in typeinfo for L0::KernelImp () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#1  0x00007fffc80cfc29 in L0::zeCommandListAppendMemoryCopy(_ze_command_list_handle_t*, void*, void const*, unsigned long, _ze_event_handle_t*, unsigned int, _ze_event_handle_t**) () from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#2  0x00007ffff5d0cd03 in zeCommandListAppendMemoryCopy () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#3  0x00007fffc44b2bc4 in enqueueMemCopyHelper(ur_command_t, ur_queue_handle_t_*, void*, unsigned char, unsigned long, void const*, unsigned int, ur_event_handle_t_* const*, ur_event_handle_t_**, bool) () from /opt/intel/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#4  0x00007fffc44b3767 in urEnqueueMemBufferWrite () from /opt/intel/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#5  0x00007fffc44db09c in piEnqueueMemBufferWrite () from /opt/intel/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#6  0x00007ffff603c4e5 in sycl::_V1::detail::copyH2D(sycl::_V1::detail::SYCLMemObjI*, char*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, _pi_mem*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, std::vector<_pi_event*, std::allocator<_pi_event*> >, _pi_event*&, std::shared_ptr<sycl::_V1::detail::event_impl> const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#7  0x00007ffff603df82 in sycl::_V1::detail::MemoryManager::copy(sycl::_V1::detail::SYCLMemObjI*, void*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, void*, std::shared_ptr<sycl::_V1::detail::queue_impl>, unsigned int, sycl::_V1::range<3>, sycl::_V1::range<3>, sycl::_V1::id<3>, unsigned int, std::vector<_pi_event*, std::allocator<_pi_event*> >, _pi_event*&, std::shared_ptr<sycl::_V1::detail::event_impl> const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#8  0x00007ffff60bbb91 in sycl::_V1::detail::MemCpyCommand::enqueueImp() () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#9  0x00007ffff60b2f00 in sycl::_V1::detail::Command::enqueue(sycl::_V1::detail::EnqueueResultT&, sycl::_V1::detail::BlockingT, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#10 0x00007ffff60dca14 in sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCommand(sycl::_V1::detail::Command*, std::shared_lock<std::shared_timed_mutex>&, sycl::_V1::detail::EnqueueResultT&, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::Command*, sycl::_V1::detail::BlockingT) ()
   from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#11 0x00007ffff60dc9f7 in sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCommand(sycl::_V1::detail::Command*, std::shared_lock<std::shared_timed_mutex>&, sycl::_V1::detail::EnqueueResultT&, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::Command*, sycl::_V1::detail::BlockingT) ()
   from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#12 0x00007ffff60d7c7c in sycl::_V1::detail::Scheduler::enqueueCommandForCG(std::shared_ptr<sycl::_V1::detail::event_impl>, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::BlockingT) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#13 0x00007ffff60d73f8 in sycl::_V1::detail::Scheduler::addCG(std::unique_ptr<sycl::_V1::detail::CG, std::default_delete<sycl::_V1::detail::CG> >, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, _pi_ext_command_buffer*, std::vector<unsigned int, std::allocator<unsigned int> > const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#14 0x00007ffff610fdb2 in sycl::_V1::handler::finalize() () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#15 0x00007ffff6099b51 in void sycl::_V1::detail::queue_impl::finalizeHandler<sycl::_V1::handler>(sycl::_V1::handler&, sycl::_V1::detail::CG::CGTYPE const&, sycl::_V1::event&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#16 0x00007ffff6099551 in sycl::_V1::detail::queue_impl::submit_impl(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#17 0x00007ffff613d106 in sycl::_V1::detail::queue_impl::submit(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#18 0x00007ffff613d0c5 in sycl::_V1::queue::submit_impl(std::function<void (sycl::_V1::handler&)>, sycl::_V1::detail::code_location const&) () from /opt/intel/oneapi/compiler/2024.0/lib/libsycl.so.7
#19 0x00007ffff0e42ee0 in oneapi::mkl::gpu::launch_kernel_3D(int*, sycl::_V1::queue*, mkl_gpu_kernel_struct_t*, mkl_gpu_argument_t*, unsigned long*, unsigned long*, mkl_gpu_event_list_t*) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#20 0x00007ffff0e38a11 in oneapi::mkl::gpu::have_binary_kernels(int*, sycl::_V1::queue*) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#21 0x00007ffff26f3aac in oneapi::mkl::gpu::mkl_blas_gpu_sgemm_driver_sycl(int*, sycl::_V1::queue*, oneapi::mkl::gpu::blas_arg_usm_t*, mkl_gpu_event_list_t*) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#22 0x00007ffff26de93b in oneapi::mkl::gpu::sgemm_sycl_internal(sycl::_V1::queue*, MKL_LAYOUT, MKL_TRANSPOSE, MKL_TRANSPOSE, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&, long, long, long) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#23 0x00007ffff26dcf5d in oneapi::mkl::gpu::sgemm_sycl(sycl::_V1::queue*, MKL_LAYOUT, MKL_TRANSPOSE, MKL_TRANSPOSE, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&, long, long, long) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#24 0x00007ffff311cd22 in oneapi::mkl::blas::sgemm(sycl::_V1::queue&, MKL_LAYOUT, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#25 0x00007ffff30a4f2f in oneapi::mkl::blas::column_major::gemm(sycl::_V1::queue&, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&) () from /opt/intel/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#26 0x00007ffff7e594b0 in oneapi::mkl::blas::column_major::gemm (queue=..., transa=oneapi::mkl::transpose::nontrans, transb=oneapi::mkl::transpose::nontrans, m=93825053557584, n=0, k=140736557538728, alpha=..., a=<optimized out>, lda=<optimized out>, b=<optimized out>, ldb=140736549682192, beta=..., c=<optimized out>, ldc=140736580394208,
    dependencies=std::vector of length 0, capacity 0) at /opt/intel/oneapi/mkl/2024.0/include/oneapi/mkl/blas/usm_decls.hpp:38
#27 onemkl_gemm (my_queue=..., A=<optimized out>, B=<optimized out>, C=<optimized out>, m=<optimized out>, n=<optimized out>, k=10, ldA=10, ldB=10, ldC=10, alpha=<optimized out>, beta=-1848220) at /home/pvelesko/chipStar/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/onemkl_gemm_wrapper.cpp:63
#28 0x00007ffff7e5a153 in oneMKLGemmTest (nativeHandlers=<optimized out>, hip_backend=<optimized out>, A=<optimized out>, B=<optimized out>, C=<optimized out>, M=<optimized out>, N=<optimized out>, K=<optimized out>, ldA=<optimized out>, ldB=<optimized out>, ldC=<optimized out>, alpha=<optimized out>, beta=<optimized out>)
    at /home/pvelesko/chipStar/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/onemkl_gemm_wrapper.cpp:122
#29 0x0000555555555716 in main () at /home/pvelesko/chipStar/samples/hip_sycl_interop_no_buffers/hip_sycl_interop.cpp:121
pvelesko commented 8 months ago

With Level Zero on the iGPU, MKL 2024 aborts with "pure virtual method called". @Sarbojit2019

dgpu_opencl_make_check_result.txt: PASS
igpu_opencl_make_check_result.txt: PASS
igpu_level0_reg_make_check_result.txt: FAIL
    499 - hip_sycl_interop (Subprocess aborted)
    500 - hip_sycl_interop_no_buffers (Subprocess aborted)
dgpu_level0_reg_make_check_result.txt: FAIL
    499 - hip_sycl_interop (SEGFAULT)
    500 - hip_sycl_interop_no_buffers (SEGFAULT)
dgpu_level0_imm_make_check_result.txt: FAIL
    500 - hip_sycl_interop (SEGFAULT)
    501 - hip_sycl_interop_no_buffers (SEGFAULT)
Starting program: /space/pvelesko/chipStar/test-mkl/build/samples/hip_sycl_interop_no_buffers/hip_sycl_interop_no_buffers
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffc8c73640 (LWP 3797765)]
[New Thread 0x7fffb88c0640 (LWP 3797766)]
[New Thread 0x7fffb3fff640 (LWP 3797767)]
[Thread 0x7fffb88c0640 (LWP 3797766) exited]
[Thread 0x7fffb3fff640 (LWP 3797767) exited]
[New Thread 0x7fffb37fe640 (LWP 3797768)]
[New Thread 0x7fffb2ffd640 (LWP 3797769)]
[New Thread 0x7fffb3fff640 (LWP 3797770)]
[Detaching after fork from child process 3797771]
[Detaching after fork from child process 3797772]
[Detaching after fork from child process 3797773]
[Detaching after fork from child process 3797774]
[Detaching after fork from child process 3797775]
[Detaching after fork from child process 3797776]
[Detaching after fork from child process 3797777]
[Detaching after fork from child process 3797778]
[Detaching after fork from child process 3797779]
pure virtual method called
terminate called without an active exception

Thread 1 "hip_sycl_intero" received signal SIGABRT, Aborted.
0x00007fffcacb49fc in pthread_kill () from /usr/lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007fffcacb49fc in pthread_kill () from /usr/lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fffcac60476 in raise () from /usr/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fffcac467f3 in abort () from /usr/lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fffcaff0b9e in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fffcaffc20c in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007fffcaffc277 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fffcaffcfa5 in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fffc8ff64a7 in L0::zeCommandListAppendLaunchKernel(_ze_command_list_handle_t*, _ze_kernel_handle_t*, _ze_group_count_t const*, _ze_event_handle_t*, unsigned int, _ze_event_handle_t**) ()
   from /space/pvelesko/install/intel-compute-runtime//neo/23.43.27642.21/lib/libze_intel_gpu.so.1
#8  0x00007ffff5e01c22 in zeCommandListAppendLaunchKernel () from /space/pvelesko/install/intel-compute-runtime//level-zero/23.43.27642.21/lib/libze_loader.so.1
#9  0x00007fffb2355c44 in urEnqueueKernelLaunch () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#10 0x00007fffb23816f0 in piEnqueueKernelLaunch () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libpi_level_zero.so
#11 0x00007ffff638d891 in _pi_result sycl::_V1::detail::plugin::call_nocheck<(sycl::_V1::detail::PiApiKind)76, _pi_queue*, _pi_kernel*, unsigned long, unsigned long*, unsigned long*, unsigned long*, unsigned long, _pi_event**, _pi_event**>(_pi_queue*, _pi_kernel*, unsigned long, unsigned long*, unsigned long*, unsigned long*, unsigned long, _pi_event**, _pi_event**) const () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#12 0x00007ffff6384184 in sycl::_V1::detail::enqueueImpKernel(std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::NDRDescT&, std::vector<sycl::_V1::detail::ArgDesc, std::allocator<sycl::_V1::detail::ArgDesc> >&, std::shared_ptr<sycl::_V1::detail::kernel_bundle_impl> const&, std::shared_ptr<sycl::_V1::detail::kernel_impl> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<_pi_event*, std::allocator<_pi_event*> >&, std::shared_ptr<sycl::_V1::detail::event_impl> const&, std::function<void* (sycl::_V1::detail::AccessorImplHost*)> const&, _pi_kernel_cache_config) () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#13 0x00007ffff6389a02 in sycl::_V1::detail::ExecCGCommand::enqueueImpQueue() () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#14 0x00007ffff6370f00 in sycl::_V1::detail::Command::enqueue(sycl::_V1::detail::EnqueueResultT&, sycl::_V1::detail::BlockingT, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&) ()
   from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#15 0x00007ffff639aa14 in sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCommand(sycl::_V1::detail::Command*, std::shared_lock<std::shared_timed_mutex>&, sycl::_V1::detail::EnqueueResultT&, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::Command*, sycl::_V1::detail::BlockingT) () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#16 0x00007ffff6395c7c in sycl::_V1::detail::Scheduler::enqueueCommandForCG(std::shared_ptr<sycl::_V1::detail::event_impl>, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::BlockingT)
    () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#17 0x00007ffff63953f8 in sycl::_V1::detail::Scheduler::addCG(std::unique_ptr<sycl::_V1::detail::CG, std::default_delete<sycl::_V1::detail::CG> >, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, _pi_ext_command_buffer*, std::vector<unsigned int, std::allocator<unsigned int> > const&) () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#18 0x00007ffff63cddb2 in sycl::_V1::handler::finalize() () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#19 0x00007ffff6357b51 in void sycl::_V1::detail::queue_impl::finalizeHandler<sycl::_V1::handler>(sycl::_V1::handler&, sycl::_V1::detail::CG::CGTYPE const&, sycl::_V1::event&) ()
   from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#20 0x00007ffff6357551 in sycl::_V1::detail::queue_impl::submit_impl(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#21 0x00007ffff63fb106 in sycl::_V1::detail::queue_impl::submit(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#22 0x00007ffff63fb0c5 in sycl::_V1::queue::submit_impl(std::function<void (sycl::_V1::handler&)>, sycl::_V1::detail::code_location const&) () from /space/pvelesko/install/oneapi/compiler/2024.0/lib/libsycl.so.7
#23 0x00007ffff10b9ee0 in oneapi::mkl::gpu::launch_kernel_3D(int*, sycl::_V1::queue*, mkl_gpu_kernel_struct_t*, mkl_gpu_argument_t*, unsigned long*, unsigned long*, mkl_gpu_event_list_t*) ()
   from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#24 0x00007ffff10afa11 in oneapi::mkl::gpu::have_binary_kernels(int*, sycl::_V1::queue*) () from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#25 0x00007ffff296aaac in oneapi::mkl::gpu::mkl_blas_gpu_sgemm_driver_sycl(int*, sycl::_V1::queue*, oneapi::mkl::gpu::blas_arg_usm_t*, mkl_gpu_event_list_t*) () from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#26 0x00007ffff295593b in oneapi::mkl::gpu::sgemm_sycl_internal(sycl::_V1::queue*, MKL_LAYOUT, MKL_TRANSPOSE, MKL_TRANSPOSE, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&, long, long, long) () from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#27 0x00007ffff2953f5d in oneapi::mkl::gpu::sgemm_sycl(sycl::_V1::queue*, MKL_LAYOUT, MKL_TRANSPOSE, MKL_TRANSPOSE, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&, long, long, long) () from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#28 0x00007ffff3393d22 in oneapi::mkl::blas::sgemm(sycl::_V1::queue&, MKL_LAYOUT, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&) () from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#29 0x00007ffff331bf2f in oneapi::mkl::blas::column_major::gemm(sycl::_V1::queue&, oneapi::mkl::transpose, oneapi::mkl::transpose, long, long, long, oneapi::mkl::value_or_pointer<float>, float const*, long, float const*, long, oneapi::mkl::value_or_pointer<float>, float*, long, oneapi::mkl::blas::compute_mode, std::vector<sycl::_V1::event, std::allocator<sycl::_V1::event> > const&) () from /space/pvelesko/install/oneapi/mkl/2024.0/lib/libmkl_sycl_blas.so.4
#30 0x00007ffff60b94e1 in oneapi::mkl::blas::column_major::gemm (queue=..., transa=oneapi::mkl::transpose::nontrans, transb=oneapi::mkl::transpose::nontrans, m=140737488299904, m@entry=10, n=0, k=0, alpha=..., a=<optimized out>,
    lda=<optimized out>, b=<optimized out>, ldb=8, beta=..., c=<optimized out>, ldc=582, dependencies=...) at /home/pvelesko/space/install/oneapi/mkl/2024.0/include/oneapi/mkl/blas/usm_decls.hpp:38
#31 onemkl_gemm (my_queue=..., A=<optimized out>, B=<optimized out>, C=C@entry=0xffffd556aa7c0000, m=m@entry=10, n=<optimized out>, k=10, ldA=10, ldB=10, ldC=10, alpha=<optimized out>, beta=-7457682)
    at /home/pvelesko/space/chipStar/test-mkl/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/onemkl_gemm_wrapper.cpp:63
#32 0x00007ffff60ba266 in oneMKLGemmTest (nativeHandlers=<optimized out>, hip_backend=<optimized out>, A=<optimized out>, B=<optimized out>, C=<optimized out>, M=<optimized out>, N=<optimized out>, K=<optimized out>, ldA=<optimized out>,
    ldB=<optimized out>, ldC=<optimized out>, alpha=<optimized out>, beta=<optimized out>) at /home/pvelesko/space/chipStar/test-mkl/samples/hip_sycl_interop_no_buffers/onemkl_gemm_wrapper_no_buffers/onemkl_gemm_wrapper.cpp:132
#33 0x0000555555556716 in main () at /home/pvelesko/space/chipStar/test-mkl/samples/hip_sycl_interop_no_buffers/hip_sycl_interop.cpp:121
pvelesko commented 7 months ago

Fixed by setting isImmCmdList = false.
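
The thread does not say where isImmCmdList lives or why changing it helps. One plausible reading, given that the abort happens inside zeCommandListAppendLaunchKernel, is that the flag tells the SYCL side whether the native handle passed through interop is an immediate command list or a regular command queue, and that its value did not match the handle actually passed. The sketch below is plain C++ with hypothetical types and function names (no Level Zero or SYCL calls); it only illustrates deriving such a flag from the handle's type instead of setting it by hand.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for the two kinds of native handle.
// These are NOT the real Level Zero handle types.
struct CommandQueueHandle { int id; };
struct ImmediateCommandListHandle { int id; };

// Keeping the kind inside the type means the flag cannot drift out of
// sync with the handle that is actually passed across the interop boundary.
using NativeQueueHandle =
    std::variant<CommandQueueHandle, ImmediateCommandListHandle>;

// Hypothetical helper: the value an isImmCmdList-style flag should take.
bool isImmediateCommandList(const NativeQueueHandle& handle) {
  return std::holds_alternative<ImmediateCommandListHandle>(handle);
}

// Hypothetical helper: which submission path a runtime would pick.
std::string submissionPath(const NativeQueueHandle& handle) {
  return isImmediateCommandList(handle)
             ? "append directly to the immediate command list"
             : "record a command list and execute it on the queue";
}
```

With a regular command queue handle, isImmediateCommandList returns false, which matches the value that resolved this issue.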