intel / intel-graphics-compiler

Undefined Behavior in a call to TranslateImpl() #326

Closed PatKamin closed 4 months ago

PatKamin commented 4 months ago

I have a simple app that builds a program from a kernel compiled for the spir target triple and then translated to SPIR-V. Here are the commands I used to compile the kernel:

clang -cc1 -triple spir simple.cl -O2 -finclude-default-header -emit-llvm-bc -o simple.out
llvm-spirv simple.out -o simple.spv
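
Before handing simple.spv to the runtime, it can help to confirm that the second command really produced a SPIR-V binary and not, say, leftover LLVM bitcode. The sketch below is only an illustration: looksLikeSpirv is a hypothetical helper, not part of IGC, NEO, or Level Zero. It checks only the first 32-bit word against the SPIR-V magic number 0x07230203, so a passing result does not mean the module is valid.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper for illustration; not an IGC, NEO, or Level Zero API.
// Returns true if the file starts with the SPIR-V magic number 0x07230203,
// stored in either byte order. It does not validate the rest of the module.
bool looksLikeSpirv(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false; // file missing or unreadable
    }

    unsigned char bytes[4] = {0, 0, 0, 0};
    if (!in.read(reinterpret_cast<char *>(bytes), sizeof(bytes))) {
        return false; // shorter than one 32-bit word
    }

    // Assemble the first word both ways so the check does not depend on
    // the endianness of the machine that wrote the file.
    const std::uint32_t b0 = bytes[0], b1 = bytes[1], b2 = bytes[2], b3 = bytes[3];
    const std::uint32_t littleEndian = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
    const std::uint32_t bigEndian = b3 | (b2 << 8) | (b1 << 16) | (b0 << 24);

    constexpr std::uint32_t kSpirvMagic = 0x07230203u;
    return littleEndian == kSpirvMagic || bigEndian == kSpirvMagic;
}
```

Calling looksLikeSpirv("simple.spv") right after the llvm-spirv step would rule out a malformed input file before looking at the compiler or runtime.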

The app is built with clang with the Undefined Behavior sanitizer (UBSan) enabled. At run time I get the following error when creating a module:

/usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:165:46: runtime error: member access within address 0x60600017fd20 which does not point to an object of type 'std::_Sp_counted_base<__gnu_cxx::_S_atomic>'
0x60600017fd20: note: object has invalid vptr
 00 00 00 00  58 b3 31 42 9b 76 00 00  03 00 00 00 01 00 00 00  b8 b1 31 42 9b 76 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr
    #0 0x769b4bbb440c in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:165:46
    #1 0x769b3eecd6cf  (/usr/local/lib/libigc.so.1+0xe9d6cf) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #2 0x769b3eed0414  (/usr/local/lib/libigc.so.1+0xea0414) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #3 0x769b3ee540cb  (/usr/local/lib/libigc.so.1+0xe240cb) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #4 0x769b3faa3c00 in llvm::FPPassManager::runOnFunction(llvm::Function&) (/usr/local/lib/libigc.so.1+0x1a73c00) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #5 0x769b3faa3f28 in llvm::FPPassManager::runOnModule(llvm::Module&) (/usr/local/lib/libigc.so.1+0x1a73f28) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #6 0x769b3faa5645 in llvm::legacy::PassManagerImpl::run(llvm::Module&) (/usr/local/lib/libigc.so.1+0x1a75645) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #7 0x769b3eed38ef  (/usr/local/lib/libigc.so.1+0xea38ef) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #8 0x769b3eb0b634  (/usr/local/lib/libigc.so.1+0xadb634) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #9 0x769b3ed98cfa  (/usr/local/lib/libigc.so.1+0xd68cfa) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #10 0x769b3eb0d3b2  (/usr/local/lib/libigc.so.1+0xadd3b2) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #11 0x769b3eb9d48e  (/usr/local/lib/libigc.so.1+0xb6d48e) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #12 0x769b3eba00cb  (/usr/local/lib/libigc.so.1+0xb700cb) (BuildId: 3ca931aa708a97ea41945066820c8c98e6cd005f)
    #13 0x769b4588f549 in std::unique_ptr<IGC::OclTranslationOutput<1ul>, CIF::RAII::ReleaseHelper<IGC::OclTranslationOutput<1ul> > > IGC::IgcOclTranslationCtx<3ul>::Translate<IGC::OclTranslationOutput<1ul> >(CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, unsigned int, void*) /usr/local/include/igc/ocl_igc_interface/igc_ocl_translation_ctx.h:93:29
    #14 0x769b4588f549 in std::unique_ptr<IGC::OclTranslationOutput<1ul>, CIF::RAII::ReleaseHelper<IGC::OclTranslationOutput<1ul> > > NEO::translate<IGC::IgcOclTranslationCtx<3ul> >(IGC::IgcOclTranslationCtx<3ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, CIF::Builtins::Buffer<1ul>*, void*) neo/shared/source/compiler_interface/compiler_interface.inl:75:10
    #15 0x769b4588f549 in NEO::CompilerInterface::build(NEO::Device const&, NEO::TranslationInput const&, NEO::TranslationOutput&) neo/shared/source/compiler_interface/compiler_interface.cpp:153:92
    #16 0x769b45518cf5 in L0::ModuleTranslationUnit::compileGenBinary(NEO::TranslationInput&, bool) neo/level_zero/core/source/module/module_imp.cpp:207:47
    #17 0x769b4551b6e1 in L0::ModuleTranslationUnit::buildFromSpirV(char const*, unsigned int, char const*, char const*, _ze_module_constants_t const*) neo/level_zero/core/source/module/module_imp.cpp:290:34
    #18 0x769b45520cab in L0::ModuleImp::initializeTranslationUnit(_ze_module_desc_t const*, NEO::Device*) neo/level_zero/core/source/module/module_imp.cpp:734:57
    #19 0x769b45520cab in L0::ModuleImp::initialize(_ze_module_desc_t const*, NEO::Device*) neo/level_zero/core/source/module/module_imp.cpp:536:57
    #20 0x769b45521abd in L0::Module::create(L0::Device*, _ze_module_desc_t const*, L0::ModuleBuildLog*, L0::ModuleType, _ze_result_t*) neo/level_zero/core/source/module/module_imp.cpp:1241:33
    #21 0x769b4546534d in L0::DeviceImp::createModule(_ze_module_desc_t const*, _ze_module_handle_t**, _ze_module_build_log_handle_t**, L0::ModuleType) neo/level_zero/core/source/device/device_imp.cpp:487:36
    #22 0x769b47494a73 in zeModuleCreate unified-runtime/build/_deps/level-zero-loader-src/source/lib/ze_libapi.cpp:5034:12
    #23 0x769b4824cbef in urProgramBuildExp unified-runtime/source/adapters/level_zero/program.cpp:195:9
    #24 0x769b48249e9d in urProgramBuild unified-runtime/source/adapters/level_zero/program.cpp:116:10

Intel Graphics Compiler: igc-1.0.16238.4
Compute Runtime: 24.09.28717.12 (with all dependencies having versions as described here)
GPU: Intel Arc A750
How often does it reproduce? 100% of the time

If you need more information, please let me know.

pszymich commented 4 months ago

Hi @PatKamin, I'm not 100% sure what you are trying to do. There is the standalone simple.cl kernel which you compile with clang, and some SYCL-looking workload. Can you share a minimal reproducer?

PatKamin commented 4 months ago

Hi, I'm closing this issue: while preparing a minimal reproducer, I found that the bug is in another repo, not igc.