llvm / llvm-project


Clang crashes when building IPEX code. #75428

Open xuhancn opened 11 months ago

xuhancn commented 11 months ago

Build command:

cd /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu && /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations 
-Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format -Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp

The error message:

Stack dump:
0.      Program arguments: /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
1.      <eof> parser at end of file
2.      Per-function optimization
3.      Running pass 'Early CSE' on function '@.omp_outlined.'
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x1f)[0x7fa30e5db4ff]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x50)[0x7fa30e5d97b0]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm3sys15CleanupOnSignalEm+0xdd)[0x7fa30e5dac4d]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x8d6e60)[0x7fa30e530e60]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fa314daa420]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x173ee23)[0x7fa30f398e23]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm19SimplifyInstructionEPNS_11InstructionERKNS_13SimplifyQueryEPNS_25OptimizationRemarkEmitterE+0x819)[0x7fa30f3a3d09]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x13700b2)[0x7fa30efca0b2]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(+0x13759e4)[0x7fa30efcf9e4]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE+0x466)[0x7fa30e6e0d76]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm6legacy23FunctionPassManagerImpl3runERNS_8FunctionE+0x4e)[0x7fa30e6e049e]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm6legacy19FunctionPassManager3runERNS_8FunctionE+0x156)[0x7fa30e6e0436]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang17EmitBackendOutputERNS_17DiagnosticsEngineERKNS_19HeaderSearchOptionsERKNS_14CodeGenOptionsERKNS_13TargetOptionsERKNS_11LangOptionsERKN4llvm10DataLayoutEPNSE_6ModuleENS_13BackendActionESt10unique_ptrINSE_17raw_pwrite_streamESt14default_deleteISM_EE+0x305b)[0x7fa3136d631b]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(+0x1667e1c)[0x7fa313955e1c]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang8ParseASTERNS_4SemaEbb+0x283)[0x7fa312b43c13]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang14FrontendAction7ExecuteEv+0x48)[0x7fa313fb9e58]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang16CompilerInstance13ExecuteActionERNS_14FrontendActionE+0x621)[0x7fa313f728a1]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang25ExecuteCompilerInvocationEPNS_16CompilerInstanceE+0x66f)[0x7fa31401ddaf]
/usr/bin/clang++(_Z8cc1_mainN4llvm8ArrayRefIPKcEES2_Pv+0x98d)[0x41229d]
/usr/bin/clang++[0x4105b1]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(+0x19d58f2)[0x7fa313cc38f2]
/lib/x86_64-linux-gnu/libLLVM-10.so.1(_ZN4llvm20CrashRecoveryContext9RunSafelyENS_12function_refIFvvEEE+0xd7)[0x7fa30e530c67]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver10CC1Command7ExecuteEN4llvm8ArrayRefINS2_8OptionalINS2_9StringRefEEEEEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPb+0x13f)[0x7fa313cc2e2f]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver11Compilation14ExecuteCommandERKNS0_7CommandERPS3_+0x2df)[0x7fa313c9b52f]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZNK5clang6driver11Compilation11ExecuteJobsERKNS0_7JobListERN4llvm15SmallVectorImplISt4pairIiPKNS0_7CommandEEEE+0x7a)[0x7fa313c9b6da]
/lib/x86_64-linux-gnu/libclang-cpp.so.10(_ZN5clang6driver6Driver18ExecuteCompilationERNS0_11CompilationERN4llvm15SmallVectorImplISt4pairIiPKNS0_7CommandEEEE+0xdc)[0x7fa313cae93c]
/usr/bin/clang++(main+0x259f)[0x41002f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fa30d73e083]
/usr/bin/clang++(_start+0x2e)[0x40d7ce]
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/fused_bert-089380.cpp
clang: note: diagnostic msg: /tmp/fused_bert-089380.sh
clang: note: diagnostic msg:

********************

Original code: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp (it builds successfully with GCC). The diagnostic message files are also attached: fused_bert-089380.zip
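A note on the `@.omp_outlined.` name in the stack dump: when `-fopenmp` is enabled, Clang moves the body of an OpenMP parallel region into a separate, compiler-generated function, and the optimizer (here the Early CSE pass) then runs on that function like any other. The sketch below is a hypothetical illustration of such a region. It is not taken from fused_bert.cpp, `sum_of_squares` is an invented name, and it is not claimed to reproduce the crash.

```cpp
#include <cassert>

// Hypothetical example, not from the IPEX sources.
// With -fopenmp, Clang outlines the loop body below into a separate
// compiler-generated function. Without -fopenmp the pragma is ignored
// and the loop simply runs serially, giving the same result.
long sum_of_squares(int n) {
  long total = 0;
#pragma omp parallel for reduction(+ : total)
  for (int i = 0; i < n; ++i) {
    total += static_cast<long>(i) * i;
  }
  return total;
}
```

Compiling a file like this with `-fopenmp -O2 -S -emit-llvm` should show the outlined function in the emitted IR, although the exact symbol name differs between Clang versions.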

asl commented 11 months ago

LLVM 10 is ancient. Does the problem reproduce with latest LLVM?

xuhancn commented 11 months ago

> LLVM 10 is ancient. Does the problem reproduce with latest LLVM?

Sure, I will try it. Please give me a while.

xuhancn commented 11 months ago

Crash 1 on Clang-18, File: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/optim.cpp

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /usr/local/bin/clang-18 -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_FP16_CPU_DEFINITION -DHAVE_AMX_CPU_DEFINITION -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_VNNI_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/optim.cpp
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x0000556ce2f9752f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-18+0x372552f)
 #1 0x0000556ce2f9557c llvm::sys::CleanupOnSignal(unsigned long) (/usr/local/bin/clang-18+0x372357c)
 #2 0x0000556ce2edc2c8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007f3eff376420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x0000556ce20693f1 simplifyMulInst(llvm::Value*, llvm::Value*, bool, bool, llvm::SimplifyQuery const&, unsigned int) (.constprop.0) InstructionSimplify.cpp:0:0
 #5 0x0000556ce2061741 simplifyInstructionWithOperands(llvm::Instruction*, llvm::ArrayRef<llvm::Value*>, llvm::SimplifyQuery const&, unsigned int) InstructionSimplify.cpp:0:0
 #6 0x0000556ce206b757 llvm::simplifyInstruction(llvm::Instruction*, llvm::SimplifyQuery const&) (/usr/local/bin/clang-18+0x27f9757)
 #7 0x0000556ce2d7e08c (anonymous namespace)::EarlyCSE::processNode(llvm::DomTreeNodeBase<llvm::BasicBlock>*) EarlyCSE.cpp:0:0
 #8 0x0000556ce2d805ed (anonymous namespace)::EarlyCSE::run() EarlyCSE.cpp:0:0
 #9 0x0000556ce2d821f6 llvm::EarlyCSEPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x35101f6)
#10 0x0000556ce31f1446 llvm::detail::PassModel<llvm::Function, llvm::EarlyCSEPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x397f446)
#11 0x0000556ce0aa0c04 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x122ec04)
#12 0x0000556ce299a27e llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x312827e)
#13 0x0000556ce0a94ee6 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x1222ee6)
#14 0x0000556ce2996d20 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x3124d20)
#15 0x0000556ce3201bb7 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&, clang::BackendConsumer*) BackendUtil.cpp:0:0
#16 0x0000556ce3204ec4 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/usr/local/bin/clang-18+0x3992ec4)
#17 0x0000556ce37d4d5e clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-18+0x3f62d5e)
#18 0x0000556ce52becf9 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-18+0x5a4ccf9)
#19 0x0000556ce37d4135 clang::CodeGenAction::ExecuteAction() (/usr/local/bin/clang-18+0x3f62135)
#20 0x0000556ce3a644c1 clang::FrontendAction::Execute() (/usr/local/bin/clang-18+0x41f24c1)
#21 0x0000556ce39deaeb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-18+0x416caeb)
#22 0x0000556ce3b42b5b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-18+0x42d0b5b)
#23 0x0000556ce073c79d cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-18+0xeca79d)
#24 0x0000556ce07350ad ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#25 0x0000556ce381cd3d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#26 0x0000556ce2edc747 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang-18+0x366a747)
#27 0x0000556ce381d1dc clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#28 0x0000556ce37e3d8e clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang-18+0x3f71d8e)
#29 0x0000556ce37e475d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang-18+0x3f7275d)
#30 0x0000556ce37eebdc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang-18+0x3f7cbdc)
#31 0x0000556ce0739ac1 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang-18+0xec7ac1)
#32 0x0000556ce06411b5 main (/usr/local/bin/clang-18+0xdcf1b5)
#33 0x00007f3efed7b083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#34 0x0000556ce073486e _start (/usr/local/bin/clang-18+0xec286e)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 18.0.0git (https://github.com/llvm/llvm-project.git a2691e363232c011fdaace9fcc094f3cd210f78b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
5 warnings generated.
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^~~~~~~~~~~~~~~~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/tensor_helper.h:57:18: note: in instantiation of member function 'torch_ipex::tpp::XformExtTPP<c10::BFloat16>::operator()' requested here
   57 |     trans_n2v_tpp(in[n], out[n]);
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^~~~~~~~~~~~~~~~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/tensor_helper.h:203:16: note: in instantiation of member function 'torch_ipex::tpp::XformExtTPP<float>::operator()' requested here
  203 |       trans_tpp(in[n], out[n]);
      |                ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
 1631 |         T tmp[in_rows_p * in_cols_p];
      |               ^
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   63 |       Tout tmp_C[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:187:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<float, float>::operator()' requested here
  187 |             qkv_gemm_tpp(HS[s1][bn], Wq_V[nk][bn], QL[s1][nk], BN, true);
      |                         ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
   63 |       Tout tmp_C[M * N];
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   72 |         Tout tmp[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  834 |       float tmp[cols];
      |                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:318:29: note: in instantiation of member function 'torch_ipex::tpp::AddBiasTPP<float>::operator()' requested here
  318 |                 add_mask_tpp(AM[s21], AS[ls21][0]);
      |                             ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
  834 |       float tmp[cols];
      |                 ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:320:28: note: in instantiation of member function 'torch_ipex::tpp::VarSoftMaxFwdTPP<float, float>::operator()' requested here
  320 |             softmax_fwd_tpp(len, AS[0][0], AP[n][ss1]);
      |                            ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: note: function parameter 'S1' with unknown value cannot be used in a constant expression
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3189:23: note: declared here
 3189 |   void operator()(int S1, Tin* in, Tout* out) {
      |                       ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3201 |         LIBXSMM_ALIGNED(float tmp2[S3], 64);
      |                                    ^~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   63 |       Tout tmp_C[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:187:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<c10::BFloat16, c10::BFloat16>::operator()' requested here
  187 |             qkv_gemm_tpp(HS[s1][bn], Wq_V[nk][bn], QL[s1][nk], BN, true);
      |                         ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
   63 |       Tout tmp_C[M * N];
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   72 |         Tout tmp[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   63 |       Tout tmp_C[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:315:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<c10::BFloat16, float>::operator()' requested here
  315 |               a_gemm_tpp(QL[s11][n], KL_TV[s21][n], AS[ls21][0], 1);
      |                         ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
   63 |       Tout tmp_C[M * N];
      |                  ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   72 |         Tout tmp[M * N];
      |                  ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  834 |       float tmp[cols];
      |                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:318:29: note: in instantiation of member function 'torch_ipex::tpp::AddBiasTPP<c10::BFloat16>::operator()' requested here
  318 |                 add_mask_tpp(AM[s21], AS[ls21][0]);
      |                             ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
  834 |       float tmp[cols];
      |                 ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:320:28: note: in instantiation of member function 'torch_ipex::tpp::VarSoftMaxFwdTPP<float, c10::BFloat16>::operator()' requested here
  320 |             softmax_fwd_tpp(len, AS[0][0], AP[n][ss1]);
      |                            ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: note: function parameter 'S1' with unknown value cannot be used in a constant expression
 3190 |     LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
      |                               ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3189:23: note: declared here
 3189 |   void operator()(int S1, Tin* in, Tout* out) {
      |                       ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
 3201 |         LIBXSMM_ALIGNED(float tmp2[S3], 64);
      |                                    ^~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
  200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
      |                                                                 ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:916:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  916 |     float tmp[cols];
      |               ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:543:26: note: in instantiation of member function 'torch_ipex::tpp::GradBiasTPP<float>::operator()' requested here
  543 |             grad_bias_tpp(dQL[s1][n], prv_grad_bias[n]);
      |                          ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:916:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
  916 |     float tmp[cols];
      |               ^
clang-18: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/optim-0c9d5d.cpp
clang-18: note: diagnostic msg: /tmp/optim-0c9d5d.sh
clang-18: note: diagnostic msg:
********************

optim-0c9d5d.zip

xuhancn commented 11 months ago

Crash 2 on Clang-18, File: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /usr/local/bin/clang-18 -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_FP16_CPU_DEFINITION -DHAVE_AMX_CPU_DEFINITION -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_VNNI_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:86:32: Generating code for declaration 'torch_ipex::tpp::fused_self_attention_bwd_unpad'
4.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:90:40: LLVM IR generation of compound statement ('{}')
5.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:88:1: LLVM IR generation of compound statement ('{}')
6.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:528:3: LLVM IR generation of compound statement ('{}')
7.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:532:5: LLVM IR generation of compound statement ('{}')
8.      /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:535:7: LLVM IR generation of compound statement ('{}')
 #0 0x000055baaa86d52f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-18+0x372552f)
 #1 0x000055baaa86b57c llvm::sys::CleanupOnSignal(unsigned long) (/usr/local/bin/clang-18+0x372357c)
 #2 0x000055baaa7b22c8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007fdc92b01420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x000055baa857e8f7 llvm::IRBuilderBase::CreateMul(llvm::Value*, llvm::Value*, llvm::Twine const&, bool, bool) (/usr/local/bin/clang-18+0x14368f7)
 #5 0x000055baaaf4a155 clang::CodeGen::CodeGenFunction::EmitArraySubscriptExpr(clang::ArraySubscriptExpr const*, bool) (/usr/local/bin/clang-18+0x3e02155)
 #6 0x000055baaaf42b07 clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) (/usr/local/bin/clang-18+0x3dfab07)
 #7 0x000055baaaf44bf8 clang::CodeGen::CodeGenFunction::EmitArrayToPointerDecay(clang::Expr const*, clang::CodeGen::LValueBaseInfo*, clang::CodeGen::TBAAAccessInfo*) (/usr/local/bin/clang-18+0x3dfcbf8)
 #8 0x000055baaaf97cf8 (anonymous namespace)::ScalarExprEmitter::VisitCastExpr(clang::CastExpr*) CGExprScalar.cpp:0:0
 #9 0x000055baaaf8e363 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) CGExprScalar.cpp:0:0
#10 0x000055baaaf8f587 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/local/bin/clang-18+0x3e47587)
#11 0x000055baaaf32c2e clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/local/bin/clang-18+0x3deac2e)
#12 0x000055baaaf33357 clang::CodeGen::CodeGenFunction::EmitAnyExprToTemp(clang::Expr const*) (/usr/local/bin/clang-18+0x3deb357)
#13 0x000055baaaec1a8b clang::CodeGen::CodeGenFunction::EmitCallArg(clang::CodeGen::CallArgList&, clang::Expr const*, clang::QualType) (/usr/local/bin/clang-18+0x3d79a8b)
#14 0x000055baaaeca583 clang::CodeGen::CodeGenFunction::EmitCallArgs(clang::CodeGen::CallArgList&, clang::CodeGen::CodeGenFunction::PrototypeWrapper, llvm::iterator_range<clang::Stmt::CastIterator<clang::Expr, clang::Expr const* const, clang::Stmt const* const>>, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) (/usr/local/bin/clang-18+0x3d82583)
#15 0x000055baaaf3af8e clang::CodeGen::CodeGenFunction::EmitCall(clang::QualType, clang::CodeGen::CGCallee const&, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot, llvm::Value*) (/usr/local/bin/clang-18+0x3df2f8e)
#16 0x000055baaaf4fd44 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/usr/local/bin/clang-18+0x3e07d44)
#17 0x000055baaaf98a08 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) CGExprScalar.cpp:0:0
#18 0x000055baaaf8d420 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) CGExprScalar.cpp:0:0
#19 0x000055baaaf8f587 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/local/bin/clang-18+0x3e47587)
#20 0x000055baaaf32c2e clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/local/bin/clang-18+0x3deac2e)
#21 0x000055baaaf4dd13 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) (/usr/local/bin/clang-18+0x3e05d13)
#22 0x000055baaab708c3 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a288c3)
#23 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#24 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#25 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#26 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#27 0x000055baaabac283 void clang::CodeGen::RegionCodeGenTy::CallbackFn<clang::CodeGen::CodeGenFunction::EmitOMPParallelDirective(clang::OMPParallelDirective const&)::'lambda2'(clang::CodeGen::CodeGenFunction&, clang::CodeGen::PrePostActionTy&)>(long, clang::CodeGen::CodeGenFunction&, clang::CodeGen::PrePostActionTy&) CGStmtOpenMP.cpp:0:0
#28 0x000055baab025e8b clang::CodeGen::RegionCodeGenTy::operator()(clang::CodeGen::CodeGenFunction&) const (/usr/local/bin/clang-18+0x3edde8b)
#29 0x000055baab02d4d7 (anonymous namespace)::CGOpenMPRegionInfo::EmitBody(clang::CodeGen::CodeGenFunction&, clang::Stmt const*) CGOpenMPRuntime.cpp:0:0
#30 0x000055baaabc2e6b clang::CodeGen::CodeGenFunction::GenerateOpenMPCapturedStmtFunction(clang::CapturedStmt const&, clang::SourceLocation) (/usr/local/bin/clang-18+0x3a7ae6b)
#31 0x000055baab05a87d emitParallelOrTeamsOutlinedFunction(clang::CodeGen::CodeGenModule&, clang::OMPExecutableDirective const&, clang::CapturedStmt const*, clang::VarDecl const*, llvm::omp::Directive, llvm::StringRef, clang::CodeGen::RegionCodeGenTy const&) CGOpenMPRuntime.cpp:0:0
#32 0x000055baab05ab86 clang::CodeGen::CGOpenMPRuntime::emitParallelOutlinedFunction(clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, clang::VarDecl const*, llvm::omp::Directive, clang::CodeGen::RegionCodeGenTy const&) (/usr/local/bin/clang-18+0x3f12b86)
#33 0x000055baaaba5c78 emitCommonOMPParallelDirective(clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, llvm::omp::Directive, clang::CodeGen::RegionCodeGenTy const&, llvm::function_ref<void (clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, llvm::SmallVectorImpl<llvm::Value*>&)> const&) CGStmtOpenMP.cpp:0:0
#34 0x000055baaaba7097 clang::CodeGen::CodeGenFunction::EmitOMPParallelDirective(clang::OMPParallelDirective const&) (/usr/local/bin/clang-18+0x3a5f097)
#35 0x000055baaab70b75 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28b75)
#36 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#37 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#38 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#39 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#40 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#41 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#42 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#43 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#44 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#45 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#46 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#47 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#48 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#49 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#50 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#51 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#52 0x000055baaab76261 clang::CodeGen::CodeGenFunction::EmitIfStmt(clang::IfStmt const&) (/usr/local/bin/clang-18+0x3a2e261)
#53 0x000055baaab70e89 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28e89)
#54 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#55 0x000055baaabe39ad clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/local/bin/clang-18+0x3a9b9ad)
#56 0x000055baaac3e5bd clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang-18+0x3af65bd)
#57 0x000055baaac39fd5 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang-18+0x3af1fd5)
#58 0x000055baaac43bf6 clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbbf6)
#59 0x000055baaac43c0e clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbc0e)
#60 0x000055baaac43c0e clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbc0e)
#61 0x000055baaac461e6 clang::CodeGen::CodeGenModule::Release() (/usr/local/bin/clang-18+0x3afe1e6)
#62 0x000055baab0abf72 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0
#63 0x000055baab0aab04 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-18+0x3f62b04)
#64 0x000055baacb94cf9 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-18+0x5a4ccf9)
#65 0x000055baab0aa135 clang::CodeGenAction::ExecuteAction() (/usr/local/bin/clang-18+0x3f62135)
#66 0x000055baab33a4c1 clang::FrontendAction::Execute() (/usr/local/bin/clang-18+0x41f24c1)
#67 0x000055baab2b4aeb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-18+0x416caeb)
#68 0x000055baab418b5b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-18+0x42d0b5b)
#69 0x000055baa801279d cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-18+0xeca79d)
#70 0x000055baa800b0ad ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#71 0x000055baab0f2d3d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#72 0x000055baaa7b2747 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang-18+0x366a747)
#73 0x000055baab0f31dc clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#74 0x000055baab0b9d8e clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang-18+0x3f71d8e)
#75 0x000055baab0ba75d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang-18+0x3f7275d)
#76 0x000055baab0c4bdc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang-18+0x3f7cbdc)
#77 0x000055baa800fac1 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang-18+0xec7ac1)
#78 0x000055baa7f171b5 main (/usr/local/bin/clang-18+0xdcf1b5)
#79 0x00007fdc92506083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#80 0x000055baa800a86e _start (/usr/local/bin/clang-18+0xec286e)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 18.0.0git (https://github.com/llvm/llvm-project.git a2691e363232c011fdaace9fcc094f3cd210f78b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
5 warnings generated.
clang-18: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/fused_bert-63a747.cpp
clang-18: note: diagnostic msg: /tmp/fused_bert-63a747.sh
clang-18: note: diagnostic msg:

********************

fused_bert-63a747.zip

xuhancn commented 11 months ago

@asl I cloned the latest LLVM code and built the compiler. The issue still occurs with clang-18.

jyu2-git commented 10 months ago

https://godbolt.org/z/T51q3898a

llvmbot commented 10 months ago

@llvm/issue-subscribers-openmp

Author: Xu Han (xuhancn)

AngryLoki commented 9 months ago

For anyone searching for a workaround, mass-replacing the VLAs with wrapped arrays helps:

```c++
// Note: the template parameter lists were lost when this snippet was extracted.
// The ones below are reconstructed from how the types are used and may differ
// from the original.
#include <array>
#include <cstddef>
#include <utility>

template <size_t N, typename T> class BlockedArray;

template <typename T> class BlockedArray<1, T> {
public:
  T *data;
  size_t size;
  constexpr BlockedArray(T *data, std::array<size_t, 1> sizes)
      : data{data}, size{sizes[0]} {}
  constexpr auto operator[](int n) { return &data[size * n]; }
  constexpr explicit operator bool() const { return data != nullptr; }
  constexpr auto operator*() const { return data; }
};

template <size_t N, typename T> class BlockedArray {
public:
  T *data;
  std::array<size_t, N> sizes;
  constexpr BlockedArray(T *data, std::array<size_t, N> sizes)
      : data{data}, sizes{sizes} {}
  constexpr auto operator[](int n) const {
    auto head = head_aux(std::make_index_sequence<N - 1>());
    return BlockedArray<N - 1, T>{&data[sizes[0] * n], head};
  }
  constexpr explicit operator bool() const { return data != nullptr; }
  template <size_t... I>
  constexpr auto head_aux(std::index_sequence<I...>) const {
    return std::array{sizes[I]...};
  }
  constexpr auto operator*() const { return operator[](0); }
};

template <size_t N> struct BlockedArrayDims {
  constexpr BlockedArrayDims(std::array<size_t, N> items) : items(items) {}
  template <size_t... I>
  constexpr auto items_prepend(size_t t, std::index_sequence<I...>) const {
    return std::array{t, items[I]...};
  }

public:
  std::array<size_t, N> items;
  constexpr BlockedArrayDims() = default;
  constexpr BlockedArrayDims<N + 1> operator[](size_t t) const {
    return items_prepend(t, std::make_index_sequence<N>());
  }
};

#define DECL_VLA_PTR_PT(type, name, dims, t) \
  auto name = BlockedArray((type *)t, BlockedArrayDims<0>() dims.items)
```

Update: found a cleaner solution:

void f(void *a, long n) {
    // this causes crash
    // auto b = reinterpret_cast<float (*)[n]>(a);

    // but this works!
    using array_type = float (*)[n];
    array_type b = reinterpret_cast<array_type>(a);

#pragma omp parallel
    b[0];
}
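
Where the pointer-to-VLA type can be avoided altogether, the same access pattern can be written in standard C++ by doing the row arithmetic by hand. The following is only a sketch under assumptions: `RowView` and `sum_row` are hypothetical names that do not come from IPEX or LLVM, and it has not been checked against the crashing reproducer.

```cpp
#include <cstddef>

// Hypothetical helper (not from IPEX or LLVM): views a flat buffer as rows of
// `cols` floats, standing in for a `float (*)[n]` pointer-to-VLA.
struct RowView {
  float* data;
  std::size_t cols;

  // Returns a pointer to the start of row `r`, like `b[r]` on a `float (*)[n]`.
  float* operator[](std::size_t r) const { return data + r * cols; }
};

// Sums one row of the buffer. The view is an ordinary object, so the OpenMP
// region captures it without any variably modified type being involved.
float sum_row(void* a, long n, std::size_t row) {
  RowView b{static_cast<float*>(a), static_cast<std::size_t>(n)};
  float total = 0.0f;
#pragma omp parallel for reduction(+ : total)
  for (long j = 0; j < n; ++j) {
    total += b[row][j];
  }
  return total;
}
```

The trade-off is that the row stride is carried as a runtime member instead of being part of the type, so every index goes through `operator[]` rather than built-in array indexing.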