Open xuhancn opened 11 months ago
LLVM 10 is ancient. Does the problem reproduce with latest LLVM?
Sure, give me a little while to try it.
Crash 1 on Clang-18, File: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/optim.cpp
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /usr/local/bin/clang-18 -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_FP16_CPU_DEFINITION -DHAVE_AMX_CPU_DEFINITION -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_VNNI_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/optim.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/optim.cpp
1. <eof> parser at end of file
2. Optimizer
#0 0x0000556ce2f9752f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-18+0x372552f)
#1 0x0000556ce2f9557c llvm::sys::CleanupOnSignal(unsigned long) (/usr/local/bin/clang-18+0x372357c)
#2 0x0000556ce2edc2c8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
#3 0x00007f3eff376420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
#4 0x0000556ce20693f1 simplifyMulInst(llvm::Value*, llvm::Value*, bool, bool, llvm::SimplifyQuery const&, unsigned int) (.constprop.0) InstructionSimplify.cpp:0:0
#5 0x0000556ce2061741 simplifyInstructionWithOperands(llvm::Instruction*, llvm::ArrayRef<llvm::Value*>, llvm::SimplifyQuery const&, unsigned int) InstructionSimplify.cpp:0:0
#6 0x0000556ce206b757 llvm::simplifyInstruction(llvm::Instruction*, llvm::SimplifyQuery const&) (/usr/local/bin/clang-18+0x27f9757)
#7 0x0000556ce2d7e08c (anonymous namespace)::EarlyCSE::processNode(llvm::DomTreeNodeBase<llvm::BasicBlock>*) EarlyCSE.cpp:0:0
#8 0x0000556ce2d805ed (anonymous namespace)::EarlyCSE::run() EarlyCSE.cpp:0:0
#9 0x0000556ce2d821f6 llvm::EarlyCSEPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x35101f6)
#10 0x0000556ce31f1446 llvm::detail::PassModel<llvm::Function, llvm::EarlyCSEPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x397f446)
#11 0x0000556ce0aa0c04 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/clang-18+0x122ec04)
#12 0x0000556ce299a27e llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x312827e)
#13 0x0000556ce0a94ee6 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x1222ee6)
#14 0x0000556ce2996d20 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/clang-18+0x3124d20)
#15 0x0000556ce3201bb7 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&, clang::BackendConsumer*) BackendUtil.cpp:0:0
#16 0x0000556ce3204ec4 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/usr/local/bin/clang-18+0x3992ec4)
#17 0x0000556ce37d4d5e clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-18+0x3f62d5e)
#18 0x0000556ce52becf9 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-18+0x5a4ccf9)
#19 0x0000556ce37d4135 clang::CodeGenAction::ExecuteAction() (/usr/local/bin/clang-18+0x3f62135)
#20 0x0000556ce3a644c1 clang::FrontendAction::Execute() (/usr/local/bin/clang-18+0x41f24c1)
#21 0x0000556ce39deaeb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-18+0x416caeb)
#22 0x0000556ce3b42b5b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-18+0x42d0b5b)
#23 0x0000556ce073c79d cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-18+0xeca79d)
#24 0x0000556ce07350ad ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#25 0x0000556ce381cd3d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#26 0x0000556ce2edc747 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang-18+0x366a747)
#27 0x0000556ce381d1dc clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#28 0x0000556ce37e3d8e clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang-18+0x3f71d8e)
#29 0x0000556ce37e475d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang-18+0x3f7275d)
#30 0x0000556ce37eebdc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang-18+0x3f7cbdc)
#31 0x0000556ce0739ac1 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang-18+0xec7ac1)
#32 0x0000556ce06411b5 main (/usr/local/bin/clang-18+0xdcf1b5)
#33 0x00007f3efed7b083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#34 0x0000556ce073486e _start (/usr/local/bin/clang-18+0xec286e)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 18.0.0git (https://github.com/llvm/llvm-project.git a2691e363232c011fdaace9fcc094f3cd210f78b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
5 warnings generated.
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
1631 | T tmp[in_rows_p * in_cols_p];
| ^~~~~~~~~~~~~~~~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/tensor_helper.h:57:18: note: in instantiation of member function 'torch_ipex::tpp::XformExtTPP<c10::BFloat16>::operator()' requested here
57 | trans_n2v_tpp(in[n], out[n]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
1631 | T tmp[in_rows_p * in_cols_p];
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
1631 | T tmp[in_rows_p * in_cols_p];
| ^~~~~~~~~~~~~~~~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/tensor_helper.h:203:16: note: in instantiation of member function 'torch_ipex::tpp::XformExtTPP<float>::operator()' requested here
203 | trans_tpp(in[n], out[n]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:1631:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
1631 | T tmp[in_rows_p * in_cols_p];
| ^
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
63 | Tout tmp_C[M * N];
| ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:187:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<float, float>::operator()' requested here
187 | qkv_gemm_tpp(HS[s1][bn], Wq_V[nk][bn], QL[s1][nk], BN, true);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
63 | Tout tmp_C[M * N];
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
72 | Tout tmp[M * N];
| ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
834 | float tmp[cols];
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:318:29: note: in instantiation of member function 'torch_ipex::tpp::AddBiasTPP<float>::operator()' requested here
318 | add_mask_tpp(AM[s21], AS[ls21][0]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
834 | float tmp[cols];
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
3190 | LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
| ^~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:320:28: note: in instantiation of member function 'torch_ipex::tpp::VarSoftMaxFwdTPP<float, float>::operator()' requested here
320 | softmax_fwd_tpp(len, AS[0][0], AP[n][ss1]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: note: function parameter 'S1' with unknown value cannot be used in a constant expression
3190 | LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3189:23: note: declared here
3189 | void operator()(int S1, Tin* in, Tout* out) {
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
3201 | LIBXSMM_ALIGNED(float tmp2[S3], 64);
| ^~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
63 | Tout tmp_C[M * N];
| ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:187:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<c10::BFloat16, c10::BFloat16>::operator()' requested here
187 | qkv_gemm_tpp(HS[s1][bn], Wq_V[nk][bn], QL[s1][nk], BN, true);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
63 | Tout tmp_C[M * N];
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
72 | Tout tmp[M * N];
| ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
63 | Tout tmp_C[M * N];
| ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:315:25: note: in instantiation of member function 'torch_ipex::tpp::BrgemmExtTPP<c10::BFloat16, float>::operator()' requested here
315 | a_gemm_tpp(QL[s11][n], KL_TV[s21][n], AS[ls21][0], 1);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:63:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
63 | Tout tmp_C[M * N];
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
72 | Tout tmp[M * N];
| ^~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:72:18: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:10:
In file included from /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/ext_tpp.h:5:
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
834 | float tmp[cols];
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:318:29: note: in instantiation of member function 'torch_ipex::tpp::AddBiasTPP<c10::BFloat16>::operator()' requested here
318 | add_mask_tpp(AM[s21], AS[ls21][0]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:834:17: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
834 | float tmp[cols];
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
3190 | LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
| ^~~~~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_fwd_tmpl.h:320:28: note: in instantiation of member function 'torch_ipex::tpp::VarSoftMaxFwdTPP<float, c10::BFloat16>::operator()' requested here
320 | softmax_fwd_tpp(len, AS[0][0], AP[n][ss1]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3190:31: note: function parameter 'S1' with unknown value cannot be used in a constant expression
3190 | LIBXSMM_ALIGNED(float tmp[S1 * S3], 64);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3189:23: note: declared here
3189 | void operator()(int S1, Tin* in, Tout* out) {
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
3201 | LIBXSMM_ALIGNED(float tmp2[S3], 64);
| ^~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include/libxsmm_macros.h:200:65: note: expanded from macro 'LIBXSMM_ALIGNED'
200 | # define LIBXSMM_ALIGNED(DECL, N) LIBXSMM_ATTRIBUTE(aligned(N)) DECL
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:3201:36: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:916:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
916 | float tmp[cols];
| ^~~~
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:543:26: note: in instantiation of member function 'torch_ipex::tpp::GradBiasTPP<float>::operator()' requested here
543 | grad_bias_tpp(dQL[s1][n], prv_grad_bias[n]);
| ^
/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/xsmm_functors.h:916:15: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
916 | float tmp[cols];
| ^
clang-18: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/optim-0c9d5d.cpp
clang-18: note: diagnostic msg: /tmp/optim-0c9d5d.sh
clang-18: note: diagnostic msg:
********************
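Side note on the `-Wvla-cxx-extension` warnings in the log above: they flag declarations such as `float tmp[cols];`, where the bound is a runtime value. Whether these arrays are related to the crashes is not established here. Below is a minimal sketch of the flagged pattern and a portable replacement using `std::vector`. The `AddBiasSketch` struct and its members are hypothetical stand-ins written for illustration, not the IPEX code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a functor whose scratch buffer size is only
// known at run time. This is NOT the IPEX implementation.
struct AddBiasSketch {
  int cols;  // runtime value, so `float tmp[cols];` would be a VLA

  void operator()(const float* in, float* out) const {
    // Flagged pattern (Clang extension in C++):
    //   float tmp[cols];
    // Portable replacement: a heap-backed buffer sized at run time.
    std::vector<float> tmp(static_cast<std::size_t>(cols));

    // Stage the input in the scratch buffer, then accumulate into out.
    for (int i = 0; i < cols; ++i) {
      tmp[i] = in[i];
    }
    for (int i = 0; i < cols; ++i) {
      out[i] += tmp[i];
    }
  }
};
```

The trade-off is a heap allocation per call instead of stack space; in a hot loop a preallocated or thread-local buffer may be preferable.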
Crash 2 on Clang-18, File: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /usr/local/bin/clang-18 -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format 
-Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_FP16_CPU_DEFINITION -DHAVE_AMX_CPU_DEFINITION -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_VNNI_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
1. <eof> parser at end of file
2. Per-file LLVM IR generation
3. /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:86:32: Generating code for declaration 'torch_ipex::tpp::fused_self_attention_bwd_unpad'
4. /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp:90:40: LLVM IR generation of compound statement ('{}')
5. /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:88:1: LLVM IR generation of compound statement ('{}')
6. /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:528:3: LLVM IR generation of compound statement ('{}')
7. /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:532:5: LLVM IR generation of compound statement ('{}')
8. /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_self_attention_bwd_tmpl.h:535:7: LLVM IR generation of compound statement ('{}')
#0 0x000055baaa86d52f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-18+0x372552f)
#1 0x000055baaa86b57c llvm::sys::CleanupOnSignal(unsigned long) (/usr/local/bin/clang-18+0x372357c)
#2 0x000055baaa7b22c8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
#3 0x00007fdc92b01420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
#4 0x000055baa857e8f7 llvm::IRBuilderBase::CreateMul(llvm::Value*, llvm::Value*, llvm::Twine const&, bool, bool) (/usr/local/bin/clang-18+0x14368f7)
#5 0x000055baaaf4a155 clang::CodeGen::CodeGenFunction::EmitArraySubscriptExpr(clang::ArraySubscriptExpr const*, bool) (/usr/local/bin/clang-18+0x3e02155)
#6 0x000055baaaf42b07 clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) (/usr/local/bin/clang-18+0x3dfab07)
#7 0x000055baaaf44bf8 clang::CodeGen::CodeGenFunction::EmitArrayToPointerDecay(clang::Expr const*, clang::CodeGen::LValueBaseInfo*, clang::CodeGen::TBAAAccessInfo*) (/usr/local/bin/clang-18+0x3dfcbf8)
#8 0x000055baaaf97cf8 (anonymous namespace)::ScalarExprEmitter::VisitCastExpr(clang::CastExpr*) CGExprScalar.cpp:0:0
#9 0x000055baaaf8e363 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) CGExprScalar.cpp:0:0
#10 0x000055baaaf8f587 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/local/bin/clang-18+0x3e47587)
#11 0x000055baaaf32c2e clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/local/bin/clang-18+0x3deac2e)
#12 0x000055baaaf33357 clang::CodeGen::CodeGenFunction::EmitAnyExprToTemp(clang::Expr const*) (/usr/local/bin/clang-18+0x3deb357)
#13 0x000055baaaec1a8b clang::CodeGen::CodeGenFunction::EmitCallArg(clang::CodeGen::CallArgList&, clang::Expr const*, clang::QualType) (/usr/local/bin/clang-18+0x3d79a8b)
#14 0x000055baaaeca583 clang::CodeGen::CodeGenFunction::EmitCallArgs(clang::CodeGen::CallArgList&, clang::CodeGen::CodeGenFunction::PrototypeWrapper, llvm::iterator_range<clang::Stmt::CastIterator<clang::Expr, clang::Expr const* const, clang::Stmt const* const>>, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) (/usr/local/bin/clang-18+0x3d82583)
#15 0x000055baaaf3af8e clang::CodeGen::CodeGenFunction::EmitCall(clang::QualType, clang::CodeGen::CGCallee const&, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot, llvm::Value*) (/usr/local/bin/clang-18+0x3df2f8e)
#16 0x000055baaaf4fd44 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/usr/local/bin/clang-18+0x3e07d44)
#17 0x000055baaaf98a08 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) CGExprScalar.cpp:0:0
#18 0x000055baaaf8d420 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) CGExprScalar.cpp:0:0
#19 0x000055baaaf8f587 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/local/bin/clang-18+0x3e47587)
#20 0x000055baaaf32c2e clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/local/bin/clang-18+0x3deac2e)
#21 0x000055baaaf4dd13 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) (/usr/local/bin/clang-18+0x3e05d13)
#22 0x000055baaab708c3 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a288c3)
#23 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#24 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#25 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#26 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#27 0x000055baaabac283 void clang::CodeGen::RegionCodeGenTy::CallbackFn<clang::CodeGen::CodeGenFunction::EmitOMPParallelDirective(clang::OMPParallelDirective const&)::'lambda2'(clang::CodeGen::CodeGenFunction&, clang::CodeGen::PrePostActionTy&)>(long, clang::CodeGen::CodeGenFunction&, clang::CodeGen::PrePostActionTy&) CGStmtOpenMP.cpp:0:0
#28 0x000055baab025e8b clang::CodeGen::RegionCodeGenTy::operator()(clang::CodeGen::CodeGenFunction&) const (/usr/local/bin/clang-18+0x3edde8b)
#29 0x000055baab02d4d7 (anonymous namespace)::CGOpenMPRegionInfo::EmitBody(clang::CodeGen::CodeGenFunction&, clang::Stmt const*) CGOpenMPRuntime.cpp:0:0
#30 0x000055baaabc2e6b clang::CodeGen::CodeGenFunction::GenerateOpenMPCapturedStmtFunction(clang::CapturedStmt const&, clang::SourceLocation) (/usr/local/bin/clang-18+0x3a7ae6b)
#31 0x000055baab05a87d emitParallelOrTeamsOutlinedFunction(clang::CodeGen::CodeGenModule&, clang::OMPExecutableDirective const&, clang::CapturedStmt const*, clang::VarDecl const*, llvm::omp::Directive, llvm::StringRef, clang::CodeGen::RegionCodeGenTy const&) CGOpenMPRuntime.cpp:0:0
#32 0x000055baab05ab86 clang::CodeGen::CGOpenMPRuntime::emitParallelOutlinedFunction(clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, clang::VarDecl const*, llvm::omp::Directive, clang::CodeGen::RegionCodeGenTy const&) (/usr/local/bin/clang-18+0x3f12b86)
#33 0x000055baaaba5c78 emitCommonOMPParallelDirective(clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, llvm::omp::Directive, clang::CodeGen::RegionCodeGenTy const&, llvm::function_ref<void (clang::CodeGen::CodeGenFunction&, clang::OMPExecutableDirective const&, llvm::SmallVectorImpl<llvm::Value*>&)> const&) CGStmtOpenMP.cpp:0:0
#34 0x000055baaaba7097 clang::CodeGen::CodeGenFunction::EmitOMPParallelDirective(clang::OMPParallelDirective const&) (/usr/local/bin/clang-18+0x3a5f097)
#35 0x000055baaab70b75 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28b75)
#36 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#37 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#38 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#39 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#40 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#41 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#42 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#43 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#44 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#45 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#46 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#47 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#48 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#49 0x000055baaab76a4b clang::CodeGen::CodeGenFunction::EmitCompoundStmt(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2ea4b)
#50 0x000055baaab76c6c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a2ec6c)
#51 0x000055baaab70836 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28836)
#52 0x000055baaab76261 clang::CodeGen::CodeGenFunction::EmitIfStmt(clang::IfStmt const&) (/usr/local/bin/clang-18+0x3a2e261)
#53 0x000055baaab70e89 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/local/bin/clang-18+0x3a28e89)
#54 0x000055baaab76709 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/local/bin/clang-18+0x3a2e709)
#55 0x000055baaabe39ad clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/local/bin/clang-18+0x3a9b9ad)
#56 0x000055baaac3e5bd clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang-18+0x3af65bd)
#57 0x000055baaac39fd5 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/local/bin/clang-18+0x3af1fd5)
#58 0x000055baaac43bf6 clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbbf6)
#59 0x000055baaac43c0e clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbc0e)
#60 0x000055baaac43c0e clang::CodeGen::CodeGenModule::EmitDeferred() (/usr/local/bin/clang-18+0x3afbc0e)
#61 0x000055baaac461e6 clang::CodeGen::CodeGenModule::Release() (/usr/local/bin/clang-18+0x3afe1e6)
#62 0x000055baab0abf72 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0
#63 0x000055baab0aab04 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-18+0x3f62b04)
#64 0x000055baacb94cf9 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-18+0x5a4ccf9)
#65 0x000055baab0aa135 clang::CodeGenAction::ExecuteAction() (/usr/local/bin/clang-18+0x3f62135)
#66 0x000055baab33a4c1 clang::FrontendAction::Execute() (/usr/local/bin/clang-18+0x41f24c1)
#67 0x000055baab2b4aeb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-18+0x416caeb)
#68 0x000055baab418b5b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-18+0x42d0b5b)
#69 0x000055baa801279d cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-18+0xeca79d)
#70 0x000055baa800b0ad ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#71 0x000055baab0f2d3d void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#72 0x000055baaa7b2747 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/local/bin/clang-18+0x366a747)
#73 0x000055baab0f31dc clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#74 0x000055baab0b9d8e clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/local/bin/clang-18+0x3f71d8e)
#75 0x000055baab0ba75d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/local/bin/clang-18+0x3f7275d)
#76 0x000055baab0c4bdc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/local/bin/clang-18+0x3f7cbdc)
#77 0x000055baa800fac1 clang_main(int, char**, llvm::ToolContext const&) (/usr/local/bin/clang-18+0xec7ac1)
#78 0x000055baa7f171b5 main (/usr/local/bin/clang-18+0xdcf1b5)
#79 0x00007fdc92506083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#80 0x000055baa800a86e _start (/usr/local/bin/clang-18+0xec286e)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 18.0.0git (https://github.com/llvm/llvm-project.git a2691e363232c011fdaace9fcc094f3cd210f78b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
5 warnings generated.
clang-18: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/fused_bert-63a747.cpp
clang-18: note: diagnostic msg: /tmp/fused_bert-63a747.sh
clang-18: note: diagnostic msg:
********************
@asl I have cloned the latest LLVM code and built the compiler. The issue still occurs in clang-18.
@llvm/issue-subscribers-openmp
Author: Xu Han (xuhancn)
For those searching for a workaround, mass-replacing VLAs with wrapped arrays helps:
```c++
template
```
Update: found a cleaner solution:
```c++
void f(void *a, long n) {
  // this causes crash
  // auto b = reinterpret_cast<float (*)[n]>(a);
  // but this works!
  using array_type = float (*)[n];
  array_type b = reinterpret_cast<array_type>(a);
#pragma omp parallel
  b[0];
}
```
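For code that cannot rely on the VLA extension at all, the "wrapped array" idea mentioned above can be sketched as a small non-owning view over a flat buffer. This is only an illustration: `RowView` and `fill_rows` are hypothetical names that do not come from this thread or from the IPEX sources, and the sketch has not been checked against the crashing clang build. It simply avoids variably modified types inside the OpenMP region.

```cpp
#include <cassert>

// Hypothetical helper (an assumption, not code from this thread): a non-owning
// 2-D view over a flat buffer, standing in for a `float (*)[n]` pointer to VLA.
struct RowView {
  float* data;  // start of the flat buffer
  long cols;    // row length, known only at run time

  // b[i] yields a pointer to the start of row i, so b[i][j] reads as before.
  float* operator[](long i) const { return data + i * cols; }
};

// Fills a rows x cols buffer so that element (i, j) holds i * cols + j.
// The pragma is ignored when the file is compiled without -fopenmp.
void fill_rows(void* a, long rows, long cols) {
  assert(rows >= 0 && cols > 0);
  RowView b{static_cast<float*>(a), cols};
#pragma omp parallel for
  for (long i = 0; i < rows; ++i) {
    for (long j = 0; j < cols; ++j) {
      b[i][j] = static_cast<float>(i * cols + j);
    }
  }
}
```

Because the view is an ordinary struct, existing `b[i][j]` accesses keep their shape, and only the declaration of `b` changes.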
Build cmd:
cd /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu && /usr/bin/clang++ -DAT_PARALLEL_OPENMP=1 -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dintel_ext_pt_cpu_EXPORTS -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/aten -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/jit -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/utils -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/libxsmm/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/build/Release/csrc/cpu/csrc/cpu/cpu_third_party/ideep/mkl-dnn/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/include -I/home/xu/anaconda3/envs/ipex_cpu/include/python3.12 -I/home/xu/anaconda3/envs/ipex_cpu/include -I/home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/third_party/ideep/mkl-dnn/src/../include -isystem /home/xu/anaconda3/envs/ipex_cpu/lib/python3.12/site-packages/torch/include -fPIC -Wno-narrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations 
-Wno-ignored-qualifiers -Wno-attributes -Wno-parentheses -Wno-format -Wno-deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wall -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -Wno-unused-but-set-variable -Wno-uninitialized -DNDEBUG -fopenmp -fno-math-errno -fno-trapping-math -D_GLIBCXX_USE_CXX11_ABI=0 -DUSE_LIBXSMM -DBUILD_IPEX_MAIN_LIB -DHAVE_AVX512_BF16_CPU_DEFINITION -DHAVE_AVX512_VNNI_CPU_DEFINITION -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O2 -std=c++17 -fPIC -DC10_BUILD_MAIN_LIB -MD -MT csrc/cpu/CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -MF CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o.d -o CMakeFiles/intel-ext-pt-cpu.dir/tpp/bert/fused_bert.cpp.o -c /home/xu/conda_spaces/ipex_cpu/frameworks.ai.pytorch.ipex-cpu/csrc/cpu/tpp/bert/fused_bert.cpp
The error msg:
Original code: https://github.com/intel/intel-extension-for-pytorch/blob/main/csrc/cpu/tpp/bert/fused_bert.cpp , which builds successfully with gcc. The diagnostic msg files are also attached: fused_bert-089380.zip