Closed emmenlau closed 7 months ago
One thing you can try is identifying one or more instances of clang-cl that hang, and attempting to make them crash, so that they produce test case files (preprocessed .cpp and .sh). On Linux you would simply send a SIGABRT to the process, and this would cause the clang-cl driver to create test case files, but on Windows I am unsure.
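A minimal sketch of the POSIX side of that suggestion, assuming you already know the PID of the hung compiler. The helper name `abort_process` is hypothetical, and the sleeping child below is only a stand-in for a stuck clang-cl. Whether the driver then writes reproducer files is the behaviour described above, not something this script checks. It does not carry over to Windows, where Python's `os.kill` terminates the process outright for this signal.

```python
import os
import signal
import subprocess
import sys


def abort_process(pid: int) -> None:
    """Send SIGABRT to the process with the given PID (POSIX only)."""
    if os.name != "posix":
        # Assumption: on non-POSIX systems this would just kill the process.
        raise RuntimeError("sending SIGABRT this way is only meaningful on POSIX")
    os.kill(pid, signal.SIGABRT)


if __name__ == "__main__":
    # Stand-in for a hung compiler: a child process that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    abort_process(child.pid)
    child.wait(timeout=10)
    # On POSIX, a negative return code is the number of the terminating signal.
    print("child terminated by signal", -child.returncode)
```

For a real build you would look up the PID of the stuck clang-cl process with your usual process tools and pass it to `abort_process`.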
Thanks a lot @DimitryAndric for this suggestion! If you or someone knows how I can create the test case files (on Windows), I would be happy to do so!
I observe the same issue with 17.0.4. Trying to install opencv4 via vcpkg using my clang-cl toolchain just hangs forever.
Did anyone try with 17.0.5 yet?
Same issue.
@emmenlau Did you check your build logs carefully? When bisecting, I found an ICE near the start of the logs which I didn't notice at first. I don't know why ninja did not abort the build in this case and left it hanging.
Hi @Neumann-A, I checked my logs quite carefully back then: I compared them in a diff view against a working build using clang 16.x. As far as I could see, there was no ICE in my case with clang 17.0.3. But it's interesting that you could get one step further! One thing I found (but cannot try myself): there is an option that makes clang print the full diagnostics even without a crash! Maybe this can help developers isolate the problem?
Here is the link: https://clang.llvm.org/docs/UsersManual.html#options-to-control-clang-crash-diagnostics
From that page, I quote:
Clang is also capable of generating preprocessed source file(s) and associated run script(s) even without a crash. This is specially useful when trying to generate a reproducer for warnings or errors while using modules.
I guess with such a reproducer, the developers could help resolve the issue...
Did anyone try clang-cl 17.0.6 yet?
Same issue. Master from end of November also same issue. I don't think it will be fixed for 18.
https://github.com/backengineering/llvm-msvc doesn't seem to have the issue
Thanks @Neumann-A! I'll try to run the build today with -gen-reproducer in the hope that devs will consider fixing this issue.
I mean I even bisected the issue. The ICE/hang happens since https://github.com/llvm/llvm-project/commit/0efe111365ae176671e01252d24028047d807a84. Reverting it fixes it.
Oh my, that is very relevant! Thanks for sharing!!!
Dear LLVM devs, could someone kindly look into this compiler hang? It prevents building OpenCV, which is an important library in the image analysis community...
Somebody still needs to provide a .sh and .cpp file from one of those hanging builds. This is essential for reproducing the problem, and attempting to fix it.
Also, ping @phoebewang @efriedma-quic @tentzen who originated https://reviews.llvm.org/D102817 for commit 0efe111365ae176671e01252d24028047d807a84.
Somebody still needs to provide a .sh and .cpp file from one of those hanging builds. This is essential for reproducing the problem, and attempting to fix it.
I managed to configure and build opencv on Windows against Visual Studio and the official LLVM 17.0.6 package, and could intermittently reproduce the hangs: that is, some clang-cl instances hung but not consistently when you repeated the exact same command line.
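Because the hang is intermittent, one way to quantify it is to rerun the same command several times under a timeout. The following is only a sketch: `count_hangs` is a hypothetical helper, the timeout is an arbitrary choice, and the commands in the demo are stand-ins rather than real clang-cl invocations.

```python
import subprocess
import sys
from typing import Sequence


def count_hangs(cmd: Sequence[str], runs: int, timeout_s: float) -> int:
    """Run cmd `runs` times and count how often it exceeds timeout_s seconds."""
    hangs = 0
    for _ in range(runs):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # when the timeout elapses.
            subprocess.run(
                list(cmd),
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs


if __name__ == "__main__":
    # Stand-in commands; replace with the exact compiler command line
    # taken from the build log.
    quick = [sys.executable, "-c", "pass"]
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    print("quick command hangs:", count_hangs(quick, runs=3, timeout_s=20.0))
    print("stuck command hangs:", count_hangs(stuck, runs=3, timeout_s=0.5))
```

A slow but finite compile would also be counted as a hang here, so the timeout should be well above the normal compile time of the file in question.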
After retrieving the exact command line for a hanging instance, I could generate a preprocessed test case, where the hang occurs when compiling the intermediate .bc to .asm:
"C:\Program Files\LLVM\bin\clang-cl.exe" -cc1 -triple x86_64-pc-windows-msvc19.38.33134 -S -save-temps=cwd -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name execution_engine.cpp -mrelocation-model pic -pic-level 2 -mframe-pointer=none -relaxed-aliasing -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -target-feature +sse -target-feature +sse2 -target-feature +sse3 -mllvm -x86-asm-syntax=intel -tune-cpu generic -D_MT -D_DLL --dependent-lib=msvcrt --dependent-lib=oldnames --show-includes -sys-header-deps -stack-protector 2 -fexceptions -fasync-exceptions -fms-volatile -fdiagnostics-format msvc -v -ffunction-sections "-fcoverage-compilation-dir=C:\Users\Dim\Source\opencv\build" -resource-dir "C:\PROGRA~1\LLVM\lib\clang\17" -O2 -WCL4 -W -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winconsistent-missing-override -Wno-delete-non-virtual-dtor -Wno-unnamed-type-template-args -Wno-comment -Wno-deprecated-enum-enum-conversion -Wno-deprecated-anon-enum-enum-conversion -Wno-long-long "-fdebug-compilation-dir=C:\Users\Dim\Source\opencv\build" -ferror-limit 19 -fmessage-length=178 -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.38.33134 -fdelayed-template-parsing -finline-functions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -o execution_engine.asm -x ir execution_engine.bc
I transported this test case to Linux where I have more tools to do reduction, and I ended up with the following reduced test case:
// clang-cl -cc1 -triple x86_64-pc-windows-msvc19.38.33134 -S -disable-llvm-verifier -fexceptions -fasync-exceptions -O2 execution_engine-min.cpp
template <bool, class _Ty1, class> using conditional_t = _Ty1;
template <class _Ty1, class _Ty2>
constexpr bool is_same_v = __is_same(_Ty1, _Ty2);
struct _Alloc_construct_ptr {
  ~_Alloc_construct_ptr();
};
template <class _Alnode> struct _List_node_emplace_op2 : _Alloc_construct_ptr {
  _List_node_emplace_op2(_Alnode);
  ~_List_node_emplace_op2() { ; }
};
int _List;
struct {
  template <class... _Valtys>
  conditional_t<is_same_v<int, int>, int, int> emplace(_Valtys... _Vals) {
    _List_node_emplace_op2(_List, _Vals...);
  }
} m_executableDependencies;
void ExecutionEngineaddExecutableDependency() {
  m_executableDependencies.emplace();
}
This reliably produces an assertion failure in WinEHPrepare.cpp (if the LLVM in question is compiled with assertions, which the release builds are not), after an initial "A single unwind edge may only enter one EH pad" error:
A single unwind edge may only enter one EH pad
invoke void @llvm.seh.scope.end()
to label %"??1?$_List_node_emplace_op2@H@@QEAA@XZ.exit.i" unwind label %ehcleanup.i.i
Assertion failed: (!verifyFunction(F, &dbgs())), function prepareExplicitEH, file /share/dim/src/llvm/llvm-project/llvm/lib/CodeGen/WinEHPrepare.cpp, line 1210.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl -cc1 -triple x86_64-pc-windows-msvc19.38.33134 -S -disable-llvm-verifier -fexceptions -fasync-exceptions -O2 execution_engine-min.cpp
1. <eof> parser at end of file
2. Code generation
3. Running pass 'Function Pass Manager' on module 'execution_engine-min.cpp'.
4. Running pass 'Windows exception handling preparation' on function '@"?ExecutionEngineaddExecutableDependency@@YAXXZ"'
#0 0x00000000042d05c8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x42d05c8)
#1 0x00000000042ce129 llvm::sys::RunSignalHandlers() (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x42ce129)
#2 0x00000000042d0dc8 SignalHandler(int) Signals.cpp:0:0
#3 0x00000008297c6490 handle_signal /share/dim/src/freebsd/llvm-18-update/lib/libthr/thread/thr_sig.c:0:3
#4 0x00000008297c5a4b thr_sighandler /share/dim/src/freebsd/llvm-18-update/lib/libthr/thread/thr_sig.c:245:1
#5 0x00000008290772d3 ([vdso]+0x2d3)
#6 0x000000082e398e1a _thr_kill /usr/obj/share/dim/src/freebsd/llvm-18-update/amd64.amd64/lib/libc/thr_kill.S:4:0
#7 0x000000082e312a94 __raise /share/dim/src/freebsd/llvm-18-update/lib/libc/gen/raise.c:0:10
#8 0x000000082e3c5799 abort /share/dim/src/freebsd/llvm-18-update/lib/libc/stdlib/abort.c:67:17
#9 0x000000082e2f5d81 (/lib/libc.so.7+0x99d81)
#10 0x0000000003c3cc1a (anonymous namespace)::WinEHPrepareImpl::prepareExplicitEH(llvm::Function&) WinEHPrepare.cpp:0:0
#11 0x0000000003c389b1 (anonymous namespace)::WinEHPrepare::runOnFunction(llvm::Function&) WinEHPrepare.cpp:0:0
#12 0x0000000003defcb1 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x3defcb1)
#13 0x0000000003df82a4 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x3df82a4)
#14 0x0000000003df086e llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x3df086e)
#15 0x0000000004a43178 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4a43178)
#16 0x0000000004a62d19 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4a62d19)
#17 0x000000000671a8c6 clang::ParseAST(clang::Sema&, bool, bool) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x671a8c6)
#18 0x0000000004e61883 clang::FrontendAction::Execute() (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4e61883)
#19 0x0000000004dd63cd clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4dd63cd)
#20 0x0000000004f39305 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x4f39305)
#21 0x0000000002722edc cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x2722edc)
#22 0x000000000271fcd2 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#23 0x000000000271eb3d clang_main(int, char**, llvm::ToolContext const&) (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x271eb3d)
#24 0x000000000272ea14 main (/home/dim/obj/llvmorg-18-init-18014-g490a09a02e81-freebsd15-amd64-ninja-clang-rel-1/bin/clang-cl+0x272ea14)
#25 0x000000082e2e734a __libc_start1 /share/dim/src/freebsd/llvm-18-update/lib/libc/csu/libc_start1.c:157:2
This is the same error and assertion reported in #73536 and #73538.
cc @robertcox-github
Is a1f4ac7 in ClangCl 18.1.4? Because I still can not build OpenCV, it still hangs :-( :-(
No, it's not getting cherry-picked to the 18 release.
Thanks a lot for the feedback @phoebewang ! And there are probably strong reasons against picking it for the 18.x series, yes? It would be really great to have OpenCV build working again... :-(
The recently released Microsoft Visual Studio 2022 17.9.10 now ships with headers that only support LLVM 17 and up, compounding the issue.
Developers impacted by this issue, who use clang-cl on Windows with the latest Microsoft IDE/headers, will soon be stuck:
Thanks a lot for the feedback @phoebewang ! And there are probably strong reasons against picking it for the 18.x series, yes? It would be really great to have OpenCV build working again... :-(
Backports to 18.1.x were stopped recently, so it'll be necessary to wait for 19.
This specifically only impacts code built with /EHa; does OpenCV really need to be built with that flag?
Right. The /EHa option was a dud in LLVM before 17. Removing it should have no side effect.
@efriedma-quic, thanks a billion for this insight! No, I do not require /EHa and can happily build without it. This reduces the severity of this issue a whole lot!
This does not work for me. I've replaced /EHa with /EHsc and the compiler (clang-cl 18.1.6 from 5 days ago) still hangs (currently 15 minutes on a single source file, and counting).
Did you mean to disable all exception handling? Or are there exception handling models that are expected to work?
/EHsc would be an independent issue. How about just removing /EHa?
After removing /EHa the build complains that exceptions are not enabled (unless I'm just unable to configure OpenCV correctly and made a mistake, haha). Is it possible that removing /EHa altogether disables exceptions?
Not sure. What I know is that /EHa didn't really enable asynchronous exceptions before LLVM 17. Maybe it enabled them partially, or maybe the build script is just checking for the /EH strings?
I've spent some time reading up on this, and I'm under the impression that one of the /EHx options needs to be added, otherwise exceptions are turned off by the compiler. See for example here and here.
This explains why removing /EHa disabled exceptions in opencv altogether, thereby breaking the build.
/EHsc would be an independent issue.
Can you elaborate on why this is an independent issue? So does the current fix in clang 19 not address the compiler hang with /EHsc? Should I report it as a new issue?
This explains why removing /EHa disabled exceptions in opencv altogether, thereby breaking the build.
I'm not an expert on the Clang driver. I just took a quick look; maybe you can try -Xclang -fcxx-exceptions -Xclang -fexceptions to enable exceptions without a /EH* option?
Can you elaborate on why this is an independent issue? So does the current fix in clang 19 not address the compiler hang with /EHsc? Should I report it as a new issue?
I didn't see it elsewhere. My reasoning is: 1) /EHsc would not generate such llvm.seh.* intrinsics; 2) the fixed issue is a crash that occurs if the compiler is built with assertions on. A hang may or may not be related to it. Did you check whether the /EHsc option works with trunk code?
I have reported a build issue of opencv in https://github.com/opencv/opencv/issues/24390. If needed I can clone the information here. Sadly, I am unable to provide a minimal reproducer, and building opencv is slightly more involved.
It would be great if somebody could still look at this, as opencv is quite a relevant library...