Closed Quuxplusone closed 3 years ago
Attached CMakeLists.txt
(290 bytes, text/plain): CMakeLists.txt for repro
Attached main.cpp
(383 bytes, text/x-csrc): main.cpp for repro
Apologies for wonky formatting -- I didn't realize that Bugzilla doesn't
support Markdown. Here is a slightly less bad version:
(Reporting on behalf of Alex Reinking, who found this while debugging Halide:)
OVERVIEW:
There is an apparently 100% reproducible crash when using LLVM 10 build
using MSVC 2017 or 2019, for Windows x86 (32-bit), Debug mode only.
The crash appears due to a double-free in the destructor of SmallVector,
perhaps due to a bug in the copy/move/operator= operations of SmallVector.
In particular, the object whose destructor triggers the crash is a lambda
capture-by-value clone of a small-vector that was initially created on the
stack.
The stack trace looks like:
ntdll.dll!77289c43() Unknown
ntdll.dll![Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll] Unknown
[External Code]
demo.exe!llvm::SmallVectorImpl<llvm::LLT>::~SmallVectorImpl<llvm::LLT>() Line 336 C++
demo.exe!llvm::SmallVector<llvm::LLT,4>::~SmallVector<llvm::LLT,4>() Line 844 C++
[External Code]
demo.exe!llvm::LegalityPredicates::all<std::function<bool __cdecl(llvm::LegalityQuery const &)>>(std::function<bool __cdecl(llvm::LegalityQuery const &)> P0, std::function<bool __cdecl(llvm::LegalityQuery const &)> P1) Line 194 C++
demo.exe!llvm::LegalizeRuleSet::actionForCartesianProduct(llvm::LegalizeActions::LegalizeAction Action, std::initializer_list<llvm::LLT> Types0, std::initializer_list<llvm::LLT> Types1) Line 448 C++
demo.exe!llvm::LegalizeRuleSet::legalForCartesianProduct(std::initializer_list<llvm::LLT> Types0, std::initializer_list<llvm::LLT> Types1) Line 518 C++
demo.exe!main() Line 15 C++
In the debugger output window we see:
Invalid address specified to RtlValidateHeap( 02150000, 01B5ECC8 )
which looks a lot like it's trying to free a stack address. The 2nd argument is
the pointer passed to free (stored in SmallVector's BeginX field).
STEPS TO REPEAT:
Note that Visual Studio 16.6.x (the latest) exposes a bug in LLVM that has
now been patched, but prevents it from compiling. The instructions below use
Visual Studio 15.9.23 (2017) instead.
## Without vcpkg
Get and compile LLVM:
D:\>"C:\Program Files (x86)\Microsoft Visual
Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64_x86
D:\>git clone https://github.com/llvm/llvm-project.git --depth 1 -b
release/10.x
D:\>mkdir llvm-x86
D:\llvm-x86>cmake -G Ninja ^
-DCMAKE_BUILD_TYPE=Debug ^
-DCMAKE_INSTALL_PREFIX=../llvm-x86-install ^
-DLLVM_ENABLE_TERMINFO=OFF ^
-DLLVM_TARGETS_TO_BUILD=X86 ^
-DLLVM_ENABLE_ASSERTIONS=ON ^
-DLLVM_ENABLE_EH=ON ^
-DLLVM_ENABLE_RTTI=ON ^
-DLLVM_BUILD_32_BITS=ON ^
..\llvm-project\llvm
D:\llvm-x86>cmake --build . --target install
D:\llvm-x86>cd ..
D:\>cd this-gist
D:\this-gist>mkdir build
D:\this-gist\build>cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug
-DCMAKE_PREFIX_PATH=D:/llvm-x86-install ..
D:\this-gist\build>demo
BOOM!
## With vcpkg
Compile with:
> mkdir build
> cd build
> cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=D:/vcpkg/scripts/buildsystems/vcpkg.cmake ..
Compiler info:
Microsoft (R) C/C++ Optimizing Compiler Version 19.26.28806 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
CMake info:
cmake version 3.17.2
vcpkg list (only x86-windows is relevant):
glew:x64-windows 2.1.0-7 The OpenGL Extension Wrangler Library (GLEW) is ...
glew:x86-windows 2.1.0-7 The OpenGL Extension Wrangler Library (GLEW) is ...
libjpeg-turbo:x64-windows 2.0.4 libjpeg-turbo is a JPEG image codec that uses SI...
libjpeg-turbo:x86-windows 2.0.4 libjpeg-turbo is a JPEG image codec that uses SI...
libpng:x64-windows 1.6.37-7 libpng is a library implementing an interface fo...
libpng:x86-windows 1.6.37-7 libpng is a library implementing an interface fo...
llvm:x64-windows 10.0.0 The LLVM Compiler Infrastructure
llvm:x86-windows 10.0.0 The LLVM Compiler Infrastructure
llvm[clang-tools-extra]:x64-windows Build Clang tools.
llvm[clang-tools-extra]:x86-windows Build Clang tools.
llvm[clang]:x64-windows Build C Language Family Front-end.
llvm[clang]:x86-windows Build C Language Family Front-end.
llvm[disable-abi-breaking-checks]:x64-windows Build LLVM with LLVM_ABI_BREAKING_CHECKS=FORCE_OFF.
llvm[disable-abi-breaking-checks]:x86-windows Build LLVM with LLVM_ABI_BREAKING_CHECKS=FORCE_OFF.
llvm[disable-assertions]:x64-windows Build LLVM without assertions.
llvm[disable-assertions]:x86-windows Build LLVM without assertions.
llvm[disable-clang-static-analyzer]:x64-windows Build without static analyzer.
llvm[disable-clang-static-analyzer]:x86-windows Build without static analyzer.
llvm[enable-rtti]:x64-windows Build LLVM with run-time type information.
llvm[enable-rtti]:x86-windows Build LLVM with run-time type information.
llvm[lld]:x64-windows Build LLVM linker.
llvm[lld]:x86-windows Build LLVM linker.
llvm[target-all]:x64-windows Build with all backends.
llvm[target-all]:x86-windows Build with all backends.
llvm[tools]:x64-windows Build LLVM tools.
llvm[tools]:x86-windows Build LLVM tools.
openblas:x64-windows 0.3.9-1 OpenBLAS is an optimized BLAS library based on G...
opencl:x64-windows 2.2-2 C/C++ headers and ICD loader (Installable Client...
opengl:x64-windows 0.0-5 Open Graphics Library (OpenGL)[3][4][5] is a cro...
opengl:x86-windows 0.0-5 Open Graphics Library (OpenGL)[3][4][5] is a cro...
zlib:x64-windows 1.2.11-6 A compression library
zlib:x86-windows 1.2.11-6 A compression library
Is there anyone with expertise in this area who could take a look? It's an obscure corner case, but it's one that our project would still like to support.
Bumping this again. Can someone take a look and/or respond, please?
I notice this code does
auto builder = LI.getActionDefinitionsBuilder(TargetOpcode::G_PTRTOINT);
but every place in tree that does this has a '&' with the auto like this
auto &builder = LI.getActionDefinitionsBuilder(TargetOpcode::G_PTRTOINT);
getActionsDefinitionBuilder returns a reference but I think auto may not be
deducing a reference so its making a copy of the builder.
Craig, thanks for your response. Unfortunately, changing that line to read
"auto&" instead of plain "auto" had no effect.
This example was minimized from this call to getActionDefinitionsBuilder in
X86LegalizerInfo::setLegalizerInfo32bit.
----
if (!Subtarget.is64Bit()) {
getActionDefinitionsBuilder(G_PTRTOINT)
.legalForCartesianProduct({s1, s8, s16, s32}, {p0})
.maxScalar(0, s32)
.widenScalarToNextPow2(0, /*Min*/ 8);
----
I am not a C++ expert, but perhaps MSVC in Debug mode is preserving a copy that
is allowed by the standard but most other compilers optimize away?
Bumping this again. We might not be able to release for Win32 if we can't get to the bottom of this.
There are some interesting aspects of MSVC's x86-32 calling conventions that could explain these symptoms. I'm familiar with them, and I would investigate, but my Windows machine is out of commission at the moment.
If anyone else wants to experiment, try to avoid passing complex things (things containing SmallVector members) by value.
To clarify: this is LLVM 10 *only*, but not 11 or trunk, yes?
I have not tested with LLVM 11 or trunk yet.
I reproduced this and debugged this a bit. So far, this feels like an MSVC compiler bug.
This SmallVector<LLT, 4>
object is captured by value by a lambda, which is then moved into a std::function wrapper:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp#L25
LegalityPredicate
LegalityPredicates::typeInSet(unsigned TypeIdx,
std::initializer_list
I replaced the SmallVector with my own wrapper object with instrumented ctor/dtor operations:
struct Wrapper {
Wrapper(std::initializer_list
This produces the following logs from a single call to typeInSet:
Wrapper() 0x8d7a4c0 instances 1 copy Wrapper 0x8d7a490 from 0x8d7a4c0 instances 2 move Wrapper 0x8fec228 from 0x8d7a450 instances 3 ~Wrapper() 0x8d7a450 instances 2 ~Wrapper() 0x8d7a4c0 instances 1
Addresses starting with 0x8d... appear to be stack, and 0x8f... appears to be heap. So my reading of this log is:
Note that the source of the move operation is not the same as the address of the copied wrapper (0x8d7a450 vs 0x8d7a490). There are no other Wrapper constructor calls, so how did this address come into existence?
In the debugger, I put a watchpoint on the bytes of the copy-constructed Wrapper. The watchpoint stopped in the implementation of memcpy, confirming the theory that Wrapper is memcpy'd even though it is not a trivially copyable object.
So far as I can tell, this is not a standard library bug. This is the code for the std::function constructor: https://github.com/microsoft/STL/blob/master/stl/inc/functional#L1169 No copy operations to speak of.
I annotated the disassembly here:
0:000> uf 00dd6c87 - 5 - 42
llc!std::function<bool cdecl(llvm::LegalityQuery const &)>::function<bool cdecl(llvm::LegalityQuery const &)><
1125 00dd6c43 83ec08 sub esp,8 1125 00dd6c46 83e4f8 and esp,0FFFFFFF8h 1125 00dd6c49 83c404 add esp,4
1125 00dd6c4c 55 push ebp
1125 00dd6c4d 8b6b04 mov ebp,dword ptr [ebx+4] 1125 00dd6c50 896c2404 mov dword ptr [esp+4],ebp
1125 00dd6c54 8bec mov ebp,esp
1125 00dd6c56 83ec28 sub esp,28h
1125 00dd6c59 57 push edi
1125 00dd6c5a 51 push ecx 1125 00dd6c5b 8d7dd8 lea edi,[ebp-28h] 1125 00dd6c5e b90a000000 mov ecx,0Ah 1125 00dd6c63 b8cccccccc mov eax,0CCCCCCCCh 1125 00dd6c68 f3ab rep stos dword ptr es:[edi] 1125 00dd6c6a 59 pop ecx
this
in ECX1125 00dd6c6b a144b1d407 mov eax,dword ptr [llc!__security_cookie (07d4b144)] 1125 00dd6c70 33c5 xor eax,ebp 1125 00dd6c72 8945fc mov dword ptr [ebp-4],eax
1125 00dd6c75 894df8 mov dword ptr [ebp-8],ecx
this
/ECX to local variable slot on stack1125 00dd6c78 6a20 push 20h 1125 00dd6c7a 8b4308 mov eax,dword ptr [ebx+8] 1125 00dd6c7d 50 push eax 1125 00dd6c7e 8d4dd8 lea ecx,[ebp-28h] 1125 00dd6c81 51 push ecx 1125 00dd6c82 e8565a1604 call llc!memcpy (04f3c6dd) 1125 00dd6c87 83c40c add esp,0Ch
free
a stack address when we destroy this object.1125 00dd6c8a 8d55d8 lea edx,[ebp-28h] 1125 00dd6c8d 895308 mov dword ptr [ebx+8],edx 1125 00dd6c90 8b4df8 mov ecx,dword ptr [ebp-8] 1125 00dd6c93 e8e66ae9ff call llc!ILT+1361785(??0?$_Func_class_NABULegalityQueryllvmstdQAEXZ) (00c6d77e)
1126 00dd6c98 8b4308 mov eax,dword ptr [ebx+8]
1126 00dd6c9b 50 push eax
1126 00dd6c9c e85f2dad03 call llc!std::move<
1126 00dd6ca1 83c404 add esp,4
1126 00dd6ca4 50 push eax
1126 00dd6ca5 8b4df8 mov ecx,dword ptr [ebp-8]
1126 00dd6ca8 e8330dad03 call llc!std::_Func_class<bool,llvm::LegalityQuery const &>::_Reset<
I'll try to reduce it some more.
Here is a reduced bug for the MSVC compiler:
struct Wrapper { Wrapper() : self(this) {} Wrapper(Wrapper &&o) { self = this; } Wrapper(const Wrapper &o) { self = this; } ~Wrapper() { assert(self == this); } __declspec(align(8)) Wrapper *self = nullptr; int ballast[8]; // Increase size to use memcpy }; void receiveWrapper(Wrapper w) { printf("w.self %p\n", w.self); } int main() { receiveWrapper(Wrapper()); }
The Wrapper object models the SmallVector interior object pointer. It tracks its identity with the self
field.
receiveWrapper has a similar prologue to the std::function constructor I annotated in the last comment:
?receiveWrapper@@YAXUWrapper@@@Z (void __cdecl receiveWrapper(struct Wrapper)): 00000000: 53 push ebx 00000001: 8B DC mov ebx,esp 00000003: 83 EC 08 sub esp,8 00000006: 83 E4 F8 and esp,0FFFFFFF8h 00000009: 83 C4 04 add esp,4 0000000C: 55 push ebp 0000000D: 8B 6B 04 mov ebp,dword ptr [ebx+4] 00000010: 89 6C 24 04 mov dword ptr [esp+4],ebp 00000014: 8B EC mov ebp,esp 00000016: 83 EC 28 sub esp,28h 00000019: 6A 28 push 28h 0000001B: 8B 43 08 mov eax,dword ptr [ebx+8] 0000001E: 50 push eax 0000001F: 8D 4D D8 lea ecx,[ebp-28h] 00000022: 51 push ecx 00000023: E8 00 00 00 00 call _memcpy 00000028: 83 C4 0C add esp,0Ch ...
This prologue basically realigns the stack (see the declspec align 8) and copies the bytes of all of the parameters into the new space, along with the return address and EBP to set up a fake stack frame so standard frame pointer stack walks work well.
And, here's the assertion failure:
$ cl -GS- t.cpp && ./t.exe Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29111 for x86 Copyright (C) Microsoft Corporation. All rights reserved.
t.cpp Microsoft (R) Incremental Linker Version 14.27.29111.0 Copyright (C) Microsoft Corporation. All rights reserved.
/out:t.exe t.obj w.self 00FEF8A8 Assertion failed: self == this, file t.cpp, line 8
So, it's not our bug. We could workaround it by force enabling optimization on this particular file.
I did my best to write up the issue here:
https://developercommunity.visualstudio.com/content/problem/1179643/msvc-copies-overaligned-non-trivially-copyable-par.html
Landed a workaround in 4e3edef4b8b637c0c76897497eb7c66f00157210.
_Bug 48054 has been marked as a duplicate of this bug._
_Bug 49097 has been marked as a duplicate of this bug._
CMakeLists.txt
(290 bytes, text/plain)main.cpp
(383 bytes, text/x-csrc)(Reporting on behalf of Alex Reinking, who found this while debugging Halide:)
Overview:
There is an apparently 100% reproducible crash when using LLVM 10 build using MSVC 2017 or 2019, for Windows x86 (32-bit), Debug mode only.
The crash appears due to a double-free in the destructor of SmallVector, perhaps due to a bug in the copy/move/operator= operations of SmallVector. In particular, the object whose destructor triggers the crash is a lambda capture-by-value clone of a small-vector that was initially created on the stack.
The stack trace looks like:
In the debugger output window we see:
Invalid address specified to RtlValidateHeap( 02150000, 01B5ECC8 )
, which looks a lot like it's trying to free a stack address. The 2nd argument is the pointer passed to free (stored in SmallVector's BeginX field).Steps To Repeat:
Note that Visual Studio 16.6.x (the latest) exposes a bug in LLVM that has now been patched, but prevents it from compiling. The instructions below use Visual Studio 15.9.23 (2017) instead.
Without vcpkg
Get and compile LLVM:
With vcpkg
Compile with:
Compiler info:
CMake info:
vcpkg list (only x86-windows is relevant):