llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.12k stars 12.01k forks source link

[DebugInfo@O2][Dexter] Misleading watch value of aggregate member at -O2 optimisation. #38112

Open llvmbot opened 6 years ago

llvmbot commented 6 years ago
Bugzilla Link 38764
Version trunk
OS Linux
Blocks llvm/llvm-project#38116
Reporter LLVM Bugzilla Contributor
CC @chrisjbris,@gregbedwell,@MaskRay,@CarlosAlbertoEnciso,@jmorse,@OCHyams,@pogo59

Extended Description

When the following example:


1 struct SROAThis { 2 SROAThis(int n_a, int n_b, int n_c, int n_d) 3 : a(n_a), b(n_b), c(n_c), d(n_d){} 4 5 int a, b, c, d; 6 }; 7 8 int doesSroaStuff(SROAThis & aggr) { 9 aggr.a += 1;
10 aggr.b += 2;
11 aggr.c += 3;
12 aggr.d += 4;
13 aggr.a += aggr.b + aggr.c + aggr.d;
14 return aggr.a;
15 } 11 12 int main(int argc, const char ** argv) { 13 SROAThis aggr(argc, argc, argc, argc); 14 15 auto a = doesSroaStuff(aggr); 16 17 return a; 18 }


is compiled with the following command:

clang test.cpp -o test.elf -g -O2 -fno-inline

The line table entries/debug locations for doesSroaStuff no longer accurately reflect the source.

The disassembly for doesSroaStuff shows that the constant additions at the start of the function have been interleaved with movs, leas and adds:


// from objdump -M intel -d ./test.elf

0000000000400480 <_Z13doesSroaStuffR8SROAThis>: 400480: 8b 47 04 mov eax,DWORD PTR [rdi+0x4] 400483: 8b 4f 08 mov ecx,DWORD PTR [rdi+0x8] 400486: 83 c0 02 add eax,0x2 400489: 8b 17 mov edx,DWORD PTR [rdi] 40048b: 01 c2 add edx,eax 40048d: 89 47 04 mov DWORD PTR [rdi+0x4],eax 400490: 8d 41 03 lea eax,[rcx+0x3] 400493: 89 47 08 mov DWORD PTR [rdi+0x8],eax 400496: 8b 47 0c mov eax,DWORD PTR [rdi+0xc] 400499: 8d 70 04 lea esi,[rax+0x4] 40049c: 89 77 0c mov DWORD PTR [rdi+0xc],esi 40049f: 8d 0c 11 lea ecx,[rcx+rdx1] 4004a2: 83 c1 03 add ecx,0x3 4004a5: 8d 04 08 lea eax,[rax+rcx1] 4004a8: 83 c0 05 add eax,0x5 4004ab: 89 07 mov DWORD PTR [rdi],eax 4004ad: c3 ret


A line table dump from llvm-dwarfdump shows the following:


0x0000000000400480 8 0 1 0 0 is_stmt 0x0000000000400480 10 10 1 0 0 is_stmt prologue_end 0x0000000000400483 11 10 1 0 0 is_stmt 0x0000000000400486 10 10 1 0 0 is_stmt ---------------------------------------------------------------------------- if we step to line 11 we will only have executed the following load instruction: 400480: 8b 47 04 mov eax,DWORD PTR [rdi+0x4] placing our source mapping at the following line 8 int doesSroaStuff(SROAThis & aggr) { 9 aggr.a += 1; 10 aggr.b += 2; 11 aggr.c += 3; < < < however, no additions have taken place. When debugging using LLDB, I get the following watch values for aggr.a and aggr.b at line 11: ---------------------------------------------------------------------------- Watchpoint 1: addr = 0x7fffffffe4a0 size = 4 state = enabled type = w declare @ '/home/clangbox/dev/temp/test/./sroa_temp.cpp:8' watchpoint spec = 'aggr.a' new value: 1 Watchpoint 2: addr = 0x7fffffffe4a4 size = 4 state = enabled type = w declare @ '/home/clangbox/dev/temp/test/./sroa_temp.cpp:8' watchpoint spec = 'aggr.b' new value: 1 ---------------------------------------------------------------------------- The user should expect to see aggr.a = 2 and aggr.b = 3 if the program is run with no options. If re-implementing this example using none-aggregates but the same structure, such as : ---------------------------------------------------------------------------- int doesSroaStuff(int & a, int & b, int & c, int & d) { a += 1; b += 2; c += 3; d += 4; a += b + c + d; return a; } int main(int argc, const char ** argv) { auto a = doesSroaStuff(argc, argc, argc, argc); return a; } ---------------------------------------------------------------------------- Then doesSroaStuff has the adds in the "correct" order despite being optimised. 0000000000400480 <_Z13doesSroaStuffRiS_S_S_>: 400480: 83 07 01 add DWORD PTR [rdi],0x1 400483: 83 06 02 add DWORD PTR [rsi],0x2 400486: 83 02 03 add DWORD PTR [rdx],0x3 400489: 8b 01 mov eax,DWORD PTR [rcx] 40048b: 83 c0 04 add eax,0x4 40048e: 89 01 mov DWORD PTR [rcx],eax 400490: 03 06 add eax,DWORD PTR [rsi] 400492: 03 02 add eax,DWORD PTR [rdx] 400494: 03 07 add eax,DWORD PTR [rdi] 400496: 89 07 mov DWORD PTR [rdi],eax 400498: c3 ret ----------------------------------------------------------------------------
jmorse commented 2 years ago

mentioned in issue llvm/llvm-project#38116

jmorse commented 3 years ago

I think we didn't realise it at the time, but the root of the problem is bug 34136. Eliminating dead stores to memory fundamentally re-orders the appearance of assignments, leading to misleading values in existing variable locations. Seeing how the first assignment to the 'a' field in comment 2 gets DSE'd, there's not much we can do about this. Most recent discussion on how to fix this:

https://lists.llvm.org/pipermail/llvm-dev/2020-October/145697.html

Although that's more about tracking what happens when stores get deleted, rather than how to represent it in the output.

Another facet of this is jumpy line tables -- Toms original reproducer featured a step backwards. Happily with trunk and the new pass manager, we only have steps forwards.

MaskRay commented 3 years ago

For the code in comment 2,

clang -O2 a.cc -g -gdwarf-5 -fno-legacy-pass-manager -fno-inline -o a

Address Line Column File ISA Discriminator Flags


0x0000000000401110 8 0 0 0 0 is_stmt 0x0000000000401110 9 10 0 0 0 is_stmt prologue_end 0x0000000000401112 10 10 0 0 0 is_stmt 0x000000000040111b 11 10 0 0 0 is_stmt 0x0000000000401122 12 3 0 0 0 is_stmt 0x0000000000401130 15 0 0 0 0 is_stmt 0x000000000040113a 16 5 0 0 0 is_stmt prologue_end 0x000000000040114c 17 13 0 0 0 is_stmt 0x0000000000401154 18 3 0 0 0 is_stmt 0x000000000040115a 18 3 0 0 0 is_stmt end_sequence 0x0000000000401160 2 0 0 0 0 is_stmt 0x0000000000401160 2 23 0 0 0 is_stmt prologue_end 0x0000000000401162 2 30 0 0 0 0x0000000000401165 2 37 0 0 0 0x0000000000401166 2 37 0 0 0 end_sequence

0000000000401110 <_Z3fooR1S>: 401110: 8b 07 movl (%rdi), %eax 401112: 8b 4f 04 movl 4(%rdi), %ecx 401115: 8d 51 44 leal 68(%rcx), %edx 401118: 89 57 04 movl %edx, 4(%rdi) 40111b: 01 c8 addl %ecx, %eax 40111d: 83 c0 77 addl $119, %eax 401120: 89 07 movl %eax, (%rdi) 401122: c3 retq 401123: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax) 40112d: 0f 1f 00 nopl (%rax)

DSEPass drops the store instructions. Then would will be a better representation making values less misleading?

CarlosAlbertoEnciso commented 5 years ago

IR Dump After Dead Store Elimination

define i32 @​_Z3fooR1S(%struct.S %aggr) { entry: ... %a = getelementptr inbounds %struct.S, %struct.S %aggr, i64 0, i32 0, !dbg !​24 %0 = load i32, i32* %a, align 4, !dbg !​24, !tbaa !​25 %add = add nsw i32 %0, 33, !dbg !​24

%b = getelementptr inbounds %struct.S, %struct.S %aggr, i64 0, i32 1, !dbg !​30 %1 = load i32, i32 %b, align 4, !dbg !​30, !tbaa !​31 %add1 = add nsw i32 %1, 44, !dbg !​30 store i32 %add1, i32* %b, align 4, !dbg !​30, !tbaa !​31 ... }

And 'a' is only updated after 'b' is updated before returning.

%add4 = add nsw i32 %add1, %add, !dbg !​32 store i32 %add4, i32* %a, align 4, !dbg !​32, !tbaa !​25 ret i32 %add4, !dbg !​33

CarlosAlbertoEnciso commented 5 years ago

IR Dump Before Dead Store Elimination

define i32 @​_Z3fooR1S(%struct.S %aggr) { entry: ... %a = getelementptr inbounds %struct.S, %struct.S %aggr, i64 0, i32 0, !dbg !​24 %0 = load i32, i32 %a, align 4, !dbg !​24, !tbaa !​25 %add = add nsw i32 %0, 33, !dbg !​24 store i32 %add, i32 %a, align 4, !dbg !​24, !tbaa !​25

%b = getelementptr inbounds %struct.S, %struct.S %aggr, i64 0, i32 1, !dbg !​30 %1 = load i32, i32 %b, align 4, !dbg !​30, !tbaa !​31 %add1 = add nsw i32 %1, 44, !dbg !​30 store i32 %add1, i32* %b, align 4, !dbg !​30, !tbaa !​31 ... }

Before DSE, it is clear that 'a' gets updated.

But during DSE:

DEAD: store i32 %add, i32* %a, align 4, !dbg !​24, !tbaa !​25

CarlosAlbertoEnciso commented 5 years ago

The following modified test case reproduces the issue described in comment 0

-------->8-------- struct S { S(int pa, int pb) : a(pa), b(pb) {}

int a; int b; };

int foo(S &aggr) { aggr.a += 0x33; aggr.b += 0x44; aggr.a += aggr.b; return aggr.a; }

int main() { S aggr(0x11, 0x22); auto af = foo(aggr); return af; } --------8<--------

llvmbot commented 6 years ago

This was found using https://github.com/SNSystems/dexter

llvmbot commented 6 years ago

assigned to @CarlosAlbertoEnciso