llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Does changing the order of members in the union affect the actual memory content of the union? #107152

Open edumoot opened 2 weeks ago

edumoot commented 2 weeks ago

The test case comes from #107144.

I have a question: in the context of C, would changing the order of elements in a union impact the memory content of the union?

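In a union,

- the size of the union is determined by its largest member (plus any padding needed for alignment).
- all members share the same memory space.

Can we conclude that the relative memory layout of the members doesn't change based on the order of definition? (The source code in Godbolt1 was used to generate 402_O3.out.)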

clang -g -O3 -o 402_O3.out 402.c

(lldb) file 402_O3.out
(lldb) b 38
(lldb) r
[...]
(lldb) p global_union
(volatile U3) {
  f0 = (f0 = -1, f1 = 8313)
  f1 = -1
}
(lldb) mem read &global_union
0x555555558030: ff ff ff ff 79 20 00 00 00 00 00 00 00 00 00 00  ....y ..........
0x555555558040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
(lldb) 

Now we swap the order of the members in the union, keep everything else the same, and save the result as 'test.c' (the source code in Godbolt2, used to produce test_O3.out):

union U3 {
   signed f1 ;
   struct S0 f0;  
};

.
.
.

clang -g -O3 -o test_O3.out test.c

(lldb) file test_O3.out
(lldb) b 38
(lldb) r
[...]
(lldb) p global_union
(volatile U3) {
  f1 = -1
  f0 = (f0 = -1, f1 = 0)
}
(lldb) mem read &global_union
0x555555558030: ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x555555558040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

The last two bytes of the struct, 79 20 (0x2079 = 8313, little-endian), changed to 00 00.

edumoot commented 2 weeks ago

In GDB, the memory content appears identical (the x command below only shows the first 4 bytes), but the displayed values differ.

(gdb) file 402_O3.out
(gdb) b 38
(gdb) r
[...]
(gdb) p global_union
$2 = {f0 = {f0 = -1, f1 = 8313}, f1 = -1}
(gdb) x &global_union
0x555555558030 <global_union>:  0xffffffff

(gdb) file  test_O3.out
(gdb) b 38
(gdb) r
[...]
(gdb) p global_union
$1 = {f1 = -1, f0 = {f0 = -1, f1 = 0}}
(gdb) x &global_union
0x555555558030 <global_union>:  0xffffffff

llvmbot commented 2 weeks ago

@llvm/issue-subscribers-clang-codegen

Author: Yachao Zhu (edumoot)

shafik commented 2 weeks ago

Can you provide a minimal reproducer in the form of two godbolt links, one for each example? Mainly to see the initialization, and whether we can observe this when printing the value or whether it only shows up in the debugger.

edumoot commented 2 weeks ago

Thank you, @shafik, for the reminder. It's updated. The only difference in the source code is that line 7 (struct S0 f0;) and line 8 (signed f1 ;) have been swapped, with everything else remaining the same.

Inspecting Godbolt1, we get @global_union = internal global %union.U3 { %struct.S0 { i32 -1, i16 8313 } }, align 4 for union U3 { struct S0 f0; signed f1 ; };

For Godbolt2, it gives "error: excess elements in scalar initializer", but we can still get @global_union = internal global { i32, [4 x i8] } { i32 -1, [4 x i8] undef }, align 4 for union U3 { signed f1 ; struct S0 f0; }; if we compile it directly in LLVM context.