Quuxplusone / LLVMBugzillaTest

-fno-zero-initialized-in-bss -fno-common and tentative definitions #30598

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR31625
Status NEW
Importance P normal
Reported by hstong@ca.ibm.com
Reported on 2017-01-12 22:25:41 -0800
Last modified on 2019-01-29 08:08:25 -0800
Version trunk
Hardware PC Linux
CC erich.keane@intel.com, llvm-bugs@lists.llvm.org, neeilans@live.com, richard-llvm@metafoo.co.uk
Clang's support for -fno-zero-initialized-in-bss does not match GCC's.
In particular, when using -fno-common, tentative definitions are still placed
by GCC into BSS.

Online compiler:
http://melpon.org/wandbox/permlink/76ugV7cPaxxqjIMl

### Source (<stdin>):
int x;

### Compiler invocation:
clang -c -o a.o -x c -fno-common -fno-zero-initialized-in-bss -

### Additional commands:
objdump -wt a.o | grep -P '\b''x\b'

### Expected output:
0000000000000000 g     O .bss   0000000000000004 x

### Actual output:
0000000000000000 g     O .data  0000000000000004 x

### clang -v:
clang version 4.0.0 (trunk 290110)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/llvm-head/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.3
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64

Quuxplusone commented 5 years ago
This still exists in trunk today.  I believe part of the problem is that IR
CodeGen doesn't differentiate between an initialized and an uninitialized
variable.
See: https://godbolt.org/z/vnhOMG

Note that:
@i_uninit = dso_local global i32 0, align 4, !dbg !0
@i_init_zero = dso_local global i32 0, align 4, !dbg !6

BOTH are i32 0, despite one being initialized and one not.
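
For reference, a source file along the following lines would produce two globals like the ones above. This is a minimal sketch: only the variable names are taken from the IR, and the exact source behind the Godbolt link is an assumption.

```c
/* Assumed source for the IR shown above; only the names come from the IR. */

/* Tentative definition in C: no initializer appears in the source text. */
int i_uninit;

/* Ordinary definition whose explicit initializer happens to be zero. */
int i_init_zero = 0;
```

Compiling this with something like `clang -x c -fno-common -S -emit-llvm` should show both globals with an `i32 0` initializer, which is why later stages cannot tell the two cases apart.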

CodeGenModule.cpp (CodeGenModule::EmitGlobalVarDefinition) seems to do this
intentionally (see the else if(!InitExpr) condition, ~3493).

The comment claims that this is intentional.  It seems to me that we could
replace the Init = line in that branch with
llvm::UndefValue::get(D->getType()->getTypePtr());, however I'm not sure of the
full consequences of that.

Additionally, some LLVM work would need to be done to decide whether a variable
goes into BSS based on its initialization status.

Does anyone familiar with this code have guidance that they can give?  The list
of test failures from the changes suggested above (just for the Clang side) is
pretty massive, but I'm OK working through them if we believe this is the right
thing.

Quuxplusone commented 5 years ago
Based on my reading of:
http://eel.is/c++draft/dcl.init#10 and http://eel.is/c++draft/basic.stc.static

I think my version above is illegal, right?  We presumably need some way to
identify which variables are "initialized in text" vs "initialized by rule" in
this case.

Quuxplusone commented 5 years ago
(In reply to Erich Keane from comment #2)
> Based on my reading of:
> http://eel.is/c++draft/dcl.init#10 and
> http://eel.is/c++draft/basic.stc.static
>
> I think my version above is illegal, right?  We presumably need some way to
> identify which variables are "initialized in text" vs "initialized by rule"
> in this case.
Yes, zero-initialization produces defined values that can be inspected.
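
To illustrate why an undef initializer would not be conforming, here is a small sketch in C. The names `counter`, `flags`, and `all_zero` are hypothetical and do not come from the bug report. Objects with static storage duration and no initializer are zero-initialized by rule, so a program may read them and rely on seeing zero.

```c
/* Static-storage objects with no initializer in the source text.
   The language zero-initializes them before program startup,
   so reading them is well defined. */
static int counter;
static int flags[4];

/* Returns 1 if every implicitly initialized object reads as zero, else 0. */
int all_zero(void)
{
    if (counter != 0)
        return 0;
    for (int i = 0; i < 4; ++i) {
        if (flags[i] != 0)
            return 0;
    }
    return 1;
}
```

A conforming implementation has to make `all_zero()` return 1 here, which an undef initial value would no longer guarantee.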

As for the semantics, the "initialized by rule" portion seems to specifically
be the zero-initialization that is performed by [basic.start.static] in the
absence of constant initialization. An object that needs dynamic initialization
goes into BSS (if all bytes should be zero as the result of zero-
initialization).

All other constant initialization (including zero-initializing as part of value-
initialization) is not considered for BSS when -fno-zero-initialized-in-bss is
in effect.
Quuxplusone commented 5 years ago

Alright, thanks for the confirmation.

I'll discuss with my LLVM folks how we can communicate the difference between "initialized in text" vs "initialized by rule". At the moment, I don't see a way to do so in IR.