comex closed this 8 years ago
Sorry, hasty conclusion in the deleted comment. Why can't compilers optimize when one wants them to?
On architectures where NULL or 0.0 are not represented with all bits zero, it can be impossible to initialize both the second and the third member of the union, even in the area where they both go beyond the width of the first member. Consider, on such a weird architecture:
{
union { short a; float f[100]; int i[100]; void *p[100]; } u = {0};
}
Which member should win the privilege of being initialized? The standard says `a`. It cannot mean that `f`, `i`, `p` are initialized because they cannot all be initialized.
On the examples I tried at gcc.godbolt.org, it seems that both GCC and Clang set the entire union to zero (and that happens to initialize all members of the union since the representations of NULL and 0.0 coincide on the architectures they target). I'm not sure this is something you want to rely on.
There is a good argument to be made that the bits after `a` should be set to 0 as padding in C11. However the words that can be interpreted this way were one of a few discrete changes from C99 to C11 (C99 does not mention setting padding bits to 0), so you'd still want to treat this padding as uninitialized in case the code ever gets compiled with a C99 compiler.
In fact the sentence “if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;” in 6.7.9:10 strongly suggests that the bits of `f`, `i` and `p` that go beyond the width of `a` are padding. What padding would we be talking about otherwise?
Also examples by Alexander Cherepanov show that padding is not very stable. It's one of the many little inconsistencies in the C standard(s): the committee added explicit words in C11 to say what happens to padding at initialization, with the rationale that `memcmp` will work better on structs with padding, but other words in the standard or in DRs can already be interpreted as saying that padding bits change freely without any action of the program. So what use is the new guarantee in C11 supposed to be?
Ref: https://github.com/TrustInSoft/tis-interpreter/issues/101#issuecomment-223332754 https://twitter.com/ch3root/status/742358182891257856
I have just played with initializers for unions several days ago (as related to the question "What is the value of a union?") and it seems that this area is underspecified in C11. If `c` is considered a subobject then it should be possible to initialize it (by using a designator) at the same time as initializing `a`, right? The example below shows that gcc and clang don't permit it. Not sure what is the right approach. In tis-interpreter it seems safer to assume that anything not explicitly initialized is indeterminate.
Somewhat related:
Source code:
#include <stdio.h>
int main() {
union { int a; struct { int b; int c; }; } u = {.c = 2, .a = 1};
printf("%d\n", u.c);
}
tis-interpreter (31be1ffdb350ea940095be4757d0d5779c38f10b) output:
test.c:4:[kernel] failure: Cannot find designated field c
[kernel] user error: stopping on file "test.c" that has errors. Add '-kernel-msg-key pp'
for preprocessing command.
[kernel] Frama-C aborted: invalid user input.
gcc (GCC) 7.0.0 20160616 (experimental):
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 -fsanitize=undefined test.c && ./a.out
test.c: In function ‘main’:
test.c:4:64: warning: initialized field overwritten [-Woverride-init]
union { int a; struct { int b; int c; }; } u = {.c = 2, .a = 1};
^
test.c:4:64: note: (near initialization for ‘u.a’)
0
clang version 3.9.0 (trunk 271312):
$ clang -std=c11 -Weverything -O3 -fsanitize=undefined test.c && ./a.out
test.c:4:60: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides]
union { int a; struct { int b; int c; }; } u = {.c = 2, .a = 1};
~^
test.c:4:51: note: previous initialization is here
union { int a; struct { int b; int c; }; } u = {.c = 2, .a = 1};
^~~~~~
1 warning generated.
0
tis-interpreter doesn't think so:
I suppose this paragraph suggests not: (C11 6.2.6.1:6)
But then: (C11 6.7.9:19)
Doesn't `c` in my example count as a subobject that should be initialized implicitly?
(I ran into this warning with some code of mine that was buggy and wasn't supposed to be doing anything of the sort - the union members were supposed to be the same size. tis-interpreter found the bug, but I'm not sure it actually should have...)