Open BenCStokes opened 1 year ago
Not sure about the UB rule here, but the reason x+=1 doesn't happen is I think that OPT (which notices the x 'uninit' value) is getting confused by the integer promotion.
If you instead replace that with a 'unsigned int', you get unreachable.
By my reading of the C standard, this is unspecified behavior (so a different U than the usual meaning for UB). All references below are to C2x N3047.
6.2.4p6: ... The initial representation of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the representation of the object becomes indeterminate each time the declaration is reached.
also 6.7.10p11: If an object that has automatic storage duration is initialized with an empty initializer, its value is the same as the initialization of a static storage duration object. Otherwise, if an object that has automatic storage duration is not initialized explicitly, its representation is indeterminate. ...
3.19.2: indeterminate representation object representation that either represents an unspecified value or is a non-value representation
3.19.3: unspecified value valid value of the relevant type where this document imposes no requirements on which value is chosen in any instance
3.19.4: non-value representation an object representation that does not represent a value of the object type
6.2.6.1p6: (does not apply because "does not have character type") Certain object representations need not represent a value of the object type. If such a representation is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined55). Such a representation is called a non-value representation.
This leaves "unspecified value: as the only possible indeterminate representation, so: 3.4.4: unspecified behavior behavior, that results from the use of an unspecified value, or other behavior upon which this document provides two or more possibilities and imposes no further requirements on which is chosen in any instance
However, it's very much worth noting that DR260 and DR451 are both involved and that's somewhat of an open area of discussion (around "wobbly values") and TS 6010 is relevant, so I would caution against making changes until it's more clear what direction the C committee is going.
As an extra note C++ makes this undefined behavior as per basic.indet p2, it carves out some exceptions for unsigned char
and std::byte
to produces indeterminate values in some cases but not this one.
@llvm/issue-subscribers-c
Clang, with -O1 or higher, seems to think the following C code invokes undefined behaviour. The IR output given by
clang -std=c17 -O1 -emit-llvm -S
hasmain
as justunreachable
.My understanding is that this is not UB under C17 because the address of
x
is taken andunsigned char
can't have trap representations, so the value ofx
is unspecified until the first increment. Interestingly, if you replace++x
withx+=1
, Clang compiles the code correctly, even though these are equivalent according to the standard. The same bug happens withx++
,--x
orx--
instead of++x
.