Open Halalaluyafail3 opened 4 months ago
@llvm/issue-subscribers-clang-frontend
Author: None (Halalaluyafail3)
CC @AaronBallman
The return
statement in foo()
is what determines the value that is returned, and that is an rvalue (C23 6.8.7.5p3) of the unnamed structure type, which is unqualified. So if we're doing anything wrong, I think it has more to do with the .
operator. C23 6.5.3.4p3 says:
A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member,93) and is an lvalue if the first expression is an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type of the designated member.
So the member named here is not an lvalue because the returned structure object is not an lvalue. However, "if the first expression has qualified type" can be read to apply here and Clang's behavior is correct. I took a look at what other C compilers do and the results are wonderful:
Clang, Chibicc, EDG: do not strip the qualifier GCC, MSVC, TCC, CompCert: strips the qualifier SDCC: claims there are multiple matching associations
So I'm going to ask on the WG14 reflectors to see what folks think.
I lied, I didn't need to go ask WG14. _Generic's controlling operand behaves as though lvalue conversion happens and that needs to drop the qualifiers anyway.
I lied again and have asked on WG14 because I think this is a standards wording defect or I'm reading the standard wrong. lvalue conversion only applies to lvalues, so if we do get an rvalue from the member access expression, then the lvalue conversion wouldn't drop qualifiers because no conversion occurs.
https://godbolt.org/z/W64hxa4jj shows the compiler divergence for this.
Interestingly, it seems that all compilers agree that with an array member qualifiers should not be removed in this situation (after decay it is no longer a qualified rvalue, but it being a qualified rvalue can still be observed with typeof). Also, I have discovered that GCC does not remove qualifiers from the result of .
with an rvalue left operand in situations where lvalue conversions do not apply. That is, using the example typeof(foo().c)
is the type const int
but typeof(0,foo().c)
is the type int
. I'm thinking about making a bug report to GCC for that since it seems strange for an rvalue to undergo lvalue conversions.
Committee discussion on the WG14 reflectors suggests a few things: 1) qualified rvalues are a thing in C and that was a benign thing largely because qualifiers were never observable until _Generic
, but now means we need more clarity in the standard; 2) _Generic
should still strip those qualifiers even if it gets an rvalue. (So the typeof
behavior observing the qualifiers on the rvalue is correct.)
If typeof can be used to observe the qualifiers and qualified rvalues exist, then that creates situations like the following:
struct{
const int c;
}foo();
typeof(+foo().c)x;
x=1;
Here I see three possible results:
+
operator is not an arithmetic type (it is a qualified arithmetic type). Strictly reading the standard, I think this would be correct interpretation.x=1
is assigning to a const object. This is the interpretation that Clang uses.x
here is int
. This is the interpretation that GCC uses.I don't think that program violates a constraint. C23 6.2.5p31:
... The qualified or unqualified versions of a type are distinct types that belong to the same type category and have the same representation and alignment requirements. ...
I think Clang's handling of that is correct because the function returns an rvalue, the .
operator yields an lvalue of type const int
, and the unary +
operator only does integer promotions (which don't include lvalue conversion but could change the type) so the result is still has type const int
, and that is the operand to typeof
, which does not undergo lvalue conversion.
A slightly different example would be:
struct{
const short c;
}foo();
typeof(+foo().c)x;
x=1;
where I believe you would generally get int
and not const short
because of integer promotions:
If the original type is not a bit-precise integer type (6.2.5): if an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int;50) otherwise, it is converted to an unsigned int.
And... Clang seems to be doing exactly that: https://godbolt.org/z/ndsjo9E7Y
I don't think that program violates a constraint. C23 6.2.5p31:
... The qualified or unqualified versions of a type are distinct types that belong to the same type category and have the same representation and alignment requirements. ...
The term 'type category' is defined as such:
A type is characterized by its type category, which is either the outermost derivation of a derived type (as noted previously in this subclause in the construction of derived types), or the type itself if the type consists of no derived types.
Section 6.2.5 "Types" Paragraph 30 N3220
From how I read this, the type category of const int
is int
. I don't see how that makes const int
included in arithmetic types. Additionally, the standard uses the the term 'qualified or unqualified character type' (in the definition of va_arg) which wouldn't make sense if this was the case as it would be redundant.
I think Clang's handling of that is correct because the function returns an rvalue, the . operator yields an lvalue of type const int, and the unary + operator only does integer promotions (which don't include lvalue conversion but could change the type) so the result is still has type const int, and that is the operand to typeof, which does not undergo lvalue conversion.
I assume you mean that the .
operator yields an rvalue here, if it didn't then you could take the address of it which Clang doesn't allow. Following your line of thinking that const int
is included in arithmetic types, and that integer promotions don't change the type so that it remains a qualified rvalue; shouldn't the following also be invalid?
struct{
const int c;
}foo();
typeof(foo().c+foo().c)x;
x=1;
The types of the operands to the +
(binary) operator here are both const int
and the usual arithmetic conversions are applied.
Otherwise, if any of the two types is an enumeration, it is converted to its underlying type. Then, the integer promotions are performed on both operands. Next, the following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Section 6.3.1.8 "Usual arithmetic conversions" Paragraph 1 N3220
Using this, the result of the addition should be const int
yet Clang results in int
anyways. Also, the result of foo().c+0
would be unclear.
From how I read this, the type category of
const int
isint
. I don't see how that makesconst int
included in arithmetic types. Additionally, the standard uses the the term 'qualified or unqualified character type' (in the definition of va_arg) which wouldn't make sense if this was the case as it would be redundant.
Can you imagine an interpretation where int
is an integer type and const int
is not an integer type? However, I think we're off in the weeds -- the standard is very unclear on these points, and that's the whole point to this issue.
I assume you mean that the . operator yields an rvalue here.
Yes, sorry for the confusion!
Using this, the result of the addition should be const int yet Clang results in int anyways. Also, the result of foo().c+0 would be unclear.
I get the same interpretation as you do.
IMO, this topic requires a WG14 paper to tease out what we want the standard to say. The author of the typeof
proposal was clearly looking for typeof(foo().c)
to be const int
according to discussions both on and off list, and that matches the intent of the paper. The point to typeof
is "I want the declared type of the thing being inspected" while typeof_unqual
is "I want the non-atomic, unqualified type of the thing being inspected", but there's obviously a lot of edge cases involved. Another example would be C's adjustments to function types: https://godbolt.org/z/dzeWMz563 (which Clang doesn't correctly handle currently: https://github.com/llvm/llvm-project/issues/39494)
IMO, this topic requires a WG14 paper to tease out what we want the standard to say. The author of the
typeof
proposal was clearly looking fortypeof(foo().c)
to beconst int
according to discussions both on and off list, and that matches the intent of the paper. The point totypeof
is "I want the declared type of the thing being inspected" whiletypeof_unqual
is "I want the non-atomic, unqualified type of the thing being inspected", but there's obviously a lot of edge cases involved. Another example would be C's adjustments to function types: https://godbolt.org/z/dzeWMz563 (which Clang doesn't correctly handle currently: #39494)
I think a good solution would be to convert to the type after lvalue conversions, if the expression was used in a context where lvalue conversions would apply had the expression been an lvalue instead. Similar to your proposed solution for this with _Generic but more generally. This would still allow typeof(foo().c)
to be const int
, applying .
again to propagate qualifiers, and sizeof
to properly determine the size of atomic members while not requiring consideration of how qualifiers pass through operators which I don't think was intended. Consider the case of atomics:
struct{
_Atomic int c;
}foo(),bar;
+bar.c;//OK, lvalue conversions remove _Atomic
+foo().c;//Clang rejects
Clang rejects this because +
requires an arithmetic type and _Atomic int
is not considered an arithmetic type. Having this be valid on an lvalue but not on a non-lvalue expression doesn't really make much sense (this situation also happens with casts and function calls which result in atomic types as well).
The following code demonstrates the issue:
foo
returns a structure that contains a qualified member namedc
of typeconst int
. Callingfoo
results in an rvalue structure, which when accessing one of the members via.
results in an rvalue. Clang does not remove qualifiers or _Atomic (C23 removes _Atomic from rvalues as well, and GCC does this in earlier versions as well) here while GCC does and the result of DR 423 seems intended to remove cases of qualified rvalues. Additionally, Clang actually generates a warning here with the default warnings stating that theconst int
branch will never be taken (and then it takes it anyway). The wording doesn't state that the result is the non-atomic (C23) and unqualified member when getting the member though I assume that is a defect.