llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[clang] compound literals within local structs should not require constant initialization #107562

Open ghost opened 1 week ago

ghost commented 1 week ago

Compiler Explorer

// [clang] error: initializer element is not a compile-time constant
// [ gcc ] ok

/* const */ size_t i;

         static_assert (!! sizeof ((size_t) {i}));    // OK
struct { static_assert (!! sizeof ((size_t) {i})); }; // KO
llvmbot commented 1 week ago

@llvm/issue-subscribers-clang-frontend

Author: Sofian Touhami (swordkill)

**Hi Humans,**

> Last year I shared an issue related to VLA and Blocks (extension): https://github.com/llvm/llvm-project/issues/63412

I come back with one more embarrassing one, related either to C99 compound literals, C11 _Generic, or _Static_assert.

Here is a reproduction: https://godbolt.org/z/9dTa4YP4n

### The purpose is to be able to use static assertion within an expression.

```c
/*
** Using -std=c23.
**
** The trick is that static_assert is guaranteed by the standard
** to not produce anything:
**
**   - https://en.cppreference.com/w/c/language/_Static_assert
**
**     """Otherwise,
**        if expression does not equal zero, nothing happens;
**        no code is emitted.
**     """
**/

#define static_expr(e) \
    ( !! sizeof (struct {static_assert (e); char c;}) )

#define is_same(e1, e2) \
    _Generic (&(typeof_unqual (e1)) {} \
        , typeof_unqual (e2) * : true  \
        , default              : false \
    )
```

### The issue

```c
/*
** It's OK.
**/
typeof        ((size_t) {i}) witness_one;
typeof_unqual ((size_t) {i}) witness_two;

/*
** It's OK.
**/
is_same (witness_one, (size_t) {i});
is_same (witness_two, (size_t) {i});

/*
** It's OK.
**/
static_assert (is_same (witness_one, (size_t) {i}));

/*
** It's KO.
** [clang] error: initializer element is not a compile-time constant
** [ gcc ] everything is good buddy
**/
static_expr (is_same (witness_one, (size_t) {i}));
```
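A minimal sketch of the "static assertion inside an expression" trick described in the report, restricted to C11 so that it does not depend on C23 features. The names `STATIC_EXPR` and `checked_bits_per_int` are hypothetical and not from the report; the assertion operand here is an ordinary constant expression, so this sketch does not exercise the compound-literal case that Clang rejects.

```c
#include <assert.h>   /* provides the static_assert macro in C11 */
#include <limits.h>

/* Hypothetical counterpart of static_expr from the report.
   Evaluates to 1 when the assertion holds; otherwise compilation fails. */
#define STATIC_EXPR(e) \
    (!!sizeof(struct { static_assert((e), #e); char c; }))

/* Uses the check in the middle of an ordinary expression. */
static int checked_bits_per_int(void)
{
    return STATIC_EXPR(CHAR_BIT >= 8) * (int)(sizeof(int) * CHAR_BIT);
}
```

Because the struct only appears as the operand of `sizeof`, no object is created; the macro just contributes the value 1 to the surrounding expression.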
shafik commented 1 week ago

CC @AaronBallman

AaronBallman commented 6 days ago

A reduced equivalent reproducer is:

void foo() {
  int i = 1;    

  struct {
    int array[sizeof((int){i})];
  } s;
}

https://godbolt.org/z/66E96oqq4

As pointed out above, the struct matters: https://godbolt.org/z/4h45zsxcd (note the lack of a -Wvla diagnostic, so it's not that we're rejecting because we introduce a VLA into a structure or something along those lines). Whether it's declared at file scope or not also matters: https://godbolt.org/z/qc1KfYzGM

My intuition is that this is a Clang bug and we should accept the original code, but I've not yet investigated what the standard says.

AaronBallman commented 6 days ago

C23 6.5.3.6p3: All the constraints for initializer lists in 6.7.11 also apply to compound literals.

C23 6.7.11p5: All the expressions in an initializer for an object that has static or thread storage duration or is declared with the constexpr storage-class specifier shall be constant expressions or string literals.

So we're fine there (nothing else in 6.7.11's constraints really applies). I don't see any further constraints in 6.5.3.6 that would justify Clang's behavior, so confirming the issue.

llvmbot commented 6 days ago

@llvm/issue-subscribers-c

zygoloid commented 6 days ago

I don't have C23 to hand, but in the draft I do have, I found this:

6.5.3.6/3: If the compound literal occurs outside the body of a function, the object has static storage duration; otherwise, it has automatic storage duration associated with the enclosing block.

but conversely:

6.2.4/6: If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.

... which together seem to mean that the compound literal isn't required to have a constant initializer in this case, but also that its initializer is never actually evaluated at runtime either. Which is weird, but OK I guess, given that presumably it can only occur in things like sizeof or typeof anyway.
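The point that the initializer is never evaluated when the compound literal is only an operand of `sizeof` can be sketched as follows. This assumes a non-VLA operand (the only case in which `sizeof` leaves its operand unevaluated); `next_id`, `calls`, and `size_without_evaluation` are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

static int calls = 0;

/* Has a visible side effect so that evaluation can be detected. */
static int next_id(void) { return ++calls; }

/* Applies sizeof to a compound literal whose initializer calls next_id(). */
static size_t size_without_evaluation(void)
{
    /* The operand is not a VLA, so it is not evaluated: next_id() never runs. */
    size_t size = sizeof((int){next_id()});
    assert(size == sizeof(int));
    return size;
}
```

After calling `size_without_evaluation()`, `calls` is still 0, which matches the observation that such a literal needs no constant initializer and yet does no work at run time.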

That behavior is also inconsistent:

int n;

void f() {
  struct A {
    // OK.
    int array[sizeof((int){n})];
  };
}

struct B {
  // Ill-formed.
  int array[sizeof((int){n})];
};

... but accepting A and rejecting B is what GCC does, and what the C standard seems to ask for. shrug

AaronBallman commented 4 days ago

> How dare you change my concise title? You're like those recent anime that spoil the entire story in the title! Shame on you!

Please familiarize yourself with our Code of Conduct. For what it's worth, we frequently reword issue titles (and summaries) so that it's easier for community members to understand the issue without having to read through as many comments.

> Also, did you guys know that the C23 specification for the auto keyword is ambiguous? I mean, clearly it already behaves differently between Clang and GCC, with the GCC version being more restricted.

This is unrelated to the original issue; you should file a separate issue if you think Clang has a bug, otherwise it's too easy for things to get lost in discussion.

> Well, this is off-topic,

It is. :-)

AaronBallman commented 4 days ago

> I don't have C23 to hand, but in the draft I do have, I found this:
>
> 6.5.3.6/3: If the compound literal occurs outside the body of a function, the object has static storage duration; otherwise, it has automatic storage duration associated with the enclosing block.

We actually changed that wording pretty significantly during NB comment resolution. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf is an early C2y draft and will be closer to the final wording for C23.

In C23, it says:

6.5.3.6 (Constraints)

p3: All the constraints for initializer lists in 6.7.11 also apply to compound literals.

p4: If the compound literal is associated with file scope or block scope (see 6.2.1) the storage-class specifiers SC (possibly empty), type name T, and initializer list, if any, shall be such that they are valid specifiers for an object definition in file scope or block scope, respectively, of the following form,

  SC typeof(T) ID = { IL };

where ID is an identifier that is unique for the whole program and where IL is a (possibly empty) initializer list with nested structure, designators, values and types as the initializer list of the compound literal. All the constraints for storage-class specifiers in 6.7.2 also apply correspondingly to compound literals. If the compound literal is associated with function prototype scope, constraints as if in block scope apply.
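A minimal sketch of the notional declaration that p4 describes, for a block-scope compound literal with no storage-class specifier (SC empty). To stay within C99, the sketch writes the type directly instead of `typeof(T)`; `struct point`, `id`, and both function names are hypothetical.

```c
#include <assert.h>

struct point { int x, y; };

/* Block-scope compound literal with a run-time initializer. */
static int via_compound_literal(int i)
{
    struct point *p = &(struct point){ .x = i, .y = 2 * i };
    return p->x + p->y;
}

/* The corresponding object definition in the same scope:
   SC is empty, T is struct point, IL is the same initializer list. */
static int via_notional_object(int i)
{
    struct point id = { .x = i, .y = 2 * i };
    assert(id.x == i);
    return id.x + id.y;
}
```

Since `struct point id = { .x = i, .y = 2 * i };` is a valid block-scope definition even though `i` is not a constant, the same initializer list is acceptable in the compound literal under this reading.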

(Semantics) p7: If the storage-class specifiers are absent or contain constexpr, static, register, or thread_local the behavior is as if the object were declared and initialized in the corresponding scope with these storage-class specifiers; if another storage-class specifier is present, the behavior is undefined. If the storage-class specifier constexpr is present, the initializer is evaluated at translation time. Otherwise, if the storage duration is automatic, the initializer is evaluated at each evaluation of the compound literal; if the storage duration is static or thread the initializer is (as if) evaluated once prior to program startup.
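The p7 rule that an automatic-storage compound literal has its initializer evaluated at each evaluation of the literal can be sketched like this. The names `calls`, `next_value`, and `sum_of_literals` are hypothetical and only serve to make each evaluation observable.

```c
#include <assert.h>

static int calls = 0;

/* Returns 1, 2, 3, ... so that each evaluation is distinguishable. */
static int next_value(void) { return ++calls; }

/* Sums the value of a block-scope compound literal over `rounds` iterations. */
static int sum_of_literals(int rounds)
{
    assert(rounds >= 0);
    int sum = 0;
    for (int k = 0; k < rounds; ++k) {
        /* Automatic storage duration: next_value() runs on every pass. */
        int *p = &(int){next_value()};
        sum += *p;
    }
    return sum;
}
```

With three rounds the literal is initialized three times, so the sum is 1 + 2 + 3 and `calls` ends at 3.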

I believe the compound literal in this case is actually at block scope (6.2.1 only gives us function, file, block, and function prototype scopes).

6.2.1p8: A compound literal (which is an expression that provides access to an anonymous object) is associated with the scope of the type name used in its definition; that scope is either file scope, function prototype scope, or block scope.

> but conversely:
>
> 6.2.4/6: If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.

This also changed somewhat; that bit now reads:

If an initialization is specified for the object and it is not specified with constexpr, it is performed each time the declaration or compound literal is reached in the execution of the block; if it is specified with constexpr the initializer is evaluated once at translation time and the new instance of the object is initialized to that fixed value each time the specification is reached; otherwise, the representation of the object becomes indeterminate each time the declaration is reached.

> ... which together seem to mean that the compound literal isn't required to have a constant initializer in this case, but also that its initializer is never actually evaluated at runtime either. Which is weird, but OK I guess, given that presumably it can only occur in things like sizeof or typeof anyway.

I think the way this works is that the compound literal is evaluated because it's at block scope. The notional declaration of the unnamed compound literal object is at block scope and so when that block is executed, presumably the initialization happens at wherever that notional object was declared given that it was inside of a struct body. I think the standard could be made more clear here. :-)

> That behavior is also inconsistent:
>
> int n;
>
> void f() {
>   struct A {
>     // OK.
>     int array[sizeof((int){n})];
>   };
> }
>
> struct B {
>   // Ill-formed.
>   int array[sizeof((int){n})];
> };
>
> ... but accepting A and rejecting B is what GCC does, and what the C standard seems to ask for. shrug

I think your analysis and GCC's behavior are both correct. In the first one, the compound literal is at block scope and has automatic storage duration. In the second one, the compound literal is at file scope and has static storage duration.
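The file-scope versus block-scope distinction can be sketched outside of `sizeof` as well. This is an illustration under the reading given above, not a statement of how either compiler implements the check; `file_scope_literal` and `block_scope_sum` are hypothetical names.

```c
#include <assert.h>

int n = 3;

/* File scope: the literal has static storage duration, so every
   initializer expression must be a constant expression. */
static int *file_scope_literal = (int[]){1, 2, 3};
/* static int *rejected = (int[]){n};    constraint violation: n is not constant */

static int block_scope_sum(void)
{
    /* Block scope: automatic storage duration, so n may be read at run time. */
    int *local = (int[]){n, n + 1};
    assert(local[0] == n);
    return local[0] + local[1] + file_scope_literal[2];
}
```

With `n` equal to 3, `block_scope_sum()` adds 3, 4, and the third element of the file-scope array.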