llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29k stars 11.95k forks source link

is_constant_evaluated can evaluate to false in array bound #99684

Open efriedma-quic opened 3 months ago

efriedma-quic commented 3 months ago
#include <type_traits>
int z[1/(1-std::is_constant_evaluated())];

clang prints the following, which is extremely confusing:

<source>:2:7: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
    2 | int z[1/(1-std::is_constant_evaluated())];
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:2:8: note: division by zero
    2 | int z[1/(1-std::is_constant_evaluated())];
      |        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:2:5: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
    2 | int z[1/(1-std::is_constant_evaluated())];
      |     ^
2 warnings generated.
llvmbot commented 3 months ago

@llvm/issue-subscribers-clang-frontend

Author: Eli Friedman (efriedma-quic)

``` #include <type_traits> int z[1/(1-std::is_constant_evaluated())]; ``` clang prints the following, which is extremely confusing: ``` <source>:2:7: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension] 2 | int z[1/(1-std::is_constant_evaluated())]; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <source>:2:8: note: division by zero 2 | int z[1/(1-std::is_constant_evaluated())]; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <source>:2:5: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant] 2 | int z[1/(1-std::is_constant_evaluated())]; | ^ 2 warnings generated. ```
cor3ntin commented 3 months ago

This is fun.

1/ We try to parse an array bound as a constant expression. Therefore, std::is_constant_evaluated() is true. However, 1/0 is UB so it is not a constant expression as UB in constant evaluation makes the expression non-constant.

What's an array declaration with a non-constant size? A VLA gasp (at least in C and we do the same thing in C++). So then:

2/ We reinterpret the meaning of the whole thing to be a VLA. Therefore, we are no longer in a constant expression context. So std::is_constant_evaluated() is false, the bound expression is well defined and voila, you accidentally declared a VLA whose non-constant but statically known size is 1.

3/ Clang is smart enough to not let that be VLAs survive if the whole thing can be constant folded, so we get out of this whole journey with a perfectly well defined bog standard C array of size 1. (and of course constant folding does not happen in a constant evaluation context, so std::is_constant_evaluated() stays 0 at that stage)

So the warnings, are correct, and everything is working as intended (which is sort of amazing).

Users, however, might be traumatized for life, but I think this is on par for the course with VLAs.

Should we try to cleverly do something with is_constant_evaluated when parsing array bounds? We might land in a less consistent, weirder place. Hopefully the weird sequences of warning is enough to stop users to committing that code.

Well, what we probably should do is stop supporting VLA in C++ by default.

efriedma-quic commented 3 months ago

I think it's possible to make TryToFixInvalidVariablyModifiedType() do the evaluation as if it were a constant context? Or if we can't re-run the evaluation, we can just compute the value when we initially try to constant-evaluate the array bound, and store it in the VariableArrayType.

(We fixed things a while back so that we don't try to do this folding unless it's a context where variably modified types are illegal, so this shouldn't impact the behavior of valid code.)

cor3ntin commented 3 months ago

Some options:

mizvekov commented 3 months ago

Another option: UB in constant evaluation of an array bound should be an error, and not simply make the expression non-constant.

efriedma-quic commented 3 months ago

We remember that we called is_constant_evaluated in the array bound and memoized its value. This would be a reasonable design but the outcome is that code is a VLA which is UB. Not helpful in this specific case

For this case, we just want to print an error. Actually, I think it's only possible to trigger this issue with invalid code.

If you instead had something like void f() { int x[1/(1-std::is_constant_evaluated())]; }, then we just treat it as a VLA; we don't try to use TryToFixInvalidVariablyModifiedType() to constant-fold it later.

Another option: UB in constant evaluation of an array bound should be an error, and not simply make the expression non-constant.

This ties the semantics of the code to the quality of the constant folder, which we don't really want.

mizvekov commented 3 months ago

This ties the semantics of the code to the quality of the constant folder, which we don't really want.

Though unlike the constant folder, the constexpr evaluator implements the language specification definition of what is UB, which wouldn't be affected by implementation defined behavior of the constant folder.

Otherwise, it would seem strange that we define std::is_constant_evaluated() to true when constant folding, as that would tie the semantics to the quality of the constant folder as well.

efriedma-quic commented 3 months ago

The reason TryToFixInvalidVariablyModifiedType exists in the first place is to try to handle cases where the user writes something in a context that only allows constant arrays, but isn't technically a constant because it depends on some form of constant evaluation not defined by the standard. We inherited this behavior from ancient versions of gcc.

So in this mode, we're effectively extending the definition of a "constant expression", not doing runtime evaluation, so is_constant_evaluated() should be true.

In places where we allow real variably modified types, we want to do something close to the C standard definition of a VLA. According to the C definition, if the array bound is an ICE, it's a constant array, and otherwise, it's a VLA. This includes cases where the expression is not an ICE due to undefined behavior.

mizvekov commented 3 months ago

I see. Thanks for the explanation.

Though by your account, when we extend constant expression evaluation, and also when std::is_constant_evaluated() is true while evaluating this extended constant expression, we are already compounding this problem of tying semantics to the quality of our constant folding.

So my question is, would it be a good solution to first perform a non-extended constexpr evaluation, error out in case of UB, and then proceed with any extended stuff, while std::is_constant_evaluated() evaluates to false from this stage forward?

efriedma-quic commented 3 months ago

The reason this code is so messy in the first place is that when we first see the type, we're not certain whether it's going to be used in a context that allows VLAs. If it is such a context, we can't print an error; that would reject valid code. At most, we could print a warning, and add a marking to the VLA type indicating it can't be "fixed" to a constant array type. And if we do that, it's basically equivalent to saving the evaluated value.

mizvekov commented 3 months ago

I see, thanks again.

But I don't see this example of code which contains UB in array bound, which is accepted by GCC and we must accept as well for special reasons.

Is this something already in the test suite, or can we add some of these test cases in this PR?

efriedma-quic commented 3 months ago

Simple example both gcc and clang accept:

void f() { int x[1/0];}
mizvekov commented 3 months ago

That's awful, it looks like breaking that would be a good thing.

cor3ntin commented 3 months ago

I have been thinking about that some more. Making it unconditionally ill-formed is problematic because of conformance. (We might want to ask WG21 to consider making some scenarios a hard error)

But, I think we could explore a warning along the line of constant initialization of 'x' would result in undefined behavior / x will be treated as a VLA because the constant evaluation of its size expression was undefined behavior (with the notes) - in its own warning group, enabled by default. If that prove successful we can then consider making it an error by default in non-pedantic modes in a follow up.

@AaronBallman WDYT?

AaronBallman commented 3 months ago

I have been thinking about that some more. Making it unconditionally ill-formed is problematic because of conformance. (We might want to ask WG21 to consider making some scenarios a hard error)

But, I think we could explore a warning along the line of constant initialization of 'x' would result in undefined behavior / x will be treated as a VLA because the constant evaluation of its size expression was undefined behavior (with the notes) - in its own warning group, enabled by default. If that prove successful we can then consider making it an error by default in non-pedantic modes in a follow up.

@AaronBallman WDYT?

I think that notes explaining what's going on is pretty reasonable to explore; it certainly seems better than what we have currently.

shafik commented 2 months ago

Note the standard has similar mind bending examples here: https://eel.is/c++draft/expr.const#20.5

template<bool> struct X {};
X<std::is_constant_evaluated()> x;                      // type X<true>
int y;
const int a = std::is_constant_evaluated() ? y : 1;     // dynamic initialization to 1
double z[a];                                            // error: a is not usable
                                                        // in constant expressions
const int b = std::is_constant_evaluated() ? 2 : y;     // static initialization to 2
int c = y + (std::is_constant_evaluated() ? 2 : y);     // dynamic initialization to y+y

constexpr int f() {
  const int n = std::is_constant_evaluated() ? 13 : 17; // n is 13
  int m = std::is_constant_evaluated() ? 13 : 17;       // m can be 13 or 17 (see below)
  char arr[n] = {}; // char[13]
  return m + sizeof(arr);
}
int p = f();                                            // m is 13; initialized to 26
int q = p + f();                                        // m is 17 for this call; initialized to 56

If we don't have those in tests someplace, we should.

shafik commented 2 months ago

Ironically the original example is rejected by gcc, even though I think clang is correct to accept this given we support VLAs as an extension ostensibly to be GNU compatible: https://godbolt.org/z/hd1jsGfsW

efriedma-quic commented 2 months ago

Rejecting the original example is correct, I think.


The following also gives some weird warnings; we warn about divide by zero even though we don't actually divide by zero, and we warn that "std::is_constant_evaluated" will return true even though it actually returns false.

#include <type_traits>
int zz = 1/(1-std::is_constant_evaluated());
int z = 1/(std::is_constant_evaluated());
<source>:2:15: warning: 'std::is_constant_evaluated' will always evaluate to 'true' in a manifestly constant-evaluated expression [-Wconstant-evaluated]
    2 | int zz = 1/(1-std::is_constant_evaluated());
      |               ^
<source>:3:10: warning: division by zero is undefined [-Wdivision-by-zero]
    3 | int z = 1/(std::is_constant_evaluated());
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:3:12: warning: 'std::is_constant_evaluated' will always evaluate to 'true' in a manifestly constant-evaluated expression [-Wconstant-evaluated]
    3 | int z = 1/(std::is_constant_evaluated());
      |            ^
3 warnings generated.
shafik commented 2 months ago

Rejecting the original example is correct, I think.

So the diagnostic are not great as they are emitted now but the result feels consistent with the examples from [expr.comst]p20.5

If we would have undefined behavior then it can not be constant evaluated and therefore std::is_constant_evaluated() should result in false that would make the original case a VLA.