llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.42k stars 12.16k forks source link

__builtin_assume(false) should not be a constant expression #40658

Open davidstone opened 5 years ago

davidstone commented 5 years ago
Bugzilla Link 41313
Version trunk
OS Linux
CC @dwblaikie,@hfinkel,@jdoerfert,@zygoloid

Extended Description

I would expect this static_assert to fail:

static_assert((void(__builtin_assume(false)), true));

I believe this change would fit into the constexpr model better. constexpr evaluation is required to diagnose undefined behavior, and __builtin_assume states that if the condition is false, the behavior is undefined.

davidstone commented 3 years ago

mentioned in issue llvm/llvm-bugzilla-archive#44979

hfinkel commented 4 years ago

So, there's a secret choice made by the frontend: either we think a __builtin_assume's operand is "safe" to evaluate, in which case we emit code for it and evaluate it at runtime, or we think it's "unsafe", in which case we ignore it entirely. There is no sense in which we actually evaluate the operand ignoring side effects in practice; we just pretend that's what we're doing.

I think the only reasonable way to resolve this and llvm/llvm-bugzilla-archive#44979 would be to stop treating that as a secret: document that __builtin_assume will actually emit and evaluate its operand if we think it's safe (in particular, if we think it's side-effect-free) and will have undefined behavior if it evaluates to 'false', whereas we will not emit it and will always have defined behavior if we think it's unsafe.

Then we can do the exact same check that IR generation will do from the constant evaluator and in UBSan, and consistently trap the case where it's actually undefined. The downside is that we can no longer hide behind the simple explanation that the operand is never evaluated.

But that may be for the best. For example, observe that

int main() { __builtin_assume( ({ while(true); true; }) ); }

... actually emits an infinite loop, despite our pretending that we don't evaluate the operand of __builtin_assume; we think it's "safe" because it doesn't have side-effects.

Also, the fact that we do this check (and completely ignore __builtin_assumes with unsafe operands) is critically important to users of the builtin, as it affects how you need to write the operand in order for Clang to not ignore it, but is currently entirely undocumented.

We've discussed (see, e.g., http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html) an LLVM mechanism to express assumes with side effects, where the backend will ensure that all of the side effects are properly ignored. Under such a scheme, I believe that this while-loop case will work correctly (in the sense that there will be no loop).

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 4 years ago

So, there's a secret choice made by the frontend: either we think a __builtin_assume's operand is "safe" to evaluate, in which case we emit code for it and evaluate it at runtime, or we think it's "unsafe", in which case we ignore it entirely. There is no sense in which we actually evaluate the operand ignoring side effects in practice; we just pretend that's what we're doing.

I think the only reasonable way to resolve this and llvm/llvm-bugzilla-archive#44979 would be to stop treating that as a secret: document that __builtin_assume will actually emit and evaluate its operand if we think it's safe (in particular, if we think it's side-effect-free) and will have undefined behavior if it evaluates to 'false', whereas we will not emit it and will always have defined behavior if we think it's unsafe.

Then we can do the exact same check that IR generation will do from the constant evaluator and in UBSan, and consistently trap the case where it's actually undefined. The downside is that we can no longer hide behind the simple explanation that the operand is never evaluated.

But that may be for the best. For example, observe that

int main() { __builtin_assume( ({ while(true); true; }) ); }

... actually emits an infinite loop, despite our pretending that we don't evaluate the operand of __builtin_assume; we think it's "safe" because it doesn't have side-effects.

Also, the fact that we do this check (and completely ignore __builtin_assumes with unsafe operands) is critically important to users of the builtin, as it affects how you need to write the operand in order for Clang to not ignore it, but is currently entirely undocumented.

davidstone commented 4 years ago

UBSan has a similar limitation: llvm/llvm-bugzilla-archive#44979

hfinkel commented 5 years ago

This is tricky: the operand of __builtin_assume might have side-effects or otherwise be non-constant, and we may not want constant evaluation to reject such cases.

Interesting. We do ignore side effects when we actually hit CodeGen, so ignoring them here likely makes sense too.

I suppose we could try speculatively evaluating the operand (disallowing external side-effects), and reject the overall evaluation if we manage to evaluate the operand and it evaluates to false, but otherwise ignore it. That still wouldn't give the desired guarantee, though.

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 5 years ago

This is tricky: the operand of __builtin_assume might have side-effects or otherwise be non-constant, and we may not want constant evaluation to reject such cases.

I suppose we could try speculatively evaluating the operand (disallowing external side-effects), and reject the overall evaluation if we manage to evaluate the operand and it evaluates to false, but otherwise ignore it. That still wouldn't give the desired guarantee, though.

davidstone commented 5 years ago

Note that

static_assert((void(__builtin_unreachable()), true));

is rejected as not a constant expression.

davidstone commented 1 year ago

The implementation of [[assume(false)]] in gcc causes it to reject during constant evaluation. https://godbolt.org/z/xebYGPPfG

davidstone commented 7 months ago

clang also rejects [[assume(false)]] in constant evaluation. It's a bit surprising that __builtin_assume and assume behave differently, but given that we now have [[assume(condition)]] the need for this fix is much less.

shafik commented 7 months ago

Some extra context on assume: https://discourse.llvm.org/t/rfc-builtin-assume-assume-optimization-behavior/76943