Open 1feda113-18e2-42e3-8624-9c17d4d32ec3 opened 4 years ago
I see, I was actually looking at my example (https://godbolt.org/z/YuqLwu).
Interestingly, we know the facts at some point as this O0 vs O1 comparison shows: https://godbolt.org/z/WMtygk (the llvm.assumes are folded away as they are trivial). Maybe the pass removing them knows/uses the control conditions but a following one does not?
The bug here is that we're missing one particular optimization: eliminating the scalar loop. Not sure what's going on here, specifically, but in general LLVM's reasoning based on branch conditions is relatively weak.
I'm not sure how we can eliminate the loop. We already use the fact that count is a multiple of 16 (or 8 actually).
Compare the assembly from https://godbolt.org/z/Em3ZXP . If the trip count is a multiple of the vector factor, we'll use the vectorized loop for all iterations, and we won't use the scalar tail. Therefore, given a vector factor used, the scalar tail is dead. LLVM figures that out on the right side, but not the left.
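To make the scalar-tail argument concrete, here is a minimal sketch. It is not the code behind the godbolt links: the names `sum_checked` and `sum_assumed`, the byte-summing loop, and the multiple-of-16 precondition are all assumptions for illustration. The two functions differ only in how a violated precondition is handled, which is the kind of pair this thread compares.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>

// Hypothetical variant A: a violated precondition calls std::terminate().
// The call is real and observable, so the compiler must keep it.
std::uint32_t sum_checked(const std::uint8_t* data, std::size_t count) noexcept {
    if (count % 16 != 0) {
        std::terminate();
    }
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Hypothetical variant B: a violated precondition is declared impossible.
// __builtin_unreachable() is a GCC/Clang builtin, not standard C++.
std::uint32_t sum_assumed(const std::uint8_t* data, std::size_t count) noexcept {
    if (count % 16 != 0) {
        __builtin_unreachable();
    }
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

In both variants the loop is only reached when `count` is a multiple of 16, so a vectorized loop whose width divides 16 would cover every iteration and the scalar remainder loop would never run. Whether a given compiler version actually drops the remainder loop in either variant is best checked by comparing the generated assembly, as the links above do.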
We already use the fact that count is a multiple of 16 (or 8 actually).
Do we though? https://godbolt.org/z/QQi6GC suggests we don't. As another symptom, we fail to recognize that loads are aligned.
The problem is that std::terminate can have arbitrary other side effects as well. Tell clang that this is not going to happen and you get what you want: https://godbolt.org/z/YuqLwu
I mean, if std::terminate prints a message, how should it be equivalent to __builtin_unreachable?
Now, the stuff after a noreturn call is always ignored as it is dead per definition. There is no need to have a __builtin_unreachable call there to indicate the program point is unreachable, we already know and therefore removed the call. There is also no reasonable way to change that short of ignoring noreturn on std::terminate.
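As a small illustration of why the call itself cannot be dropped: std::terminate() invokes whatever handler is currently installed, and that handler may do observable work such as printing. The sketch below uses only standard library calls; the handler name `report_and_abort`, the helper `install_reporting_handler`, and the message text are assumptions for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical handler: prints a message, then aborts.
// A terminate handler must not return to its caller.
void report_and_abort() {
    std::fputs("fatal: precondition violated\n", stderr);
    std::abort();
}

// Installs the handler above and returns the previously installed one.
std::terminate_handler install_reporting_handler() noexcept {
    return std::set_terminate(report_and_abort);
}
```

Once such a handler is installed, a call to std::terminate() has a visible effect before the program ends, which __builtin_unreachable() never has.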
We do, in fact, assume that terminate() doesn't return. If you look at the LLVM IR, that should be obvious.
I am not talking about them being equivalent.
I am saying that the compiler ought to hard-assume that [[noreturn]] functions never return.
Right now, placing a __builtin_unreachable() after a std::terminate() causes the __builtin_unreachable() to be ignored, which is not helpful: https://godbolt.org/z/G27wEN
Calling [[noreturn]] function is identical to the __builtin_unreachable().
The thing is that std::terminate() isn't identical to __builtin_unreachable(), it actually has to be called and will terminate the program, while __builtin_unreachable() simply tells the compiler that this branch won't ever be taken, with zero indication of the error if it turns out to be actually taken during runtime.
Apparently from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95001 I was not clear what I was asking, so let me post what I commented there here as well:
Just to clarify what I'm asking for:
Calling a [[noreturn]] function ought to have the same effects on codegen as:
    [[noreturn]] void theend();
    ...
    if(a)
    {
        theend();
        __builtin_unreachable();
    }
Right now, the [[noreturn]] attribute causes the __builtin_unreachable() to be ignored, which is not helpful.
I think there should be an implicit __builtin_unreachable() if a [[noreturn]] function ever would return. To be specific here: I only care about codegen.
If that's not doable, can you please make a __builtin_unreachable() called after a [[noreturn]] function not be ignored?
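For anyone who wants to try the pattern above in a self-contained file, here is a minimal sketch. `theend()` is given a hypothetical body that calls std::abort(), and `checked_div` is a made-up caller; neither comes from the godbolt links. In the thread's version `theend()` is only declared, so the compiler cannot see its body.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the thread's theend(): noreturn and noexcept.
[[noreturn]] void theend() noexcept {
    std::abort();
}

// Hypothetical caller using the pattern from the comment above.
int checked_div(int num, int den) noexcept {
    if (den == 0) {
        theend();
        // GCC/Clang builtin. As discussed in this thread, code after a
        // noreturn call is already treated as dead, so this marker is dropped.
        __builtin_unreachable();
    }
    return num / den;
}
```

On the non-failing path `checked_div(10, 2)` simply returns 5. The interesting comparison is the assembly of this function against one where the branch contains only `__builtin_unreachable()`.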
Extended Description
Consider the codegen from https://godbolt.org/z/Em3ZXP:
It would seem that functions marked with both [[noreturn]] and noexcept do not have the same improvements on codegen as __builtin_unreachable() has. This is despite that [[noreturn]] functions returning is explicitly required to be UB in the standard, and if they are noexcept then they cannot throw an exception either.
Can noexcept functions marked [[noreturn]] please gain the same effects on codegen as __builtin_unreachable() has please?