`std::terminate()` and `abort()` do not have `__builtin_unreachable()` semantics

1feda113-18e2-42e3-8624-9c17d4d32ec3 commented 4 years ago


Bugzilla Link	45839
Version	trunk
OS	All
CC	@DougGregor,@efriedma-quic,@jdoerfert,@LebedevRI,@zygoloid

Extended Description

Consider the codegen from https://godbolt.org/z/Em3ZXP:

#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <exception>

void sum(uint32_t *__restrict a, const uint32_t *__restrict b, const uint32_t *__restrict c, size_t count)
{
    auto invoke_terminate = []{
#ifdef USE_UNREACHABLE
        __builtin_unreachable();
#else
        std::terminate();
#endif
    };
    if((((uintptr_t)a) & 15)!=0) invoke_terminate();
    if((((uintptr_t)b) & 15)!=0) invoke_terminate();
    if((((uintptr_t)c) & 15)!=0) invoke_terminate();
    if((count & 15) != 0) invoke_terminate();
    while(count != 0)
    {
        *a++ = *b++ + *c++;
        count--;
    }
}

It would seem that functions marked with both [[noreturn]] and noexcept do not have the same improvements on codegen as __builtin_unreachable() has. This is despite that [[noreturn]] functions returning is explicitly required to be UB in the standard, and if they are noexcept then they cannot throw an exception either.

Can noexcept functions marked [[noreturn]] please gain the same effects on codegen as __builtin_unreachable() has please?

jdoerfert commented 4 years ago

I see, I was actually looking at my example (https://godbolt.org/z/YuqLwu).

Interestingly, we know the facts at some point as this O0 vs O1 comparison shows: https://godbolt.org/z/WMtygk (the llvm.assumes are folded away as they are trivial). Maybe the pass removing them knows/uses the control conditions but a following one does not?

efriedma-quic commented 4 years ago

The bug here is that we're missing one particular optimization: eliminating the scalar loop. Not sure what's going on here, specifically, but in general LLVM's reasoning based on branch conditions is relatively weak.

I'm not sure how we can eliminate the loop. We already use the fact that count is a multiple of 16 (or 8 actually).

Compare the assembly from https://godbolt.org/z/Em3ZXP . If the trip count is a multiple of the vector factor, we'll use the vectorized loop for all iterations, and we won't use the scalar tail. Therefore, given a vector factor used, the scalar tail is dead. LLVM figures that out on the right side, but not the left.

LebedevRI commented 4 years ago

The bug here is that we're missing one particular optimization: eliminating the scalar loop. Not sure what's going on here, specifically, but in general LLVM's reasoning based on branch conditions is relatively weak.

I'm not sure how we can eliminate the loop.

We already use the fact that count is a multiple of 16 (or 8 actually).

Do we though? https://godbolt.org/z/QQi6GC suggests we don't. As another symptom, we fail to recognize that loads are aligned.

jdoerfert commented 4 years ago

The bug here is that we're missing one particular optimization: eliminating the scalar loop. Not sure what's going on here, specifically, but in general LLVM's reasoning based on branch conditions is relatively weak.

I'm not sure how we can eliminate the loop. We already use the fact that count is a multiple of 16 (or 8 actually).

efriedma-quic commented 4 years ago

The problem is that std::terminate can have any other side effect as well: Tell clang that is not going to happen and you get what you want: https://godbolt.org/z/YuqLwu

The bug here is that we're missing one particular optimization: eliminating the scalar loop. Not sure what's going on here, specifically, but in general LLVM's reasoning based on branch conditions is relatively weak.

jdoerfert commented 4 years ago

The problem is that std::terminate can have any other side effect as well: Tell clang that is not going to happen and you get what you want: https://godbolt.org/z/YuqLwu

I mean, if std::terminate prints a message, how should it be equivalent to __builtin_unreachable?

Now, the stuff after a noreturn call is always ignored as it is dead per definition. There is no need to have a __builtin_unreachable call there to indicate the program point is unreachable, we already know and therefore removed the call. There is also no reasonable way to change that short of ignoring noreturn on std::terminate.

efriedma-quic commented 4 years ago

We do, in fact, assume that terminate() doesn't return. If you look at the LLVM IR, that should be obvious.

The bug here is that we're missing one particular optimization: eliminating the scalar loop. Not sure what's going on here, specifically, but in general LLVM's reasoning based on branch conditions is relatively weak.

1feda113-18e2-42e3-8624-9c17d4d32ec3 commented 4 years ago

https://godbolt.org/z/G27wEN Calling [[noreturn]] function is identical to the __builtin_unreachable().

The thing is that std::terminate() isn't identical to __builtin_unreachable(), it actually has to be called and will terminate the program, while __builtin_unreachable() simply tells the compiler that this branch won't ever be taken, with zero indication of the error if it turns out to be actually taken during runtime.

I am not talking about them being equivalent.

I am saying that the compiler ought to hard-assume that [[noreturn]] functions never return.

Right now, placing a __builtin_unreachable() after a std::terminate() causes the __builtin_unreachable() to be ignored, which is not helpful:

https://godbolt.org/z/QQi6GC

LebedevRI commented 4 years ago

https://godbolt.org/z/G27wEN Calling [[noreturn]] function is identical to the __builtin_unreachable().

The thing is that std::terminate() isn't identical to __builtin_unreachable(), it actually has to be called and will terminate the program, while __builtin_unreachable() simply tells the compiler that this branch won't ever be taken, with zero indication of the error if it turns out to be actually taken during runtime.

1feda113-18e2-42e3-8624-9c17d4d32ec3 commented 4 years ago

Apparently from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95001 I was not clear what I was asking, so let me post what I commented there here as well:

Just to clarify what I'm asking for:

Calling a [[noreturn]] function ought to have the same effects on codegen as:

[[noreturn]] void theend();
...
if(a)
{
  theend();
  __builtin_unreachable();
}

Right now, the [[noreturn]] attribute causes the __builtin_unreachable() to be ignored, which is not helpful.

I think there should be an implicit __builtin_unreachable() if a [[noreturn]] function ever would return. To be specific here: I only care about codegen.

If that's not doable, can you please make a __builtin_unreachable() called after a [[noreturn]] function not be ignored please?

llvm / llvm-project

`std::terminate()` and `abort()` do not have `__builtin_unreachable()` semantics #45184

Extended Description