Should we switch to LLVM-native (b)float16?

LLVM is now perfectly capable of emulating 16-bit floating point math on platforms that don't have it. This was not true when our float16 emulation code was written.

Switching would seem like a no-brainer, but LLVM's emulation is both slower and less precise than ours. LLVM truncates to the 16-bit type after every op, whereas our emulation keeps the value as a float32 until it's time to cast it to another type, store it, or pass it to an external function.

We've talked in the past about what our emulated float16 math is even for. If you're trying to maximize performance on a platform, you're not going to be using non-native types, so you're probably just writing a test for something that will actually run on another platform. In that context, the extra precision of our emulation is actively unhelpful, because it won't match what the native target computes, and its speed advantage doesn't matter.

However, even if you buy that argument, switching to LLVM's (b)float16 causes test failures along the lines of:

lerp(36.000000, -6.062500, 1.000000) = -6.000000 instead of -6.062500

With a weight of one, lerp should return its second argument exactly, and here it doesn't. Maybe we fix this piecemeal, by upcasting internally in the math intrinsics that lower to multiple ops?
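
To make the failure mode concrete, here is a rough sketch in plain Python that emulates bfloat16 rounding with the `struct` module. It assumes lerp lowers to something of the form `a + w * (b - a)`, which may not match the actual lowering, and the helper names (`to_bf16`, `lerp_narrow_every_op`, `lerp_keep_wide`) are made up for illustration. Under those assumptions, narrowing after every op reproduces the result in the failing test above, while computing in float32 and narrowing once at the end does not.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_bf16(x):
    """Round to the nearest bfloat16 value (ties to even).

    Illustrative only: NaN and overflow handling are ignored.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round the low 16 bits away, ties to even, then clear them.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def lerp_narrow_every_op(a, b, w):
    """Assumed lowering a + w * (b - a), narrowing to bfloat16 after each op."""
    diff = to_bf16(b - a)
    scaled = to_bf16(w * diff)
    return to_bf16(a + scaled)

def lerp_keep_wide(a, b, w):
    """Same formula, but intermediates stay float32; narrow once at the end."""
    diff = to_f32(b - a)
    scaled = to_f32(w * diff)
    return to_bf16(to_f32(a + scaled))

# All three inputs are exactly representable in bfloat16.
a, b, w = to_bf16(36.0), to_bf16(-6.0625), to_bf16(1.0)

print(lerp_narrow_every_op(a, b, w))  # -6.0
print(lerp_keep_wide(a, b, w))        # -6.0625
```

The intermediate `b - a` is -42.0625, which bfloat16 can't hold: values in [32, 64) are spaced 0.25 apart, so it rounds to -42.0 and the error survives to the result. Upcasting inside the intrinsic avoids that single lossy step, which is roughly what the piecemeal fix would amount to.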
