llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Support __fp16 vectors #23679

Open ahmedbougacha opened 9 years ago

ahmedbougacha commented 9 years ago
Bugzilla Link 23305
Version trunk
OS All
Blocks llvm/llvm-project#51454
CC @ahmedbougacha,@tlively

Extended Description

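__fp16 is a storage-only type, and there are two CodeGen variants: one represents it as i16 and uses the convert intrinsics (as on X86), the other uses the IR "half" type (as on AArch64).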

In both cases, we don't do the right thing for vectors.

On X86, this:

typedef __fp16 __attribute__((__ext_vector_type__(4))) v4f16;

void foo(v4f16 *a, v4f16 *b, v4f16 *c) {
  *c = *a + *b;
}

generates the very broken:

  %3 = add <4 x i16> %1, %2

This is because Sema::UsualUnaryConversions doesn't apply to vector types (see Sema::CheckVectorOperands), so we never try to promote to v4f32 (as we would promote a scalar __fp16 to f32).

Even if we decide to reject that code and never do the implicit promotion, the alternative is also broken:

typedef __fp16 __attribute__((__ext_vector_type__(4))) v4f16;
typedef float __attribute__((__ext_vector_type__(4))) v4f32;

void foo(v4f16 *a, v4f16 *b, v4f32 *c) {
  *c = __builtin_convertvector(*a, v4f32);
}

This generates an integer-to-float conversion instead of an extension from half:

  %2 = uitofp <4 x i16> %1 to <4 x float>
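To make the problem with uitofp concrete, here is a minimal C sketch that contrasts the two interpretations of the same 16 storage bits. The function names are hypothetical and only for illustration; the decoder is deliberately simplified to zero and normal values, so it is not a full half-to-float conversion.

```c
#include <stdint.h>
#include <string.h>

/* What uitofp does: treat the 16 storage bits as an unsigned integer. */
float bits_as_integer(uint16_t h) {
    return (float)h;
}

/* What an extension from half should do, sketched for zero and normal
   values only (subnormals, infinities and NaNs are NOT handled here). */
float bits_as_half(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;  /* move sign to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;              /* 5-bit exponent, bias 15 */
    uint32_t mant = h & 0x3FFu;                     /* 10-bit mantissa */
    uint32_t bits = sign;
    if (exp != 0) {
        /* Rebias the exponent from 15 to 127 and widen the mantissa. */
        bits |= ((exp - 15u + 127u) << 23) | (mant << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);  /* reinterpret the bits as a float */
    return f;
}
```

For the bit pattern 0x3C00, which encodes 1.0 in half precision, bits_as_integer returns 15360.0 while bits_as_half returns 1.0.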

Even when "half" is used instead of i16 (AArch64, or after we migrate away from the convert intrinsics), we generate IR without the promotion:

  %3 = fadd <4 x half> %1, %2

This relies on the backend to do the promotion. However, the semantics differ slightly, because LLVM works at the instruction level and clang at the expression level. Consider:

void foo(v4f16 *a, v4f16 *b, v4f16 *c) {
  *c = (*a + *b) + *c;
}

Doing the promotion in clang means the intermediate result is a v4f32. Doing it in LLVM means the intermediate result is truncated back to v4f16, before being extended again to v4f32.

This can give different results, and it's probably best to mirror the scalar clang behavior of promoting entire expressions.
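The difference between the two promotion strategies can be sketched in portable C. Because __fp16 support varies by compiler and target, this sketch uses float and double as stand-ins for half and float; that substitution is an assumption made for illustration, and the function names are hypothetical. The rounding effect is the same in kind: narrowing the intermediate result can discard bits that the final result would have kept.

```c
/* Expression-level promotion: widen the operands once, keep the
   intermediate result wide, and narrow only at the very end. */
float sum_expression_level(float a, float b, float c) {
    double wide = ((double)a + (double)b) + (double)c;
    return (float)wide;
}

/* Instruction-level promotion: each operation is widened on its own
   and its result is immediately narrowed again. */
float sum_instruction_level(float a, float b, float c) {
    float tmp = (float)((double)a + (double)b);  /* intermediate is rounded */
    return (float)((double)tmp + (double)c);
}
```

With a = 1.0f and b = c = 0x1p-24f (half a unit in the last place of 1.0f), the expression-level version returns 1.0f + 0x1p-23f. The instruction-level version returns 1.0f, because each partial sum is a tie that rounds back to 1.0f.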

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-project#51454