llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Buggy optimization of `vfmaddcsh` intrinsics #98306

Open sayantn opened 3 months ago

sayantn commented 3 months ago

The `llvm.x86.avx512fp16.maskz.vfmadd.csh` intrinsic (and, as a result, `_mm_maskz_fmadd_sch`) is being incorrectly optimized. Consider this code snippet:

```C
#include <immintrin.h>
#include <stdio.h>

int main() {
    __m128h a, b, c, r;
    _Float16 array[8];

    a = _mm_setr_ph(0.0, 1.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0);
    b = _mm_setr_ph(0.0, 2.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0);
    c = _mm_setr_ph(0.0, 3.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0);

    // Zero-masked scalar complex FMA, called with mask 0 (mask bit 0 clear).
    r = _mm_maskz_fmadd_sch(0, a, b, c);
    _mm_storeu_ph(array, r);

    // Print all eight half-precision lanes of the result.
    for (int i = 0; i < 8; i++) {
        printf("%f\n", (float) array[i]);
    }

    return 0;
}
```

With `clang`, the unoptimized and optimized builds produce different output. The unoptimized output is the correct one according to Intel. `gcc` gives the correct output in both cases.
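For reference, here is a rough scalar model of what the intrinsic is expected to compute. It is only a sketch based on my reading of the Intel Intrinsics Guide description of `_mm_maskz_fmadd_sch`: the low complex pair is `a*b + c` when mask bit 0 is set and zero otherwise, and the upper six lanes are copied from `a`. The name `ref_maskz_fmadd_sch` is hypothetical, lanes are modeled as `float` instead of `_Float16`, and fp16 rounding is ignored.

```c
#include <stddef.h>

/*
 * Hypothetical scalar reference model of _mm_maskz_fmadd_sch.
 * Assumptions (not taken from LLVM or Intel source code):
 *   - lanes 0 and 1 hold the real and imaginary parts of the low complex number
 *   - lanes are modeled as float, so fp16 rounding is not reproduced
 */
void ref_maskz_fmadd_sch(unsigned k, const float a[8], const float b[8],
                         const float c[8], float dst[8])
{
    if (k & 1u) {
        /* Complex multiply-add: (a0 + i*a1) * (b0 + i*b1) + (c0 + i*c1). */
        dst[0] = a[0] * b[0] - a[1] * b[1] + c[0];
        dst[1] = a[1] * b[0] + a[0] * b[1] + c[1];
    } else {
        /* Zero-masking: the low complex element is zeroed. */
        dst[0] = 0.0f;
        dst[1] = 0.0f;
    }

    /* The upper six lanes are passed through from a. */
    for (size_t i = 2; i < 8; i++) {
        dst[i] = a[i];
    }
}
```

With the inputs from the snippet and a mask of 0, this model yields 0, 0, 10, 11, 12, 13, 14, 15. With a mask of 1, the low pair becomes -2 and 3.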

![image](https://github.com/llvm/llvm-project/assets/142906350/3e75696e-fb02-4ae0-ab2f-25f1d0637cc7)

System specification:

- `mingw-w64-x86_64-gcc 14.1.0-3`
- `mingw-w64-x86_64-clang 18.1.8-1`
- Intel Software Development Emulator v9.33.0

llvmbot commented 3 months ago

@llvm/issue-subscribers-backend-x86

Author: Sayantan Chakraborty (sayantn)

RKSimon commented 2 months ago

CC @phoebewang @KanRobert