I experienced a bug in code generation for the x86-64 target. For the minimal test case
test.txt
compiled on
$ clang++-16 -v
Ubuntu clang version 16.0.0 (1~exp5ubuntu3)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64
with
clang++-16 -Og -S test.cc
I get the assembly code
f(double, double, unsigned int, bool): # @f(double, double, unsigned int, bool)
test esi, esi
jne .LBB0_5
mov eax, edi
cvtsi2sd xmm2, rax
mulsd xmm2, xmm1
addsd xmm1, xmm2
movapd xmm3, xmm0
cmpnltpd xmm3, xmm2
cmpnltpd xmm1, xmm0
....
see also https://godbolt.org/z/s3v3rdfsd
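Stale content in the upper lane of register xmm2, left there from before the function was entered, can trigger a floating-point exception in the second-to-last line shown, cmpnltpd xmm3, xmm2. The cvtsi2sd and mulsd instructions write only the lower lane of xmm2, while the packed cmpnltpd compares both lanes. Specifically, I see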
(gdb) p $xmm2
$1 = {v2_double = {0.40000000000000002, nan(0xc000000000000)}}
showing that the upper lane contains an invalid entry (a NaN). The generated code does not raise the FPE with clang-15, nor at optimization level -O0. According to godbolt, -O2 and -O3 also produce the invalid code, with both clang-15 and clang-16.
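To illustrate the mechanism described above in isolation, here is a minimal sketch. It is not the attached test case, and the two helper names are hypothetical. It assumes an x86-64 machine with SSE2 and uses fetestexcept to read the exception flag instead of feenableexcept, so that nothing traps. Both helpers run the same "not less than" compare on valid lower lanes, once packed and once scalar, with a caller-chosen value in the upper lane of the second operand.

```cpp
#include <cfenv>
#include <emmintrin.h>  // SSE2 intrinsics
#include <limits>

// Hypothetical helper: packed compare (cmpnltpd) looks at both lanes.
// Returns true if the compare raised FE_INVALID.
bool packed_compare_raises_invalid(double stale_upper)
{
    // volatile keeps the compiler from folding or moving the compare
    volatile double in_a = 0.4, in_b = 0.2, in_hi = stale_upper;
    std::feclearexcept(FE_ALL_EXCEPT);

    const __m128d a = _mm_set_pd(0.0, in_a);    // {low = 0.4, high = 0.0}
    const __m128d b = _mm_set_pd(in_hi, in_b);  // {low = 0.2, high = stale}
    const __m128d r = _mm_cmpnlt_pd(a, b);

    volatile int sink = _mm_movemask_pd(r);     // force the compare to execute
    (void)sink;
    return std::fetestexcept(FE_INVALID) != 0;
}

// Hypothetical helper: scalar compare (cmpnltsd) looks at the lower lane only.
bool scalar_compare_raises_invalid(double stale_upper)
{
    volatile double in_a = 0.4, in_b = 0.2, in_hi = stale_upper;
    std::feclearexcept(FE_ALL_EXCEPT);

    const __m128d a = _mm_set_pd(0.0, in_a);
    const __m128d b = _mm_set_pd(in_hi, in_b);
    const __m128d r = _mm_cmpnlt_sd(a, b);

    volatile int sink = _mm_movemask_pd(r);
    (void)sink;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

Calling both helpers with std::numeric_limits<double>::quiet_NaN() should show the difference: the packed variant is expected to report FE_INVALID because the NLT predicate signals on any NaN operand, while the scalar variant never inspects the upper lane. With feenableexcept(FE_INVALID) active, the packed case would trap instead of only setting the flag.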
Please let me know if I should provide a main function to invoke this. To reproduce, all one needs to do is set xmm2 to _mm_set1_pd(std::numeric_limits<float>::signaling_NaN()); and call feenableexcept(FE_DIVBYZERO | FE_INVALID); before calling f(0.2, 0.2, 2, false);. I could be wrong and my code might do something that is not allowed, but I believe the code is valid and the problem lies within LLVM.
Note that the code is extracted from a big project, https://github.com/dealii/dealii/issues/15496#issuecomment-1609945214