Open Sh0g0-1758 opened 1 month ago
Note that on Linux targets, clang uses libgcc by default, so no LLVM code is actually involved in the computation. We do use the compiler-rt implementation on some targets, though (compiler-rt/lib/builtins/mulsc3.c).
I don't know of a fast algorithm that's correctly rounded... do you know about any research in that direction? I'm not sure we can reasonably do anything here without that. (MPC can do infinite-precision arithmetic, but that's quite slow relative to using native hardware FP.)
CC @lntue
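To make the rounding concern concrete, here is a minimal sketch (not the libgcc or compiler-rt code) of the textbook formula next to a wider-precision reference. The names `mul_naive` and `mul_ref` are made up for illustration. It assumes a C11 toolchain that provides `CMPLXF`. The reference is only a stand-in for MPC: the products of two floats are exact in double, but the result is rounded once in double and again to float, so it is not guaranteed to be correctly rounded in every case.

```c
#include <complex.h>

/* Textbook (a+ib)(c+id) evaluated in float. The volatile temporaries force
   each product to be rounded to float and stop the compiler from fusing a
   product into an FMA. No NaN/infinity recovery is attempted here. */
float complex mul_naive(float a, float b, float c, float d) {
    volatile float ac = a * c, bd = b * d, ad = a * d, bc = b * c;
    return CMPLXF(ac - bd, ad + bc);
}

/* Hypothetical reference: the float-by-float products are exact in double,
   so only the add/subtract and the final conversion to float round. */
float complex mul_ref(float a, float b, float c, float d) {
    double re = (double)a * c - (double)b * d;
    double im = (double)a * d + (double)b * c;
    return CMPLXF((float)re, (float)im);
}
```

For example, with a = c = 1 + 2^-12, b = 1 and d = 1 + 2^-11, the product a*c = 1 + 2^-11 + 2^-24 rounds to 1 + 2^-11 in float, which cancels exactly against b*d. The naive real part is therefore 0, while the exact real part is 2^-24.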
I have some ideas, so we plan to try to implement them in LLVM libc and compare the performance. If it works out ok, we will see how to port that back.
Refer to this report: https://inria.hal.science/hal-04714173, which hints at an issue in complex floating-point multiplication.
This can be observed in the following example, in which I compare the results from GNU MPC (infinite precision) with the results from clang (trunk) and gcc (trunk) for 32-bit precision.