Open zimmermann6 opened 2 years ago
It's a real pleasure to hear from such a well-known and remarkable group! Please expect our response soon.
My current understanding is that we would like to see this work succeed for AMD CPUs and GPUs. On the GPU side, how would you like to communicate?
I'm not sure I understand what you mean by "communicate" in this context. I'm not a GPU expert. My colleague Vincenzo Innocenti did a few experiments on GPU with ROCm (and reported a few issues to you).
I think all the tools you need to develop a correctly rounded math library for HIP or OpenMP offload language applications are available. But I expect there may be information about GPU programming, HIP language details, runtime behavior, etc., that you may need that is not in the documentation. We can discuss that here if desired, or use another channel.
Sorry if our initial message was not clear. We will provide generic implementations in the C language, but we will not deal with the integration into the different math libraries. However, if we get feedback on the routines already available at https://homepages.loria.fr/PZimmermann/CORE-MATH/, we can arrange things so that integration is as easy as possible, as long as it does not make integration into the other libraries harder.
Sorry for my confusion. I understand better now what is being proposed.
Is there any chance you can integrate the code from https://gitlab.inria.fr/core-math/core-math/-/tree/master/src/binary32 (single precision), for example the powf code (https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary32/pow/powf.c)?
@zimmermann6 I'm afraid that is not likely. There is quite a willingness here to trade accuracy for performance. A correctly rounded library would be good as an alternative, but I currently don't have time to port the code to the GPU.
OK, more details are available here: https://hal.inria.fr/hal-03721525
Thanks!
Please find a new version of our analysis, "Accuracy of Mathematical Functions", updated to the latest available library versions (5.4.0 for ROCm).
We have updated our comparison:
https://members.loria.fr/PZimmermann/papers/accuracy.pdf
This is a new version with updated versions of the different libraries. ROCm is updated to 5.6.0, with only minor differences with respect to the previous version.
We have updated our comparison:
https://members.loria.fr/PZimmermann/papers/accuracy.pdf
This is a new version with updated versions of the different libraries, new corner cases found, and the ARM Performance Library is now included.
No updates for ROCm in this version.
Thank you for the notice!
The current C working draft [1, p. 392] has reserved names for correctly rounded functions (cr_exp, cr_log, cr_sin, ...).
We propose to provide such correctly rounded implementations for the three IEEE formats (binary32, binary64, binary128) and the "extended double" format (long double on x86_64).
These implementations will be correctly rounded for all rounding modes. For example, one could do the following to emulate interval arithmetic:
fesetround (FE_DOWNWARD); y_lo = cr_exp (x_lo); fesetround (FE_UPWARD); y_hi = cr_exp (x_hi);
Users who want a fast implementation will call the exp/log/sin/... functions; users who want a correctly rounded function, and thus reproducible results (whatever the hardware, compiler or operating system), will use the cr_exp/cr_log/cr_sin/... functions. Our goal is nevertheless to get the best performance possible.
Our objective is to provide open-source implementations that can be integrated in the major mathematical libraries (GNU libc, Intel Math Library, AMD Libm, Redhat Newlib, OpenLibm, Musl, llvm-libc, CUDA, ROCm).
Are developers of ROCm interested in such functions? If so, we could discuss what the requirements would be for integration in ROCm in terms of license, table size, and allowed operations.
We have started to work on two functions (cbrt and acos), for which we provide presumably correctly rounded implementations (up to the knowledge of hard-to-round cases) [2].
Christoph Lauter, Jean-Michel Muller, Alexei Sibidanov, Paul Zimmermann
[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf
[2] https://homepages.loria.fr/PZimmermann/CORE-MATH/