ROCm / llvm-project

This is the AMD-maintained fork of the LLVM git repository. This repository accepts pull requests and issues related to AMD fork-specific topics (amd/*). For all other issues/PRs, please submit upstream at https://github.com/llvm/llvm-project.

correctly rounded mathematical functions? #74

Open zimmermann6 opened 2 years ago

zimmermann6 commented 2 years ago

The current C working draft [1, p. 392] has reserved names for correctly rounded functions (cr_exp, cr_log, cr_sin, ...).

We propose to provide such correctly rounded implementations for the three IEEE formats (binary32, binary64, binary128) and the "extended double" format (long double on x86_64).

These implementations will be correctly rounded for all rounding modes. For example, one could do the following to emulate interval arithmetic:

fesetround (FE_DOWNWARD);
y_lo = cr_exp (x_lo);
fesetround (FE_UPWARD);
y_hi = cr_exp (x_hi);

Users who want a fast implementation will call the exp/log/sin/... functions; users who want a correctly rounded function, and thus reproducible results (whatever the hardware, compiler or operating system), will use the cr_exp/cr_log/cr_sin/... functions. Our goal is nevertheless to get the best performance possible.

Our objective is to provide open-source implementations that can be integrated in the major mathematical libraries (GNU libc, Intel Math Library, AMD Libm, Redhat Newlib, OpenLibm, Musl, llvm-libc, CUDA, ROCm).

Are developers of ROCm interested in such functions? If so, we could discuss what the requirements would be for integration in ROCm in terms of license, table size, and allowed operations.

We have started to work on two functions (cbrt and acos), for which we provide presumably correctly rounded implementations (up to the knowledge of hard-to-round cases) [2].

Christoph Lauter, Jean-Michel Muller, Alexei Sibidanov, Paul Zimmermann

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf
[2] https://homepages.loria.fr/PZimmermann/CORE-MATH/

b-sumner commented 2 years ago

It's a real pleasure to hear from such a well-known and remarkable group! Please expect our response soon.

b-sumner commented 2 years ago

My current understanding is that we would like to see this work succeed for AMD CPUs and GPUs. On the GPU side, how would you like to communicate?

zimmermann6 commented 2 years ago

I'm not sure I understand what you mean by "communicate" in this context. I'm not a GPU expert. My colleague Vincenzo Innocenti did a few experiments on GPU with ROCm (and reported a few issues to you).

b-sumner commented 2 years ago

I think all the tools you need to develop a correctly rounded math library for HIP or OpenMP offload language applications are available. But I expect there may be information about GPU programming, HIP language details, runtime behavior, etc., that you may need that is not in the documentation. We can discuss that here if desired, or use another channel.

zimmermann6 commented 2 years ago

Sorry if our initial message was not clear. We will provide generic implementations in the C language, but we will not deal with the integration into the different math libraries. However, if we get feedback on the routines already available at https://homepages.loria.fr/PZimmermann/CORE-MATH/, we can arrange things so that the integration is as easy as possible, as long as this does not make integration into the other libraries harder.

b-sumner commented 2 years ago

Sorry for my confusion. I understand better now what is being proposed.

zimmermann6 commented 2 years ago

is there any chance you can integrate the code from https://gitlab.inria.fr/core-math/core-math/-/tree/master/src/binary32 (single precision), for example the powf code (https://gitlab.inria.fr/core-math/core-math/-/blob/master/src/binary32/pow/powf.c)?

b-sumner commented 2 years ago

@zimmermann6 I'm afraid that is not likely. There is quite a willingness here to trade accuracy for performance. A correctly rounded library would be good as an alternative, but I currently don't have time to port the code to the GPU.

zimmermann6 commented 1 year ago

OK, more details are available here: https://hal.inria.fr/hal-03721525

b-sumner commented 1 year ago

Thanks!

VinInn commented 1 year ago

Please find a new version of our analysis, "Accuracy of Mathematical Functions", updated to the latest available library versions (5.4.0 for ROCm):

https://members.loria.fr/PZimmermann/papers/accuracy.pdf

VinInn commented 9 months ago

we have updated our comparison:

https://members.loria.fr/PZimmermann/papers/accuracy.pdf

This is a new version with updated versions of the different libraries. ROCm is updated to 5.6.0, with only minor differences with respect to the previous version.

VinInn commented 4 months ago

we have updated our comparison:

https://members.loria.fr/PZimmermann/papers/accuracy.pdf

This is a new version with updated versions of the different libraries, new corner cases found, and the ARM Performance Library is now included.

no updates for ROCm in this version

b-sumner commented 4 months ago

Thank you for the notice!