jhjourdan / SIMD-math-prims

Vectorizable implementations of some mathematical functions
MIT License
102 stars 12 forks source link

Accuracy issue with logapprox_d using single precision #3

Closed dacmot closed 3 years ago

dacmot commented 4 years ago

Bonjour. I'm re-implementing your functions using vectorclass because my compiler (Visual Studio 2017) doesn't like the log and exp union tricks and refuses to vectorize them. Doing so I've templated the functions so it can be used with SSE, AVX or AVX512 types, and also single or double precision.

The more precise approx_d versions for exp, sin and cos all seem to work with single precision arithmetic, although yielding negligible improvements in accuracy. For log however, the accuracy is way off. I'm using the beginning of the logapprox function for single precision, combined with polynomial of the double precision.

Would you have any insight into why this is happening? My first guess would be floating point error/instability. If that's the case, maybe the order of the polynomial terms could be altered to avoid cancellations or round errors.

jhjourdan commented 4 years ago

What kind of inaccuracy are you observing? Could you show a plot of the absolute error for inputs between 1 and 4 ?

Note that contrarily to exp, sin and cos, the log function is difficult to approximate using polynomials. It may result that the polynomial chosen are rather unstable numerically. Another possible problem lies in the computation addcst + 0.693147180559945309*exp, which basically adds two numbers which are close to being opposite, which is not good for accuracy.

One last comment: my experiments shows that mixing single and double precision computation in vectorized loops is usually not such a good idea, because the instructions that convert between single and double precision tend to be rather slow.

dacmot commented 4 years ago

The error is more or less constant for all inputs, approximately 0.54408.

I'm not mixing and converting between single and double precision. I templated the function, but because of the exp, addcst and the union trick I have to handle the two cases separately. I use compile-time mechanisms to switch between the logapprox (s.p.) and logapprox_d (d.p.) tricks depending on the input template type, but using the same polynomial.

jhjourdan commented 4 years ago

I have not seen your code, but if the error is constant for all inputs, then this is probably a matter of changing the value of addcst accordingly. You probably failed computing the right value for it.

jhjourdan commented 3 years ago

There has been no activity in this issue for a long time. I assume this is no longer a problem. I am closing. Feel free to complain if this is still a problem for you.