Open heltonmc opened 1 year ago
Very good point, thank you! My Fortran implementation is not as polished as yours, though it gets pretty good performance already. I'll definitely put this improvement in the pipeline.
BTW: Fortran has no sincos
function but all compilers will detect when both sin
and cos
are requested, and use a sincos
function if it's available. An argument-sum wrapper would look like:
elemental real(BK) function cos_sum(a,b)
real(BK), intent(in) :: a, b
real(BK) :: ca,sa,cb,sb
ca = cos(a)
sa = sin(a)
cb = cos(b)
sb = sin(b)
cos_sum = (ca*cb-sa*sb)
end function
Ya there’s a couple of ways to do it but it pops up frequent enough that having a function for it will be helpful. It will be slightly slower of course but is much more accurate even near zeros.
You could also reduce each argument within the sin first before adding them together. That will also significantly improve the accuracy. In general only one argument needs to be reduced as the other one is usually <pi. So that will be faster and Definitely more accurate than the naive case but unsure if it’s as accurate as approach above. Probably be similar depending on how good the argument reduction is done!
It will be slightly slower of course but is much more accurate even near zeros.
That's indeed one of the reasons I try to keep whole packages within a single module file. I know many people don't like the approach and split packages into many files. But link time optimization in Fortran is usually pretty bad, and if your whole package is contained within the same file, any compilers will inline all these small functions, so, you get more readable code with no performance loss.
Sounds good! I think just inserting your above code is fine. My comment was more so that this type of function pops up very frequently in all the Bessel function implementations so it's good to have a routine you like and can change easily.
I don't think there's actually going to be much link-time optimization besides if your compiler will inline or not as this would be a strictly accuracy improvement at the cost of several nanoseconds. So that tradeoff is definitely up to you as this will contain more arithmetic operations 😄
This is essentially the outcome of delaying any floating point issues until the very end. All the arithmetic up to this point should be accurate to less than <1.5 ULP. So any numerical errors are mostly contained on this single line. Unfortunately, this line is very expensive even in the naive case compared to rest of routine so even a small change in this will be felt in the total routine.
I was looking through the code and was curious how https://github.com/perazz/fortran-bessels/blob/ee7e6f405f77398b44f67e9a11728d528c6f6f5f/src/bessels.f90#L100 this line held up? I don't know how to exactly test this accuracy 😃 but I found that this simple summation is prone to some huge cancellation. Especially because the two values vary so much in magnitude.
It's much better to do something like...
because it essentially acts as a range reduction putting the two values at similar magnitude. Therefore, when summed they lose less precision. We are working on a slightly different approach but if Fortran has a good function to give you both
sin
andcos
this could significantly improve the accuracy! It should only cost a few cycles but will be able to provide much better relative tolerances.