Open shahmoradi opened 1 year ago
Why would this need to be intrinsic? Can't you define this as an elemental generic?
We could ask the same question about other existing intrinsics, like `abs()`, `cosd()`, `sind()`.
Indeed, I have been using FPP macros for `cosd()`, `sind()`.
I see multiple paths forward:
`absq()` to achieve such simple generic functionality. In my current use case, the function has to be called inside a triply-nested loop. I am not ready yet to endure such pain and potential performance impact. So, instead, I have been using FPP macros to handle this usage gracefully.
`absq()`.
For now, I am stuck with option 4.
Would this be a good fit for stdlib?
- generic user-defined template function. Possibly a good solution. I look forward to using it in the next 5-10 years.
F'2023 is done. You won't get anything new, including your `absq`, into a Fortran standard until F'202Y, where Y is at least 8, or into a Fortran compiler until a few years after that. So by the time you get `absq`, assuming that you can get it into a Fortran standard, you'll also have the generic user-defined template functions and you won't need a standard `absq`. (Assuming that the template feature is capable of implementing `absq`, that is.)
It would be an excellent addition to stdlib if we ensure the compilers inline all usage.
I hoped this would be the case, but I have noticed performance degradations using (a similar functionality to) stdlib `optval()`. I need to find and rerun the benchmarks to quantify the impact here.
Alternatively, could stdlib define an FPP `include` (header) file with macros for such frequently used functionalities?
I am unsure whether all available compilers follow the same FPP conventions to allow defining such an FPP `include` file.
Intel and GNU FPP overlap significantly (though incompletely), but I do not know about the others.
Why cannot a compiler optimizer automatically optimize `abs(x)**2`?
A frequent type-agnostic calculation is `abs(x)**2`, where `x` is `complex` or `real` (or `integer`). While the above expression is generic, it involves a costly but, more importantly, unnecessary square-root operation for values of type `complex`. The current community solution is to define a preprocessor macro that properly and efficiently computes the result for arguments of various types. This can be avoided by adding a new intrinsic function `absq(x)` that returns:
- `x**2` if `x` is `real` or `integer`.
- `x%re**2 + x%im**2` if `x` is `complex`.
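For comparison, the two cases in the proposal can be sketched today as a user-defined elemental generic, the alternative raised in the discussion. This is only a minimal sketch under stated assumptions: the module name `absq_sketch`, the specific procedure names, and the choice of kinds (default `integer`, `real64`) are hypothetical and chosen for illustration. It is not a proposed stdlib or standard interface, and it says nothing about whether a given compiler will inline the calls.

```fortran
module absq_sketch
    ! Hypothetical module: a user-defined generic standing in for the proposed absq().
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    private
    public :: absq

    ! One generic name resolving to a specific procedure per argument type.
    interface absq
        module procedure absq_int, absq_real64, absq_cmplx64
    end interface absq

contains

    ! integer case: x**2
    elemental function absq_int(x) result(r)
        integer, intent(in) :: x
        integer :: r
        r = x**2
    end function absq_int

    ! real case: x**2
    elemental function absq_real64(x) result(r)
        real(real64), intent(in) :: x
        real(real64) :: r
        r = x**2
    end function absq_real64

    ! complex case: squared modulus without the square root taken by abs()
    elemental function absq_cmplx64(x) result(r)
        complex(real64), intent(in) :: x
        real(real64) :: r
        r = x%re**2 + x%im**2
    end function absq_cmplx64

end module absq_sketch

program absq_demo
    use, intrinsic :: iso_fortran_env, only: real64
    use absq_sketch, only: absq
    implicit none
    complex(real64) :: z(2)

    z = [(3.0_real64, 4.0_real64), (0.0_real64, 1.0_real64)]

    print *, absq(3)             ! integer: 3**2
    print *, absq(-2.0_real64)   ! real: (-2)**2
    print *, absq(z)             ! elemental over an array: 3**2 + 4**2 and 0**2 + 1**2
end program absq_demo
```

Because each specific procedure is `elemental`, the same generic name works on scalars and arrays of any rank. Supporting additional kinds means adding one specific procedure per kind, which is the repetition that a preprocessor macro or a future template feature would remove.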