Here are the performance optimizations I discussed. Most of them target the expm computations in spartacus_sw. In my tests, the solver was about 45% faster (single core on an Intel laptop, ifort -O3).
This pull request also includes a fix to the two-stream kernels that removes the need for any double-precision variables or computations; instead, the minimum "k" parameter is adjusted according to the working precision. This change needs more testing: I only tested the McICA and Spartacus shortwave and longwave solvers, and found that net upwelling and downwelling fluxes for the test IFS profiles changed by less than 0.1 W/m2. The benefit is faster single-precision computation: mcica_sw and mcica_lw were 20-25% faster in my tests. If you prefer, I can drop this change and open a pull request for spartacus_sw only.
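To make the idea concrete, here is a minimal sketch of a precision-dependent lower bound on "k". It is not the code in this pull request: the scaling `1000 * epsilon`, the kind parameter `jprb` as defined here, and the sample gamma values are all assumptions chosen for illustration. The only formula taken as given is the standard two-stream relation k = sqrt((gamma1 - gamma2)(gamma1 + gamma2)).

```fortran
! Sketch only: the threshold formula and all values are illustrative assumptions.
program k_min_sketch
  implicit none
  ! Working precision: single here; use selected_real_kind(13, 300) for double
  integer, parameter :: jprb = selected_real_kind(6, 37)
  ! Lower bound on k that scales with the working precision (assumed form)
  real(jprb), parameter :: KMin = 1000.0_jprb * epsilon(1.0_jprb)
  real(jprb) :: gamma1, gamma2, k_exponent

  ! Equal gammas would give k = 0 if no bound were applied
  gamma1 = 1.0_jprb
  gamma2 = 1.0_jprb

  ! Everything stays in working precision; no double-precision temporaries
  k_exponent = max(sqrt(max((gamma1 - gamma2) * (gamma1 + gamma2), &
       &                    0.0_jprb)), KMin)

  print *, 'KMin       =', KMin
  print *, 'k_exponent =', k_exponent
end program k_min_sketch
```

Because the bound is derived from `epsilon`, changing `jprb` to a double-precision kind tightens it automatically, which is the general behaviour described above; the actual thresholds in the pull request may be fixed constants instead.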
Many thanks for this! Could I request the following changes please:
Rather than a separate routine calc_reflectance_transmittance_sw_opt, please provide the alternative definitions of calc_reflectance_transmittance_sw within #ifdef USE_RTE_REFTRANS_SW #else #endif. This routine is used in many more places than you have shown, and this approach lets us switch between the two versions at compile time everywhere it is called.
Modification dates should be valid dates (not ending in -xx); the exact date doesn't matter
No single-letter variables (so k becomes k_exponent); this makes searching through a file much easier
Lower case for variables (so RT_term becomes reftrans_factor) - see radiation/CONVENTIONS
Camel case for parameters (so k_min becomes KMin) - see radiation/CONVENTIONS
IFS conventions also require that the continuation character (&) is provided both at the end of a line and at the beginning of the continuation line
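The requests above could look roughly like the following sketch. It only illustrates the layout: the argument list, both routine bodies, the module name `reftrans_sketch`, and the value of `KMin` are placeholders invented for this example, not the real kernels. The file needs to be preprocessed (for example by using a `.F90` extension) for the `#ifdef` to take effect.

```fortran
! Sketch only: argument list, bodies and constants are placeholders.
module reftrans_sketch
  implicit none
  integer, parameter :: jprb = selected_real_kind(6, 37)
  ! Parameter in camel case (illustrative value)
  real(jprb), parameter :: KMin = 1.0e-4_jprb
contains

#ifdef USE_RTE_REFTRANS_SW
  ! Alternative definition, selected with -DUSE_RTE_REFTRANS_SW
  subroutine calc_reflectance_transmittance_sw(ng, od, gamma1, gamma2, &
       &                                       reftrans_factor)
    integer,    intent(in)  :: ng
    real(jprb), intent(in)  :: od(ng), gamma1(ng), gamma2(ng)
    real(jprb), intent(out) :: reftrans_factor(ng)
    real(jprb) :: k_exponent(ng)
    ! Whole-array form; "&" closes one line and opens the next
    k_exponent = max(sqrt(max((gamma1 - gamma2) * (gamma1 + gamma2), &
         &                    0.0_jprb)), KMin)
    reftrans_factor = exp(-k_exponent * od)
  end subroutine calc_reflectance_transmittance_sw
#else
  ! Default definition: same name and interface, so callers are unchanged
  subroutine calc_reflectance_transmittance_sw(ng, od, gamma1, gamma2, &
       &                                       reftrans_factor)
    integer,    intent(in)  :: ng
    real(jprb), intent(in)  :: od(ng), gamma1(ng), gamma2(ng)
    real(jprb), intent(out) :: reftrans_factor(ng)
    real(jprb) :: k_exponent
    integer :: jg
    do jg = 1, ng
      k_exponent = max(sqrt(max((gamma1(jg) - gamma2(jg)) &
           &                    * (gamma1(jg) + gamma2(jg)), 0.0_jprb)), KMin)
      reftrans_factor(jg) = exp(-k_exponent * od(jg))
    end do
  end subroutine calc_reflectance_transmittance_sw
#endif

end module reftrans_sketch

program reftrans_demo
  use reftrans_sketch
  implicit none
  real(jprb) :: reftrans_factor(2)
  call calc_reflectance_transmittance_sw(2, &
       &  [0.5_jprb, 1.0_jprb], [1.0_jprb, 2.0_jprb], [1.0_jprb, 1.0_jprb], &
       &  reftrans_factor)
  print *, reftrans_factor
end program reftrans_demo
```

Since both branches define the same routine name with the same interface, every existing call site picks up whichever version the compile-time flag selects, with no edits at the call sites.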
Thanks!