MFlowCode / MFC

Exascale simulation of multiphase/physics fluid dynamics
https://mflowcode.github.io
MIT License

Link-time optimization and inlining #328

Open sbryngelson opened 6 months ago

sbryngelson commented 6 months ago

A possible fix for the performance degradation seen on NVHPC when calling subroutines across modules: https://forums.developer.nvidia.com/t/nvhpc-23-11-fortran-does-it-inline-public-subroutines-across-modules/281047/2

Though for cross-file inlining you can try the two-pass method. First, create an inline extract library by compiling all files with “-Mextract=lib:libname”, replacing “libname” with what you’d like to call it. Then compile with “-Minline=lib:libname” to use the extract library. Inlining is performed prior to the device code generation.
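As a rough illustration of the two-pass method quoted above, here is a minimal shell sketch. The extract library name `mfc_inline` and the source files `m_a.f90` and `m_b.f90` are hypothetical placeholders, not MFC's actual file list, and the project's usual flags (optimization, OpenACC, include paths) are left out. Only the `-Mextract=lib:` and `-Minline=lib:` options come from the forum reply. Whether pass 1 also needs `-c` is not stated there, so it is omitted here as an assumption. By default the script only prints the commands (dry run).

```shell
#!/bin/sh
# Two-pass cross-file inlining sketch for nvfortran.
# Hypothetical names (not from the issue): mfc_inline, m_a.f90, m_b.f90.
LIB=mfc_inline
SRCS="m_a.f90 m_b.f90"
FC="${FC:-nvfortran}"
RUN="${RUN-echo}"   # dry run by default; invoke with RUN= to actually execute

two_pass() {
    # Pass 1: build the inline extract library from every source file.
    for f in $SRCS; do
        $RUN "$FC" -Mextract=lib:"$LIB" "$f"
    done
    # Pass 2: compile for real, letting the inliner pull from the library.
    for f in $SRCS; do
        $RUN "$FC" -Minline=lib:"$LIB" -c "$f"
    done
}

two_pass
```

In a real build the source list would have to follow the module dependency order, and the same two passes would presumably be wired into the build system rather than run by hand.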