openmc-dev / openmc

OpenMC Monte Carlo Code
https://docs.openmc.org

Random Ray Inner Loop Optimization + Vectorization #2844

Open jtramm opened 10 months ago

jtramm commented 10 months ago

Performance of the random ray solver in OpenMC is highly sensitive to the performance of the flux attenuation kernel that forms the inner loop of the simulation. This inner loop performs the attenuation of the angular flux (along with source accumulation and attenuation) for a ray crossing a single flat source region, for all energy groups. The loop is written in a SIMD-friendly fashion over the energy groups, which allows for potential vectorization gains on most architectures (particularly important when the energy group count is high, less so for, e.g., 7-group problems). Getting the compiler to actually vectorize it may require some compiler hints (e.g., #pragma omp simd), loop reorganization, manual memory alignment intrinsics, and/or inlining of the exponential evaluation.
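For reference, here is a minimal sketch of the kind of loop in question, written so the compiler has a fair chance of vectorizing it. This is not the actual OpenMC kernel: the function name, argument names, and the assumption that the flat source has already been divided by the total cross section are all illustrative. The pragma only takes effect when compiled with OpenMP SIMD support (e.g., -fopenmp-simd on GCC/Clang) and is otherwise ignored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-ray, per-region attenuation step.
// All names are illustrative and do not match OpenMC's identifiers.
// Assumption: source[g] is already normalized by sigma_t[g].
inline void attenuate_flux(float distance,
                           const std::vector<float>& sigma_t,
                           const std::vector<float>& source,
                           std::vector<float>& angular_flux,
                           std::vector<float>& delta_psi)
{
  const std::size_t n = sigma_t.size();
  assert(source.size() == n && angular_flux.size() == n &&
         delta_psi.size() == n);

  // Raw pointers keep the loop body free of container calls.
  const float* st = sigma_t.data();
  const float* q = source.data();
  float* psi = angular_flux.data();
  float* dpsi = delta_psi.data();

  // Each energy group is independent, so the loop has no
  // loop-carried dependency and can be vectorized.
#pragma omp simd
  for (std::size_t g = 0; g < n; ++g) {
    const float tau = st[g] * distance;            // optical thickness
    const float exponential = 1.0f - std::exp(-tau);
    const float d = (psi[g] - q[g]) * exponential; // change in angular flux
    dpsi[g] = d;
    psi[g] -= d;                                   // attenuated angular flux
  }
}
```

Whether the call to std::exp vectorizes depends on the compiler and math library, which is one reason inlining a custom exponential approximation may be worth testing.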

@gridley has also proposed other potential optimizations for this kernel, e.g., templating the function to reduce branching and moving the locking operation.
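To make those two ideas concrete, here is one possible reading of them as a sketch (C++17). It assumes the branch in question is an active/inactive batch flag and that "moving the lock" means acquiring it only after the per-group math is done, and only when tallying. RegionTally, attenuate_flux_t, and attenuate_flux_dispatch are hypothetical names, not OpenMC code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical per-region tallies; not OpenMC's actual data layout.
struct RegionTally {
  std::mutex lock;
  std::vector<float> scalar_flux;
  double volume = 0.0;
};

// The active/inactive decision is a template parameter, so each
// instantiation contains no runtime branch on it.
template <bool IsActive>
void attenuate_flux_t(float distance,
                      const std::vector<float>& sigma_t,
                      const std::vector<float>& source,
                      std::vector<float>& angular_flux,
                      std::vector<float>& delta_psi,
                      RegionTally& region)
{
  const std::size_t n = sigma_t.size();
  assert(source.size() == n && angular_flux.size() == n &&
         delta_psi.size() == n);

  // Ray-private work: no lock is held here.
  for (std::size_t g = 0; g < n; ++g) {
    const float exponential = 1.0f - std::exp(-sigma_t[g] * distance);
    delta_psi[g] = (angular_flux[g] - source[g]) * exponential;
    angular_flux[g] -= delta_psi[g];
  }

  if constexpr (IsActive) {
    assert(region.scalar_flux.size() == n);
    // The lock covers only the shared accumulation.
    std::lock_guard<std::mutex> guard(region.lock);
    for (std::size_t g = 0; g < n; ++g) {
      region.scalar_flux[g] += delta_psi[g];
    }
    region.volume += distance;
  }
}

// The runtime flag is tested once per call, outside the kernel body.
inline void attenuate_flux_dispatch(bool is_active, float distance,
                                    const std::vector<float>& sigma_t,
                                    const std::vector<float>& source,
                                    std::vector<float>& angular_flux,
                                    std::vector<float>& delta_psi,
                                    RegionTally& region)
{
  if (is_active) {
    attenuate_flux_t<true>(distance, sigma_t, source, angular_flux,
                           delta_psi, region);
  } else {
    attenuate_flux_t<false>(distance, sigma_t, source, angular_flux,
                            delta_psi, region);
  }
}
```

In a real code the dispatch could be hoisted further up the call chain so the flag is checked once per ray or per batch rather than once per region crossing.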

jtramm commented 9 months ago

Another idea proposed by @gridley is to experiment with structure-of-arrays (SoA) vs. array-of-structures (AoS) layouts for the main source region/source element data structures. While this would likely slow down the iteration update functions, it may improve cache efficiency in the flux attenuation kernel.
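A small sketch of the two layouts, to make the trade-off easier to discuss. The field names are hypothetical and chosen only for illustration. The point is the access pattern: a kernel that reads a single field for consecutive energy groups walks memory with a stride of one struct in the AoS layout but with unit stride in the SoA layout, while a function that touches every field of an element sees the opposite locality.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AoS: one struct per (region, group) element; the fields of an
// element are adjacent in memory. Field names are hypothetical.
struct SourceElementAoS {
  float source;
  float scalar_flux_new;
  float scalar_flux_old;
};

// SoA: one contiguous array per field; the same field for
// consecutive groups is adjacent in memory.
struct SourceElementsSoA {
  std::vector<float> source;
  std::vector<float> scalar_flux_new;
  std::vector<float> scalar_flux_old;

  explicit SourceElementsSoA(std::size_t n)
    : source(n), scalar_flux_new(n), scalar_flux_old(n) {}
};

// Reads only `source` for n_groups consecutive elements.
// AoS: each read skips over the two unused fields.
inline float sum_source_aos(const std::vector<SourceElementAoS>& e,
                            std::size_t first, std::size_t n_groups)
{
  assert(first + n_groups <= e.size());
  float total = 0.0f;
  for (std::size_t g = 0; g < n_groups; ++g) {
    total += e[first + g].source;
  }
  return total;
}

// SoA: the same reads are contiguous, which suits SIMD loads.
inline float sum_source_soa(const SourceElementsSoA& e,
                            std::size_t first, std::size_t n_groups)
{
  assert(first + n_groups <= e.source.size());
  float total = 0.0f;
  for (std::size_t g = 0; g < n_groups; ++g) {
    total += e.source[first + g];
  }
  return total;
}
```

Which layout wins overall would have to be measured, since it depends on group count, cache sizes, and how much time is spent in the kernel relative to the update functions.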