nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License

inline many core functions #1349

Closed perrydv closed 11 months ago

perrydv commented 1 year ago

This PR changes the following C++ functions in Utils.h to be inlined. That is, they are now declared with the C++ keyword inline, whereas previously each was a library function that on-the-fly compiled code would link to. The compiler can therefore place the function body directly at each call site and compile it there, which reduces function call overhead and presumably allows further compiler optimizations. The motivation: a user reported that the new pow_int in version 1.0.0 caused a surprisingly large slowdown compared to previous versions, and this was traced to pow becoming pow_int in user-defined code.
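To illustrate what "inlined" means here, below is a minimal sketch of a header-defined integer power function. It is not the actual NIMBLE code: the name pow_int_sketch, its signature, and the exponentiation-by-squaring body are all assumptions made for illustration. The point is only that the definition lives in the header with the inline keyword, so the compiler sees the body wherever it is called.

```cpp
// Hypothetical sketch; NOT the actual Utils.h implementation of pow_int.
// Because the definition is in the header and marked `inline`, every
// translation unit that includes it sees the body, so the compiler may
// expand it at the call site instead of emitting a call into a library.
inline double pow_int_sketch(double base, int exponent) {
    // Widen to long long so that negating the most negative int is safe.
    long long n = exponent;
    const bool negative = n < 0;
    if (negative) n = -n;

    // Exponentiation by squaring: O(log n) multiplications.
    double result = 1.0;
    while (n > 0) {
        if (n & 1) result *= base;  // include this power of the base
        base *= base;               // square for the next bit
        n >>= 1;
    }
    return negative ? 1.0 / result : result;
}
```

Note that inline is a request rather than a guarantee; whether a given call is actually expanded in place is up to the compiler and the optimization level.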

This PR inlines the following (with notes on performance [before --> after] on a Mac for 10 million calls, timed in seconds):

Cases without notes went from about 0.03-->2e-6, and that 0.03 might be just the baseline cost of packing up function calls.

All of these seem fast both before and after the changes (these are cumulative times for 10 million calls). Yet pow_int was the source of a serious slowdown, so perhaps the overhead and costs compound in more complex compiled code that uses lots of memory, etc. The benchmarks I ran were isolated, with nothing else going on.
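For anyone wanting to reproduce this kind of isolated timing, here is a rough sketch of a harness using std::chrono. It is not the benchmark used for the numbers above; square_sketch, TimingResult, and time_calls are hypothetical names. The loop accumulates a checksum and returns it, since once a function body is visible to the compiler, a loop whose results are never used can be simplified or dropped, which is one possible reason very small "after" times can appear in isolated tests.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for one of the inlined utility functions.
inline double square_sketch(double x) { return x * x; }

struct TimingResult {
    double seconds;   // wall-clock time for the whole loop
    double checksum;  // accumulated result, returned so the work is observable
};

// Time n_calls invocations of square_sketch and return the elapsed seconds
// together with the sum of the results.
inline TimingResult time_calls(std::size_t n_calls) {
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (std::size_t i = 0; i < n_calls; ++i) {
        sum += square_sketch(static_cast<double>(i % 7));
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return {elapsed.count(), sum};
}
```

Calling time_calls(10000000) and printing both fields mirrors the 10-million-call setup; even so, an optimizer may still vectorize or partly fold such a loop, so the result is best read as a lower bound on real-world cost.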

danielturek commented 1 year ago

@perrydv The performance results are remarkable.

Who ever knew the power of the cpp keyword inline?

perrydv commented 1 year ago

Thanks @danielturek. Well, Eigen, for example, makes heavy use of inlining. It will be interesting to see what kind of net performance gains these changes give, or whether they are swamped by other costs.