aklomp / base64

Fast Base64 stream encoder/decoder in C99, with SIMD acceleration
BSD 2-Clause "Simplified" License

Add `BASE64_FORCE_INLINE` macro to always inline inner loop functions #136

Closed by aklomp 6 months ago

aklomp commented 7 months ago

Add a BASE64_FORCE_INLINE macro that has the effect of ensuring that a function is always inlined, even when the compiler would normally not inline it (e.g. due to disabling optimizations or when doing certain debug builds).
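For readers unfamiliar with the technique, a force-inline macro is typically built from compiler-specific attributes. The following is a minimal sketch under that assumption, not necessarily the definition used in this PR; the helper `example_mask6` is hypothetical and exists only to show the macro in use.

```c
#include <stdint.h>

/* Sketch only: the definition in the library itself may differ. */
#if defined(_MSC_VER)
#  define BASE64_FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define BASE64_FORCE_INLINE inline __attribute__((always_inline))
#else
   /* Unknown compiler: fall back to a plain inlining hint. */
#  define BASE64_FORCE_INLINE inline
#endif

/* Hypothetical helper: keep only the low six bits of a value.
 * With the macro applied, GCC and Clang inline it even at -O0. */
static BASE64_FORCE_INLINE uint32_t
example_mask6(uint32_t x)
{
	return x & 0x3Fu;
}
```

On GCC and Clang, `always_inline` makes the compiler inline the function regardless of optimization level; on MSVC, `__forceinline` is a stronger request than `inline`, though the compiler may still decline in some cases.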

This macro is applied to a number of very hot inner loop functions that were always intended to be fully inlined, such as the various enc_translate and enc_reshuffle functions, but which were broken out into separate functions to make the data flow easier to follow. Making them separate functions had the side effect that the compiler would sometimes choose not to inline them. Applying this macro respects the author's intent, and ensures that the library is performant even when building with few or no optimizations.
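The pattern described above can be illustrated with a scalar stand-in. In this sketch, `example_reshuffle`, `example_translate`, and `example_encode` are hypothetical names, not the library's SIMD `enc_reshuffle` and `enc_translate` functions, and the macro definition is an assumed one that covers only the common compilers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed definition, for illustration only. */
#if defined(_MSC_VER)
#  define BASE64_FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define BASE64_FORCE_INLINE inline __attribute__((always_inline))
#else
#  define BASE64_FORCE_INLINE inline
#endif

/* Step 1 (hypothetical): split three input bytes into four 6-bit indices. */
static BASE64_FORCE_INLINE void
example_reshuffle(const uint8_t *src, uint8_t idx[4])
{
	idx[0] = (uint8_t) (src[0] >> 2);
	idx[1] = (uint8_t) (((src[0] & 0x03) << 4) | (src[1] >> 4));
	idx[2] = (uint8_t) (((src[1] & 0x0F) << 2) | (src[2] >> 6));
	idx[3] = (uint8_t) (src[2] & 0x3F);
}

/* Step 2 (hypothetical): map a 6-bit index to its Base64 character. */
static BASE64_FORCE_INLINE uint8_t
example_translate(uint8_t v)
{
	static const char table[] =
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
		"abcdefghijklmnopqrstuvwxyz"
		"0123456789+/";

	return (uint8_t) table[v & 0x3F];
}

/* Hot loop: reads as two separate steps, but both helpers are meant
 * to be folded into the loop body rather than called. */
static void
example_encode(const uint8_t *src, size_t ngroups, uint8_t *dst)
{
	for (size_t i = 0; i < ngroups; i++) {
		uint8_t idx[4];

		example_reshuffle(src + 3 * i, idx);
		for (int j = 0; j < 4; j++) {
			dst[4 * i + j] = example_translate(idx[j]);
		}
	}
}
```

Without the macro, a build at `-O0` would typically emit a real call for each helper on every iteration. With it, the source keeps the readable split into steps while the generated loop body stays call-free on compilers that honor the attribute.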

Tests show that this increases benchmark scores for 32-bit SSSE3 decoding, and the effect is probably similar on other platforms.