romange closed 7 years ago
If only I had some C++ points to award!
Thanks! The main challenge was dividing the unrolled code into generic invariants. I did it by making each step fully handle one integer: an input for packing and an output for unpacking. Once this was done, the rest was easy. The template part is relatively simple thanks to C++11. Also, I used plain "if" constructs where I could, knowing that the compiler's constant propagation would eliminate them completely.
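For readers who have not seen the pattern, here is a minimal sketch of the idea, not the code from this PR. It assumes blocks of 32 values packed into 32-bit words, and every name in it (`PackStep`, `UnpackStep`, `Pack`, `Unpack`) is hypothetical. Each template step fully handles one integer, and the plain "if" conditions depend only on compile-time constants, so an optimizing compiler is expected to fold them away.

```cpp
#include <cstdint>

// Illustrative sketch only; all names are hypothetical, not taken from the PR.
// Packs 32 values of kBits bits each into kBits 32-bit words.

// One packing step: fully handles input integer number kIndex.
template <unsigned kBits, unsigned kIndex>
struct PackStep {
  static void Run(const uint32_t* in, uint32_t* out) {
    static_assert(kBits >= 1 && kBits <= 32, "kBits must be in [1, 32]");
    constexpr unsigned kBitPos = kIndex * kBits;       // absolute bit offset
    constexpr unsigned kWord = kBitPos / 32;           // output word index
    constexpr unsigned kShift = kBitPos % 32;          // offset inside the word
    constexpr unsigned kSpill = (32 - kShift) % 32;    // shift for the high part
    constexpr uint32_t kMask = 0xFFFFFFFFu >> (32 - kBits);

    const uint32_t v = in[kIndex] & kMask;
    // Plain "if" on constants: only one branch survives constant propagation.
    if (kShift == 0) {
      out[kWord] = v;                 // first value in this word initializes it
    } else {
      out[kWord] |= v << kShift;
    }
    if (kShift + kBits > 32) {
      out[kWord + 1] = v >> kSpill;   // high bits spill into the next word
    }
    PackStep<kBits, kIndex + 1>::Run(in, out);
  }
};

// Recursion terminator: all 32 inputs have been handled.
template <unsigned kBits>
struct PackStep<kBits, 32> {
  static void Run(const uint32_t*, uint32_t*) {}
};

// One unpacking step: fully produces output integer number kIndex.
template <unsigned kBits, unsigned kIndex>
struct UnpackStep {
  static void Run(const uint32_t* in, uint32_t* out) {
    static_assert(kBits >= 1 && kBits <= 32, "kBits must be in [1, 32]");
    constexpr unsigned kBitPos = kIndex * kBits;
    constexpr unsigned kWord = kBitPos / 32;
    constexpr unsigned kShift = kBitPos % 32;
    constexpr unsigned kSpill = (32 - kShift) % 32;
    constexpr uint32_t kMask = 0xFFFFFFFFu >> (32 - kBits);

    uint32_t v = in[kWord] >> kShift;
    if (kShift + kBits > 32) {
      v |= in[kWord + 1] << kSpill;   // fetch the bits stored in the next word
    }
    out[kIndex] = v & kMask;
    UnpackStep<kBits, kIndex + 1>::Run(in, out);
  }
};

template <unsigned kBits>
struct UnpackStep<kBits, 32> {
  static void Run(const uint32_t*, uint32_t*) {}
};

// Entry points: `in` holds 32 values, `out` holds kBits words (and vice versa).
template <unsigned kBits>
void Pack(const uint32_t* in, uint32_t* out) {
  PackStep<kBits, 0>::Run(in, out);
}

template <unsigned kBits>
void Unpack(const uint32_t* in, uint32_t* out) {
  UnpackStep<kBits, 0>::Run(in, out);
}
```

With this sketch, `Pack<5>(values, packed)` writes 5 words from 32 inputs and `Unpack<5>(packed, values)` reverses it. Whether the recursion is fully inlined into straight-line code depends on the compiler and optimization level.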
Checked with
gcc -O3 -S
that the generated code is exactly the same. clang-3.8 shows differences in the assembly, but from what I saw it's mostly different registers and a slightly changed instruction order. The number of instructions is the same in both versions.