TrenchBoot / landing-zone

An open source implementation of an AMD-V Secure Loader.
GNU General Public License v2.0

sha1sum: Optimise rol() helper #5

Closed: andyhhp closed this 5 years ago

andyhhp commented 5 years ago

Forcing the rotate count into %cl is inefficient. Expressing the rotate in a form the compiler understands lets it generate code using the rol $imm form of the instruction whenever the count is a compile-time constant.
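As a rough illustration of the difference (not the actual patch), the sketch below contrasts the two styles. The names `rol32_asm` and `rol32` are hypothetical and the real helper in sha1sum may look different. The first variant assumes GCC-style inline assembly on x86 and pins the count to %cl with the "c" constraint; the second is plain C using the shift-or idiom, which GCC and Clang generally recognise as a rotate.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/*
 * Hypothetical "before" shape: the "c" constraint forces the count into
 * %cl, so the compiler must load %cl before every rotate, even when the
 * count is a constant known at compile time.
 */
static inline uint32_t rol32_asm(uint32_t val, unsigned char count)
{
    __asm__ ("roll %%cl, %0" : "+r" (val) : "c" (count) : "cc");
    return val;
}
#endif

/*
 * Hypothetical "after" shape: the rotate is written in plain C. Both
 * shift counts are masked to 0..31, so the expression is well defined
 * for any count, including 0 and 32. Once this is inlined with a
 * constant count, the compiler is free to pick the immediate form of
 * the rotate instruction.
 */
static inline uint32_t rol32(uint32_t val, unsigned int count)
{
    count &= 31;
    return (val << count) | (val >> ((32 - count) & 31));
}
```

SHA-1 only rotates by small fixed counts, so a call such as `rol32(a, 5)` in the round function is the kind of site where an immediate-form rotate would be expected at -Os.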

For an -Os build, bloat-o-meter reports:

    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-879 (-879)
    Function                                     old     new   delta
    transform                                   5350    4471    -879
    Total: Before=49827, After=48948, chg -1.76%

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

dpsmith commented 5 years ago

@andyhhp please push an update with @krystian-hebel's RB/TB (Reviewed-by/Tested-by) tags and I will merge.