Closed tkoenig1 closed 1 year ago
This transformation is only useful when optimizing, but not when optimizing for size (via -Os).
Once hardware is available, this can be reconsidered based on timing measurements.
Try again with commit af7ee91ac91db0ac489c686eb35daf15ceb1f32f. Transformation not done when compiling for size.
Code is now
carry r2,{O}
mul r1,r1,#5675921253449092805
srl r1,r2,<0:2>
ret
so this is now implemented. Thanks!
This optimization generates larger code and often uses an extra register. As for whether it is faster, that is not clear until we have hardware.
This optimization has been removed.
... by multiplication with the inverse, as described in Hacker's Delight and other sources.
gives a straightforward, but slow
compared to (same compiler, x86)
which is likely to be much faster.
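For readers who want to see the technique in isolation, here is a minimal C sketch of division by multiplication with the inverse. It reuses the constant from the mul instruction quoted in this thread. The divisor 13 is an assumption: it is inferred from the fact that 13 * 5675921253449092805 = 2^66 + 1, and the original source code is not shown here. The names MAGIC_13, div13_by_inverse and check_div13 are hypothetical, and unsigned __int128 is a GCC/Clang extension rather than standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiplier copied from the `mul` instruction quoted in this thread.
   Assumption: the divisor is 13, inferred from 13 * MAGIC_13 == 2^66 + 1. */
#define MAGIC_13 UINT64_C(5675921253449092805)

/* Hypothetical helper: computes n / 13 without a divide instruction. */
static uint64_t div13_by_inverse(uint64_t n)
{
    /* Full 64x64 -> 128-bit product (unsigned __int128 is a compiler extension). */
    unsigned __int128 product = (unsigned __int128)n * MAGIC_13;

    /* Take the high 64 bits of the product, then shift right by 2.
       Overall this is floor(n * MAGIC_13 / 2^66). */
    return (uint64_t)(product >> 64) >> 2;
}

/* Spot check against ordinary division on a few values, including the extremes. */
static void check_div13(void)
{
    const uint64_t samples[] = {0, 1, 12, 13, 14, 25, 26,
                                UINT64_C(1000000007), UINT64_MAX};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(div13_by_inverse(samples[i]) == samples[i] / 13);
}
```

The high-half multiply followed by a shift by 2 appears to correspond to the carry/mul/srl sequence above. It also shows the trade-off discussed in this thread: the 64-bit constant and the extra register for the high half cost code size, which is why the transformation is skipped under -Os.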