Closed Amanieu closed 8 years ago
This optimization isn't beneficial on x86 and or1k since both of those architectures require a call
instruction to perform PC-relative addressing, which defeats the point of inlining.
This optimization isn't beneficial on x86 and or1k since both of those architectures require a call instruction to perform PC-relative addressing, which defeats the point of inlining.
This is not strictly true. Even if OR1K had a dedicated PC-relative addressing instruction, it still would not benefit from this optimization, as no OR1K implementation makes calls comparatively expensive here.
Great! It seems to pass all of my tests! Also, swap went from 31ns down to 21ns on my Mac -> VirtualBox -> Ubuntu system.
This significantly improves performance and even fixes #59 as a bonus!