Inline the swap trampoline on x86_64 and AArch64

edef1c / libfringe

a Rust library implementing safe, lightweight context switches, without relying on kernel services

https://edef1c.github.io/libfringe

Apache License 2.0


Inline the swap trampoline on x86_64 and AArch64 #62

Closed Amanieu closed 8 years ago

Amanieu commented 8 years ago

This significantly improves performance and even fixes #59 as a bonus!

Amanieu commented 8 years ago

This optimization isn't beneficial on x86 and or1k, since both of those architectures require a call instruction to perform PC-relative addressing, which defeats the point of inlining.
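To illustrate the point about PC-relative addressing, here is a minimal sketch of how the current code address can be obtained on each architecture. It is not libfringe's trampoline code. The helper name `current_pc` is hypothetical, and the fallback branch for other targets is only a stand-in so that the example compiles everywhere. x86_64 and AArch64 can form a PC-relative address with a single instruction (`lea` with `rip`, or `adr`), while 32-bit x86 has to `call` the next instruction and `pop` the return address.

```rust
use std::arch::asm;

// Hypothetical helper: returns an address inside (or near) this function.

#[cfg(target_arch = "x86_64")]
fn current_pc() -> usize {
    let pc: usize;
    // RIP-relative addressing: a single `lea`, no call needed.
    unsafe { asm!("lea {0}, [rip]", out(reg) pc, options(nomem, nostack, preserves_flags)) };
    pc
}

#[cfg(target_arch = "aarch64")]
fn current_pc() -> usize {
    let pc: usize;
    // `adr` forms a PC-relative address; `.` is the current location.
    unsafe { asm!("adr {0}, .", out(reg) pc, options(nomem, nostack, preserves_flags)) };
    pc
}

#[cfg(target_arch = "x86")]
fn current_pc() -> usize {
    let pc: usize;
    // 32-bit x86 has no PC-relative data addressing: call the next
    // instruction and pop the pushed return address off the stack.
    unsafe { asm!("call 2f", "2:", "pop {0}", out(reg) pc) };
    pc
}

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64", target_arch = "x86")))]
fn current_pc() -> usize {
    // Assumption: on other targets, fall back to the function's own address.
    current_pc as usize
}

fn main() {
    println!("code address: {:#x}", current_pc());
}
```

The extra `call`/`pop` pair on 32-bit x86 is the cost referred to above: if the inlined sequence needs a call anyway just to find its own address, inlining saves little over calling an out-of-line trampoline.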

whitequark commented 8 years ago

> This optimization isn't beneficial on x86 and or1k, since both of those architectures require a call instruction to perform PC-relative addressing, which defeats the point of inlining.

This is not strictly true. Even if OR1K had a dedicated PC-relative addressing instruction, it would still not benefit from this optimization, as no OR1K implementation penalizes calls relative to other instructions here.

MarkSwanson commented 8 years ago

Great! It seems to pass all of my tests! Also, swap went from 31 ns down to 21 ns on my Mac -> VirtualBox -> Ubuntu system.