Closed nakedible closed 8 months ago
Basically the method I used was that I made a bunch of simple wrapper functions like this:
#[inline(never)]
fn try_expand(v: RangedI32<3, 63>) -> RangedI32<-2, 64> {
v.expand()
}
And then used cargo asm
to dump the function before and after removing the unsafe bit, and ensuring that there was no change in the assembly output:
deranged::try_expand:
mov eax, edi
ret
I might be able to wrap these in some testing setup where it would be possible to automatically dump the assembly of all relevant functions to a file. It's often beneficial to be able to see how the assembly changes when changing the implementation – for example if suddenly there's the addition of a panic handler or something that is unexpected, or a zero runtime cost abstraction ends up having runtime cost.
Doing that would help to an extent, but ultimately I've got to be as certain as possible it's a zero-cost abstraction. deranged
is depended upon by time
, which gets hundreds of thousands of downloads every day. I am putting quite a bit of effort into optimization nowadays for a few different reasons. Even a small change will have a notable impact.
If you ultimately don't want to trust the compiler that it is able to optimize away cases like this, then that's a clear position to take – me attempting to show that the compiler can in fact deduce the checks away is not going to change that.
I might take one more stab at this by formulating the change in a different way, but this second commit will need to be fully rewritten.
Closing per my comment above. I created a trivial case with a demonstrable difference in generated assembly, which is not an acceptable tradeoff.
The compiler is smart in removing comparisons that it already knows are not true. This allows us to reduce the use of
unsafe
in safe functions without affecting the produced code.Each one of these code changes has been tested to not produce any difference in the generated assembly in a release build (with the input given as
std::hint::black_box
so the compiler won't optimize on the static value given in code). Of course no guarantees can be given that every Rust compiler on every platform always will make the same optimizations.I consider
new_saturating
to be a fundamental and useful method to expose publicly, but I left it here aspub(crate)
as to not change the public API of the library.