rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.7k stars 12.49k forks source link

Tracking Issue for num_midpoint #110840

Open Urgau opened 1 year ago

Urgau commented 1 year ago

Feature gate: #![feature(num_midpoint)] and #![feature(const_num_midpoint)]

This is a tracking issue for the midpoint function to {u,i}{8,16,32,64,128,size}, NonZeroU{8,16,32,64,size} and f{32,64}.

The midpoint function calculates the middle point of lhs and rhs.

Public API

impl {u,i}{8,16,32,64,128,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl NonZeroU{8,16,32,64,size} {
    pub const fn midpoint(self, rhs: Self) -> Self;
}

impl f{32,64} {
    pub fn midpoint(self, rhs: Self) -> Self;
}

Steps / History

Unresolved Questions

barryfam commented 1 year ago

If this is still open for comments, I'd like to support rounding towards the first argument.

In binary search, a.midpoint(b) will eventually be called where a and b are adjacent integers. If b is an exclusive range bound, it is helpful to know the function will always return a and not b. [insert Zeno's Paradox meme]

Another reason is for symmetry of step size; see CppCon 2019 talk on C++'s std::midpoint, question at 1:00:00:

Q: [Why is midpoint(a, b) different from midpoint(b, a)?]

A: Partially it's, when you think about it, you're going midway from a to b or midway from b to a. But — I suspect that this comes from the people, who were interested in the beginning — if you're doing simulations and you're stepping around a space where you're measuring things as you step, you start at 1,000 and you step halfway over and over again, you want those steps to be consistent. It doesn't matter which direction you're coming from. If you're coming from +1,000 toward 0 or -1,000 up to 0, you want those steps to be the same size whether you're going counting down or counting up.

In other words, they want:

midpoint(7, 0) = 4    // vs. "round to -∞" would be 3
midpoint(-7, 0) = -4
scottmcm commented 1 year ago

In binary search, a.midpoint(b) will eventually be called where a and b are adjacent integers.

I don't find this part persuasive, as ever binary search I've ever seen has had a ≤ b as an invariant, and thus round-to-first and round-to-negative-infinity are the same.

you want those steps to be consistent

Can you elaborate on the value of "consistent" here? I would have thought that if one needed predictable and consistent steps, they'd use a power-of-two number of quanta, since otherwise each step isn't half the size of the previous one.


I guess this boils down to midpoint(a, b) == midpoint(b, a) and midpoint(-a, -b) == -midpoint(a, b) both being "obvious" expectations, but they're incompatible with each other. (Kind of like how -a / -b = -(a / b) and a % b ε [0, |b|) are also "obvious" but incompatible.)

I'm personally still of the opinion that being consistent with (a + b) / 2 (for unsigned values where that doesn't overflow) is the way to go. After all, even P0811R3 used that as the example of how people would write it if that didn't overflow. And it sure would be nice to let clippy say "replace (a + b) / 2 with midpoint" without needing to warn that it might change the behaviour in non-overflow cases too.


And since it was in the PR, not the tracking issue, cc https://github.com/rust-lang/rust/pull/92048#issuecomment-1515466697 where I argued for rounding to -∞ instead of to first.


Also, FWIW, cranelift has https://docs.rs/cranelift/latest/cranelift/prelude/trait.InstBuilder.html#method.avg_round for (a + b + 1) >> 1 without overflow, which is rounding to +∞. I'm not sure why it picked that one, but if it has it from WASM that might be an interesting proposal to explore for rationale too.


EDIT: I happened to stumble on this post accidentally today https://devblogs.microsoft.com/oldnewthing/20220207-00/?p=106223, which is also about being consistent with (a + b)/2.

bluebear94 commented 10 months ago

For x86 targets, wouldn’t it be possible to implement {u32, u64}::num_midpoint using the RCR instruction?

scottmcm commented 8 months ago

@bluebear94 Exactly what assembly instructions is more up to LLVM than to us. For now the SHLD approach it gives us is pretty good -- only one instruction more -- someone could always open an LLVM bug for a different instruction later.

RustyYato commented 7 months ago

Why doesn't f32::midpoint upcast to f64 to calculate the midpoint? This should give the same results (excluding NaNs, which may have different bit values). This should enable a very simple branchless implementation, Godbolt+/+2.0)+as+f32%0A%7D%0A%0A%23%5Binline(never)%5D%0Apub+fn+midpoint_std(x:+f32,+y:+f32)+-%3E+f32+%7B%0A++++x.midpoint(y)%0A%7D%0A'),l:'5',n:'0',o:'Rust+source+%231',t:'0')),k:67.88202884459568,l:'4',m:100,n:'0',o:'',s:0,t:'0'),(g:!((g:!((h:compiler,i:(compiler:nightly,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:rust,libs:!(),options:'-Copt-level%3D3',overrides:!(),selection:(endColumn:30,endLineNumber:7,positionColumn:30,positionLineNumber:7,selectionStartColumn:30,selectionStartLineNumber:7,startColumn:30,startLineNumber:7),source:1),l:'5',n:'0',o:'+rustc+nightly+(Editor+%231)',t:'0')),k:32.117971155404305,l:'4',m:67.779632721202,n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(compilerName:'rustc+1.75.0',editorid:1,fontScale:14,fontUsePx:'0',j:1,wrap:'1'),l:'5',n:'0',o:'Output+of+rustc+nightly+(Compiler+%231)',t:'0')),header:(),l:'4',m:32.220367278798,n:'0',o:'',s:0,t:'0')),k:32.117971155404305,l:'3',n:'0',o:'',t:'0')),l:'2',n:'0',o:'',t:'0')),version:4)

fn midpoint(a: f32, b: f32) {
    ((a as f64) + (b as f64) / 2.0) as f32
}

I checked this with kani, and for all finite values it is bit-wise identical to the current implementation and they both output NaNs at the same time.

NOTE: that there aren't any calls to kani::assume, so this works for every pair of f32.

#[kani::proof]
fn test_midpoint() {
    let a = kani::any::<f32>();
    let b = kani::any::<f32>();

    // my proposed implementation
    let c = (a as f64 + b as f64) / 2.0;
    let c = c as f32;

    let d = a.midpoint(b); // the current std implementation

    if c.is_nan() {
        assert!(d.is_nan())
    } else {
        assert_eq!(c, d)
    }
}

Run kani with this command NOTE: this needs --no-overflow-checks because by default kani checks that addition doesn't cause NaNs, but we don't care about that here and there isn't any possibility of overflow anyways.

cargo kani --no-overflow-checks
Urgau commented 7 months ago

@RustyYato As noted in the implementation PR, f64 and f32 follows the libcxx implementation.

This should enable a very simple branchless implementation, Godbolt+/+2.0)+as+f32%0A%7D%0A%0A%23%5Binline(never)%5D%0Apub+fn+midpoint_std(x:+f32,+y:+f32)+-%3E+f32+%7B%0A++++x.midpoint(y)%0A%7D%0A'),l:'5',n:'0',o:'Rust+source+%231',t:'0')),k:67.88202884459568,l:'4',m:100,n:'0',o:'',s:0,t:'0'),(g:!((g:!((h:compiler,i:(compiler:nightly,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:rust,libs:!(),options:'-Copt-level%3D3',overrides:!(),selection:(endColumn:30,endLineNumber:7,positionColumn:30,positionLineNumber:7,selectionStartColumn:30,selectionStartLineNumber:7,startColumn:30,startLineNumber:7),source:1),l:'5',n:'0',o:'+rustc+nightly+(Editor+%231)',t:'0')),k:32.117971155404305,l:'4',m:67.779632721202,n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(compilerName:'rustc+1.75.0',editorid:1,fontScale:14,fontUsePx:'0',j:1,wrap:'1'),l:'5',n:'0',o:'Output+of+rustc+nightly+(Compiler+%231)',t:'0')),header:(),l:'4',m:32.220367278798,n:'0',o:'',s:0,t:'0')),k:32.117971155404305,l:'3',n:'0',o:'',t:'0')),l:'2',n:'0',o:'',t:'0')),version:4)

Feel free to send a PR changing the implementation (code source).

RustyYato commented 7 months ago

Alright, I've put up https://github.com/rust-lang/rust/pull/121062, :crossed_fingers:

tgross35 commented 1 month ago

https://github.com/rust-lang/rust/blob/e35364a521372ce682e4bd4a5850d97ea33b0eab/library/core/src/num/f64.rs#L1052-L1071

I think the first condition could have an || a.is_sign_positive() !- b.is_sign_positive(), since this is another never overflowing case.

Also, MIN_POSITIVE is the min positive normal number, like in the C++ implementation, but we might be able to adjust this to f32::From_bits(0b10) (2 * min subnormal) since Rust shouldn't flush subnormals (cc https://github.com/rust-lang/rfcs/pull/3514).

The default branch's comment is incorrect too, that operation is the safe one. I have a commit to update it in https://github.com/rust-lang/rust/pull/127027.

ronnodas commented 4 days ago

I guess this boils down to midpoint(a, b) == midpoint(b, a) and midpoint(-a, -b) == -midpoint(a, b) both being "obvious" expectations, but they're incompatible with each other. (Kind of like how -a / -b = -(a / b) and a % b ε [0, |b|) are also "obvious" but incompatible.)

Round-towards-zero (or round-away-from-zero) satisfies both of those properties so they're not incompatible.