rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.2k stars 12.7k forks source link

narrowing_rem, narrowing_and #72762

Open leonardo-m opened 4 years ago

leonardo-m commented 4 years ago

To remove some unsafe "as" casts and keep the code safe (lossless) and nice, sometimes I'd like to use a rem+cast. So what do you think about adding this to the stdlib?

trait NarrowRem<Out> {
    fn narrowing_rem(&self, den: Out) -> Out;
}

impl NarrowRem<u8> for u16 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u16::from(den)) as u8 }
}
impl NarrowRem<u8> for u32 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u32::from(den)) as u8 }
}
impl NarrowRem<u16> for u32 {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % u32::from(den)) as u16 }
}
impl NarrowRem<u8> for u64 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u64::from(den)) as u8 }
}
impl NarrowRem<u16> for u64 {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % u64::from(den)) as u16 }
}
impl NarrowRem<u32> for u64 {
    fn narrowing_rem(&self, den: u32) -> u32 { (*self % u64::from(den)) as u32 }
}
impl NarrowRem<u8> for u128 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u128::from(den)) as u8 }
}
impl NarrowRem<u16> for u128 {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % u128::from(den)) as u16 }
}
impl NarrowRem<u32> for u128 {
    fn narrowing_rem(&self, den: u32) -> u32 { (*self % u128::from(den)) as u32 }
}
impl NarrowRem<u64> for u128 {
    fn narrowing_rem(&self, den: u64) -> u64 { (*self % u128::from(den)) as u64 }
}

impl NarrowRem<i8> for i16 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i16::from(den)) as i8 }
}
impl NarrowRem<i8> for i32 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i32::from(den)) as i8 }
}
impl NarrowRem<i16> for i32 {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % i32::from(den)) as i16 }
}
impl NarrowRem<i8> for i64 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i64::from(den)) as i8 }
}
impl NarrowRem<i16> for i64 {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % i64::from(den)) as i16 }
}
impl NarrowRem<i32> for i64 {
    fn narrowing_rem(&self, den: i32) -> i32 { (*self % i64::from(den)) as i32 }
}
impl NarrowRem<i8> for i128 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i128::from(den)) as i8 }
}
impl NarrowRem<i16> for i128 {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % i128::from(den)) as i16 }
}
impl NarrowRem<i32> for i128 {
    fn narrowing_rem(&self, den: i32) -> i32 { (*self % i128::from(den)) as i32 }
}
impl NarrowRem<i64> for i128 {
    fn narrowing_rem(&self, den: i64) -> i64 { (*self % i128::from(den)) as i64 }
}

impl NarrowRem<u8> for usize {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % usize::from(den)) as u8 }
}
impl NarrowRem<u16> for usize {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % usize::from(den)) as u16 }
}

impl NarrowRem<i8> for isize {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % isize::from(den)) as i8 }
}
impl NarrowRem<i16> for isize {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % isize::from(den)) as i16 }
}

An example usage:

fn main() {
    let _x: u8 = 152_u64.narrowing_rem(51_u8);
}

D language performs this lossless cast operation automatically and transparently (while it doesn't perform lossy casts silently):

uint foo(in ulong x) { // Error: cannot implicitly convert
    return x;
}
uint bar(in ulong x) { // OK
    return x % 1000;
}
void main() {}
Lonami commented 4 years ago

Why not the following?

let _x: u8 = (152_u64 % 51_u64) as u8;
leonardo-m commented 4 years ago

Because it contains a naked "as" in user code.

Lonami commented 4 years ago

There is nothing wrong with as in user code, it's not unsafe to use. Furthermore, you probably know that the the RHS fits in the type you will be casting after, although I can see how your proposed implementation would guarantee this to hold.

leonardo-m commented 4 years ago

There is nothing wrong with as in user code

It's just a matter of how you define "unsafe". If you use a narrow definition of memory safety, then you're right. But Rust designers have understood that's a insufficient definition. A cast from floating point to integral value was undefined behaviour. We have patched this problem, leaving a escape hatch (to_int_unchecked). The narrowing cast between two integral values using "as" is well defined, it's not undefined behaviour, but if you write lot of numerical code in Rust you slowly understand that current "as" casts are unsafe and they lead to problems in C code. In Rust there's try_from to avoid this risk (but there isn't a function like try_from that becomes an "as" in release mode only). So a wide coding experience in C/Rust and other languages shows you that reducing the number of naked "as" casts in your user code is a good idea to avoid bugs. That's why I have suggested this narrowing rem that's a hopefully sufficiently common case that's safe (also handled by the D language type system).

RalfJung commented 4 years ago

FWIW I agree that as should be avoided -- in Miri we are now using TryFrom everywhere which makes me feel much better as I know we don't silently truncate anything.

RalfJung commented 4 years ago

However @Elinvynia I am a bit surprised to see the "typesystem" label here. This is just a libs addition not affecting the type system, isn't it?

Elinvynia commented 4 years ago

From my understanding of this request, I thought that it having to do with "type conversions" would be a good reason to include that label, what would be more appropriate here?

leonardo-m commented 4 years ago

in Miri we are now using TryFrom everywhere which makes me feel much better as I know we don't silently truncate anything

But TryFrom in some performance-sensitive places is heavy. So in my code I use a short to!{} macro that performs the TryFrom in debug mode and uses "as" in release mode. And another little macro for FP=>Int casts, that uses "as" in debug mode and to_int_unchecked in release mode.

RalfJung commented 4 years ago

But TryFrom in some performance-sensitive places is heavy. So in my code I use a short to!{} macro that performs the TryFrom in debug mode and uses "as" in release mode.

Yeah I can see how that's useful. For Miri, arithmetic is not our limiting factor in terms of performance. In fact I'd prefer our additions etc. to be overflow-checked in release mode as well.

And another little macro for FP=>Int casts, that uses "as" in debug mode and to_int_unchecked in release mode.

oO okay that's UB when anything goes wrong (i.e., this is an unsafe-to-use macro), I would not accept that in my own codebases without benchmarks for each individual invocation of the macro. ;) But of course, everyone makes their own decisions for these trade-offs.

Btw, IIRC people were looking for feedback on cases where the as float-to-int casts are measurably costing performance -- that should then likely be optimized better so people do not have to resort to unsafe code. See https://github.com/rust-lang/rust/pull/71269#issuecomment-615537137.

leonardo-m commented 4 years ago

I would not accept that in my own codebases without benchmarks for each individual invocation of the macro. ;)

Yes, I use this second fto!{} macro only in special cases, and after benchmarking.

leonardo-m commented 4 years ago

In the meantime I've found about a dozen usage points for this narrowing_rem in my codebase.

leonardo-m commented 4 years ago

The code could be simplified a bit using Self:

fn narrowing_rem(&self, den: u8) -> u8 { (*self % Self::from(den)) as u8 }

But now I'm trying to use a (possibly overengineered) more complex version:

#![feature(core_intrinsics, const_fn, const_panic)]

use core::intrinsics::assume;
use std::num::{NonZeroU8, NonZeroU16};

trait HasNonZero {
    type NonZero;
    fn from_nonzero(v: Self::NonZero) -> Self;
}
impl HasNonZero for u8 {
    type NonZero = NonZeroU8;
    fn from_nonzero(v: Self::NonZero) -> Self { v.get() }
}
impl HasNonZero for u16 {
    type NonZero = NonZeroU16;
    fn from_nonzero(v: Self::NonZero) -> Self { v.get() }
}
//...

//- - - - - - - - - - - - - - - - - - - -

// To be used at compile-time, it raises a panic. TODO until Option::unwrap becomes const.

const fn nonzero_u8(x: u8) -> NonZeroU8 {
    match x {
        0 => panic!("nonzero: x == 0"),
        _ => unsafe { NonZeroU8::new_unchecked(x) },
    }
}

const fn nonzero_u16(x: u16) -> NonZeroU16 {
    match x {
        0 => panic!("nonzero: x == 0"),
        _ => unsafe { NonZeroU16::new_unchecked(x) },
    }
}
//...

//- - - - - - - - - - - - - - - - - - - -

trait NarrowRem<Out> where Out: HasNonZero {
    fn narrowing_rem(&self, den: Out::NonZero) -> Out;
}
impl NarrowRem<u8> for u32 {
    #[inline(always)]
    fn narrowing_rem(&self, den: <u8 as HasNonZero>::NonZero) -> u8 {
        let divisor = Self::from(u8::from_nonzero(den));
        unsafe { assume(divisor != 0); } // TODO, will be unnecessary.
        (*self % divisor) as u8
    }
}
impl NarrowRem<u8> for usize {
    #[inline(always)]
    fn narrowing_rem(&self, den: <u8 as HasNonZero>::NonZero) -> u8 {
        let divisor = Self::from(u8::from_nonzero(den));
        unsafe { assume(divisor != 0); } // TODO, will be unnecessary.
        (*self % divisor) as u8
    }
}
//...
leonardo-m commented 4 years ago

nonzero_uX are now useless, we can define NonZeroUX::new().unwrap() in constants.

leonardo-m commented 3 years ago

Now it can be implemented without unsafe (beside the hard as cast):

#![feature(const_trait_impl)]
#![allow(incomplete_features)]

use std::num::{NonZeroU8, NonZeroU16, NonZeroU32, NonZeroU64, NonZeroUsize, NonZeroU128};

trait HasNonZero {
    type NonZero;
    fn from_nonzero(v: Self::NonZero) -> Self;
}

macro_rules! has_nonzero_helper {
    ($T:ty, $NZT:ty) => {
        impl const HasNonZero for $T {
            type NonZero = $NZT;
            fn from_nonzero(v: Self::NonZero) -> Self { v.get() }
        }
    }
}

has_nonzero_helper!(u8, NonZeroU8);
has_nonzero_helper!(u16, NonZeroU16);
has_nonzero_helper!(u32, NonZeroU32);
has_nonzero_helper!(u64, NonZeroU64);
has_nonzero_helper!(u128, NonZeroU128);
has_nonzero_helper!(usize, NonZeroUsize);

// Performs a rem followed by a safe narrowing cast:
// (T1 % (T2 as T1)) as T2  where sizeof<T2> < sizeof<T1>
trait NarrowRem<Out> where Out: HasNonZero {
    fn narrowing_rem(&self, den: Out::NonZero) -> Out;
}

macro_rules! narrow_rem_helper {
    ($TSelf:ty, $TRem:ty) => {
        impl NarrowRem<$TRem> for $TSelf {
            #[inline(always)]
            fn narrowing_rem(&self, den: <$TRem as HasNonZero>::NonZero) -> $TRem {
                (*self % <Self as HasNonZero>::NonZero::from(den)) as _
            }
        }
    }
}

narrow_rem_helper!(u16, u8);
narrow_rem_helper!(u32, u8);
narrow_rem_helper!(u64, u8);
narrow_rem_helper!(u128, u8);
narrow_rem_helper!(usize, u8);
narrow_rem_helper!(u32, u16);
narrow_rem_helper!(u64, u16);
narrow_rem_helper!(u128, u16);
narrow_rem_helper!(usize, u16);
narrow_rem_helper!(u64, u32);
narrow_rem_helper!(u128, u32);
narrow_rem_helper!(u128, u64);

fn main() {}
leonardo-m commented 3 years ago

A similar narrowing_and could be added.

mkrasnitski commented 9 months ago

Chiming in here, what would it take to simply add support for this to the built in % operator, and subsequently add the corresponding std::ops::Rem implementations?