rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.46k stars 12.6k forks source link

Tracking issue for `Zero`/`One`/`iter_arith` stabilization #27739

Closed aturon closed 8 years ago

aturon commented 9 years ago

We currently have Zero and One traits that are meant to work with Add and Mul and support iterator operations like sum, as well as the current step_by API.

It would be good to have a more comprehensive vision for this kind of trait before stabilization; an RFC would be ideal.

fizyk20 commented 8 years ago

Those traits would be really useful for generic arithmetics, I look forward to them being stabilized.

Also, I have some ideas for traits that would pair up with those nicely, e.g.: Field (meaning Add+Mul+Sub+Div+Neg+One+Zero), Vector<T: Field> (meaning Add+Sub+Mul<T>+Div<T>+Neg+Zero), Ring, Group etc. They would allow marking some types as behaving like fields/vectors/what have you - or would such functionality rather belong in a separate crate?

photino commented 8 years ago

I would like to see these are stabilized. They are really useful and simple enough to have a position in std.

haudan commented 8 years ago

I agree with @photino, I don't see a reason why these traits shouldn't be part of std, and like @fizyk20 said, they'll be very useful in generic arithmetics.

aturon commented 8 years ago

Nominating for 1.6 discussion.

SimonSapin commented 8 years ago

I have multiple times tried to use some_iterator.sum(), only to have the compiler remind me that it’s unstable. Then I typically replace it with something like some_iterator.fold(0, std::ops::Add::add).

Please stabilize sum. I care less about what traits support it.

mitaa commented 8 years ago

There is an open issue with iterator sum and product regarding type inference / default type parameter fallback (#25094)

I'm not sure how this impacts stabilization.

alexcrichton commented 8 years ago

Unfortunately the libs team didn't have time to decided on this in the triage meeting, so this won't enter FCP this cycle.

sfackler commented 8 years ago

I'd like to nominate at least Iterator::sum and Iterator::product for stabilization. I believe we can stabilize those even if we're not sure about the Zero and One traits.

huonw commented 8 years ago

As a general point, these traits don't necessarily cover everything one might want for non-primitives. In particular, Zero and One don't pay attention to ownership and resource reuse, e.g. some_bigint = Zero::zero() will deallocate the existing storage in some_bigint, overwriting it with the new value (maybe with a new allocation, maybe not). For performance, it can be important to reuse allocations, i.e. some_biging.set_zero().

Additionally, for other types, it may not make sense to just ask for Zero or One with no inputs, e.g. a big-float might have a notion of precision, meaning the call wants to be more like ...::zero(precision).

adamcrume commented 8 years ago

I agree that precision is important for big-floats, but I don't think that the notion of precision belongs in Zero. Something like set_zero(&mut self) is more interesting. It might also be nice to include is_zero(&self), because x == y where x and y are zero big-floats with different precision might return false, and because it may be possible to implement x.is_zero() to be faster than x == X.zero().

SimonSapin commented 8 years ago

If I’m grepping correctly, Iter::sum, Iter::product and Range::step_by are the only users of Zero or One in std’s public API. These literally only need to know:

Since we’ve already decided to reduce the scope of std::num, more features like in-place zeroing or dealing with precision probably belong in the num crate instead.

(Edited to remove truncated paragraph, it was the same as the previous one.)

alexcrichton commented 8 years ago

:bell: This issue is now entering its final comment period for stabilization :bell:

At the libs triage meeting we discussed that we may want to somewhat aggresively pursue stabilization of the methods here, and if it comes to it we can punt on stabilizing the traits involves (as we've done with other APIs in the past).

@aturon however I believe would like to discuss stabilizing the One and Zero traits themselves, so we can also discuss stabilizing them during this FCP.

tl;dr; This FCP is primarily for stabilizing the sum and product methods, but it's also possible to stabilize the One and Zero traits.

bluss commented 8 years ago

The type parameter default should be removed before stabilization, it has no effect, and the lang team seems to unfortunately doubt the future of the feature altogether.

bluss commented 8 years ago

@SimonSapin Your comment was cut short, but I think I agree. Zero and One are out of place as the only numeric traits in libstd. That said, it's good for everyone if One and Zero are either stabilized or removed. Unstable limbo just leads to the tricky conflicts of std's Zero vs num's Zero.

bluss commented 8 years ago

Zero needs a migration story. Num's Zero and libstd Zero are not compatible, and we need to make sure the transition is smooth.

aturon commented 8 years ago

@bluss Good catch. For the record, the difference is that num's Zero and One traits inherit from Add and Mul respectively, whereas std's don't.

Given that we want a direct connection between e.g. Zero and Add, I think inheritance makes sense, so we can make the traits match.

That issue aside, the only real question with Zero/One is: should we commit to having them in std? I think that the sum and product methods should live or die by this question: stabilizing them without eventually stabilizing the traits doesn't make sense. So to me, Zero/One stabilization is the real question here.

I agree with @SimonSapin's assessment. I also think that these traits are relatively harmless -- they are effectively specialized versions of Default. They would not be in the prelude, so I don't worry too much about conflict with a more general/powerful external numeric hierarchy.

So overall, I think I'm :+1: on stabilization of all items in question (modulo adjusting to match the num crate).

sfackler commented 8 years ago

If we have concerns about future compatibility with more robust numeric trait hierarchies, we can always make these traits specifically for sum and product - maybe even have them sitting in the iter module and rename them.

SimonSapin commented 8 years ago

How about, in std::iter:

trait Sum: Add<Rhs=Self> {
    /// Sum of an empty sequence, i.e. "zero".
    fn neutral_element() -> Self;
}
trait Product: Mul<Rhs=Self> {
    /// Product of an empty sequence, i.e. "one".
    fn neutral_element() -> Self;
}
bluss commented 8 years ago

@aturon I wasn't even aware of the differences in their definition. What I meant by num's Zero and std's Zero are not compatible, is simply that if my crate requires num::Zero but you implemented std::Zero, then your type does not fit the bound. Ideally we'd remove that issue, somehow.. If we can't bridge, then mya crate can never switch to std::Zero for fear of breaking deps.

aturon commented 8 years ago

@SimonSapin @sfackler I'm not sure I see the appeal of defining these traits so narrowly, or in iter. If we're going to have them, we may as well keep them with their more general name and placement.

@bluss I see. So, in general, we'd like to handle this by changing num to just pub use the definitions from std (which is why they need to align), but for that to work smoothly it needs to be cfg-ed on which version/what stable features are available.

BurntSushi commented 8 years ago

I would be somewhat hesitant about stabilizing Zero and One, mostly for the reasons that @huonw brought up. Namely, they may not be appropriate for all types with obvious Zero/One values. I think that means that, for example, other crates would need to end up defining their own Zero/One traits, which would seem kind of unfortunate. Nevertheless, I feel like that's a pretty minor point against them, and I think that would be my only cause for hesitation other than "we said we weren't defining numeric hierarchies in std."

With that said, if we think stabilizing Zero/One (or eventually stabilizing them) is what makes sense if we stabilize sum/product, then I say go for it. I share @SimonSapin's vigor for sum.

SimonSapin commented 8 years ago

Is it possible to implement sum and product for iterators of the 10 primitive numeric types without a trait?

alexcrichton commented 8 years ago

The libs team discussed this in triage recently and the decision was to not stabilize nor deprecate for now, but remove this issue from FCP.

One concern brought up is that if we were to stabilize the methods and not the bounds, we may still not control the set of types that can be used with the methods. For example the output must implement Zero or One, which means we're in control of that, but the other bound is just Add, a stable trait, which can be implemented for local types:

impl Add<MyLocalType> for i32 {
    type Output = i32;
    // ...
}

This means that we couldn't necessarily backwards-compatibly tweak the bounds.


The libs team was also unsure of how we would ever stabilize these traits, which leads us to be wary of stabilizing even the methods without a possible path forward. It's somewhat clear to us that we don't want a numeric hierarchy in the standard library right now (or the beginnings of one), and other possibilities for this are perhaps:

// added to prelude
trait Sum {
    type Output;
    fn sum(self) -> Output;
}

// or 

trait Sum {
    type Output;
    fn sum<I: Iterator>(i: I) -> Output;
}

trait Iterator {
    fn sum<S: Sum>(self) -> S::Output where SomeCrazyWhereClause { 
        S::sum(self) 
    }
}

These options don't seem great, however.

As a result, the libs team has decided to hold off on this for now until we can get the bounds in a shape that may be stable one day.

PeterHatch commented 8 years ago

What about renaming Zero and One to DefaultZero and DefaultOne?

They're only defining the function needed to work as a default, and some types that have multiple zero or one values may not have a reasonable default, as the big-float example of needing precision specified shows.

It'd be a breaking change to num to get it integrating nicely with DefaultZero and DefaultOne – moving the zero() function out of its Zero trait – but the same change would allow it to work with a type like big-float that implements the Zero trait but not DefaultZero.

It's only a cosmetic change, but I think it makes it clear it's not the start of a numeric hierarchy, and provides a migration story for num.

Luthaf commented 8 years ago

About Zero and One, something worth mentioning is that they should be defined for smart pointer, i.e. impl<T: Zero> Zero for Box<T>, impl<T: Zero> Zero for Cell<T> ...

I ran into this issue when trying to put Cell<Complex> in a ndarray, which have a Zero bound for the data. Complex was Zero, but Cell<Complex> was not, and there was no way for me to implement it (except creating a new type).

Also, if std::Zero goes for compatibility with num::Zero, the is_zero function should have a default implementation.

SimonSapin commented 8 years ago

It sounds like there’s a lot of questions and features around numerical one and zero that we’re not ready to stabilize in std, but it is desirable to stabilize Iterator::sum and Iterator::product. So here is a pre-RFC:

The new traits mostly exist to support the iterator methods, which are still the more convenient API to use compared to calling e.g. ::std::…::Sum::sum(some_iter) directly.

This is effectively "duplicating" the logic of sum/product instead of having one generic algorithm based on more primitive concepts (addition, multiplication, and their neutral elements) but:

How does this sound?

photino commented 8 years ago

I like @SimonSapin's proposal of the new traits Sum and Product, since the sum and product of an iterator may not rely on the zero and one element. For example, I can define a new type as a wrapper of the range 1..n, whose sum can be calculated from the formula n(n-1)/2 directly. Besides, an empty iterator can also return a nonzero value of the type in some cases.

Another proposal: we can remove .sum() and .product(), since they can be replaced by .fold().

bluss commented 8 years ago

@SimonSapin It's a well thought out proposal. I'll explain a recent observation related to drawbacks of per-type implementations like that.

If .sum() is defined in terms of Add + Zero*, then it means that in my present generic code which is written for the floating point types, I use T: Float, and Add + Zero are a subset of num's Float, so I can use .sum() directly. No hassle.

* The asterisk is the fact that num Zero and std's Zero don't line up yet, so this is not exactly the case.

With per type implementations, Sum becomes a new trait bound that my generic code must have. T: Float is not enough, it needs to be T: Float + Sum (until the Float trait is upgraded).

nodakai commented 8 years ago

I have a small frustration on not being able to uniquely type vec![0f64].iter().sum() without an annotation: ...sum::<f64>() @SimonSapin 's proposal will eliminate that annoyance, but I wonder, at what cost?

As @photino pointed out, we have a spectrum of functions with varying degrees of flexibility. At one end lies Iter::fold(), and per-type sum() at the other end. Iter::fold() and the current Iter::sum() can be used to do something like "foo" + 1 + 2 to construct "foo12", in principle. Do we want to retain such a "flexibility"?

Also, let me point out several utility functions such as cvt() under libstd/sys generically check if an argument is zero, nonzero or -1. I'm not sure how to write such functions without relying on Zero and One traits.

SimonSapin commented 8 years ago

@bluss That’s a good point. My proposal is based on the premise that std would rather avoid having a Zero trait at all (so that there is no need to support everything people might want from a numerical zero), but maybe that’s not the right thing to do. I don’t know.

@nodakai Ah, I did not realize when writing my previous message that the return type of Iterator::sum() is not necessarily the same as iterator items. If we want to preserve that flexibility, that’s still possible:

trait Sum<Item> where Self: Sized + ::std::ops::Add<Item, Output=Self> {
    fn sum<I: Iterator<Item=Item>>(iter: I) -> Self;
}

pub trait Product<Item> where Self: Sized + ::std::ops::Mul<Item, Output=Self> {
    fn product<I: Iterator<Item=Item>>(iter: I) -> Self;
}

This is starting to look less nice. But as you say, if we stick with the simpler variation Iterator::fold is always there if more flexibility is required.

bluss commented 8 years ago

I think that as long as libstd does not want to have traits for numerics, these iterators need to wait as well. Num crate would be a good place to put them.

SimonSapin commented 8 years ago

@bluss I disagree. IMO the common case is summing primitive numeric types. std doesn’t need to block on having a fully generic numeric hierarchy to provide that.

dhardy commented 8 years ago

Can I repeat @SimonSapin 's earlier suggestion (with adjusted terminology)? Mathematically, a binary operator on a set normally has an "identity" element (see Groups on Wikipedia). This adds the bare minimum to make sum and product work in a fairly generic manner. It does not address other uses of Zero (where it sounds more like it is used as a default element or special return value, not an identity element).

The only limitation I see is that re-using these for Zero and One in the num crate would require Add or Mul, but currently the respective trait is required anyway.

/// An additive group
trait AddGroup: Add<Rhs=Self> {
    /// Element which can be added without changing the result, i.e. "zero".
    fn identity() -> Self;
}
/// A multiplicitive group
trait MulGroup: Mul<Rhs=Self> {
    /// Element which can be multiplied by without changing the result, i.e. "one".
    fn identity() -> Self;
}
bluss commented 8 years ago

@SimonSapin Ok but a full numeric hierarchy is not needed. One, Zero is much better than ad-hoc traits. PrimitiveInt, PrimitiveNum, PrimitiveFloat or something like that is not a numeric hierarchy either and would be much better than ad-hoc traits too.

Jimmio92 commented 8 years ago

I used this to allow returning unit direction vectors from functions in my vec module. zero(), one(), left(), right(), up(), down(), forward(), backward() wouldn't be possible without these traits. (Maybe if casting to T from 0 and 1 was allowed, these wouldn't be necessary in my use case?)

dhardy commented 8 years ago

@Jimmio92 Your library could use the num crate instead. Alternatively you could probably define your own crate. I don't think your use-case is a good rationale for having "zero" and "one" defined in the standard library.

benaryorg commented 8 years ago

To quote @SimonSapin:

I have multiple times tried to use some_iterator.sum(), only to have the compiler remind me that it’s unstable. Then I typically replace it with something like some_iterator.fold(0, std::ops::Add::add).

Please stabilize sum. I care less about what traits support it.

Couldn't we just stabilize sum and product to take an argument besides self for the base value? It would be much like fold but more readable and One and Zero could therefore be removed as they are only used by sum and product.

This would also make it possible to set the precision of the base value as criticized by some of you.

cuviper commented 8 years ago

Another possibility is to return Option like min/max, and the caller can use sum().unwrap_or(0). But if most callers will want that, it's not nice to be wordier with unwraps.

benaryorg commented 8 years ago

Still wouldn't solve the problem of the mentioned "enhanced" constructors of some types that might need to provide more information for their internals to actually perform an addition.

alexcrichton commented 8 years ago

🔔 This issue is now entering its final comment period 🔔

The libs team was unfortunately a little late on stabilizations this cycle, but we're thinking of probably doing a backport of stabilizations partway through the beta cycle. The API we're considering for stabilization is the one proposed by @SimonSapin here which follows the collect + FromIterator pattern and precludes having a "numerical tower" hierarchy in the standard library.

alexcrichton commented 8 years ago

Ah one question we did have when discussing this was what to do about overflow. We no longer really have the convenience of "overflows as if you wrote the code locally" so we'll have to define semantics one way or another.

SimonSapin commented 8 years ago

The API we're considering for stabilization is the one proposed by @SimonSapin here which follows the collect + FromIterator pattern and precludes having a "numerical tower" hierarchy in the standard library.

I’ve just realize that this design would allow implementations to use an algorithm different from fold. For example an accurate sum of floating point numbers that avoids loss of precision by tracking multiple intermediate sums:

https://docs.python.org/3/library/math.html#math.fsum https://code.activestate.com/recipes/393090/

But is this desirable? It looks like that algorithm allocates memory, so there’s a run-time cost to get that accuracy. This may be a trade-off better left to users to decide. Python itself provides fsum separately from sum.

So I think such an accurate sum would be better as a separate API, and it could be on crates.io at least at first. I think that the impls in std of theses traits should simply use fold with Add or Mul, and the traits documentation should recommend that other impls do the same.

This also deals with overflow semantics, doesn’t it? Leave it per-impl, but give a recommendation (and follow it in std).

ruuda commented 8 years ago

But is this desirable? It looks like that algorithm allocates memory, so there’s a run-time cost to get that accuracy.

You can get a significant precision win with just one extra float for which no heap allocation is required (at the cost of more arithmetic operations). But I wouln’t expect the standard library sum to do that if I didn’t ask for it explicitly. The user might care more about performance than precision, and in that case the extra operations would be undesirable. I agree that it would be better left as a separate function.

benaryorg commented 8 years ago

Someone did actually mention the overflow problem.

I currently have an open issue at the rfcs repo about checked (and the like) methods for Duration which made me think of the very same problem here: will there be a checked_sum method or even a trait for that? Or will I need to catch the unwinding in case of a panic?

@cuviper suggested:

Another possibility is to return Option like min/max, and the caller can use sum().unwrap_or(0). But if most callers will want that, it's not nice to be wordier with unwraps.

So we had something like:

// within Iterator
fn sum<S>(self) -> Option<S>
    where S: Add<Self::Item, Output=S>

This would also solve the overflowing problem (although it could return Result too to have distinct errors for "no elements" and "overflow").

The unwrap_or() solution would not solve the problem of having different default elements as with big floats or similar.

If for any reason you want to not use the Option version I'd support @SimonSapin's solution, even though I'd want to see some sort of overflow-checking version.

dhardy commented 8 years ago

This would also solve the overflowing problem (although it could return Result too to have distinct errors for "no elements" and "overflow").

I would expect vec![].iter().sum() == Some(0).

Using None like a float's NAN to indicate overflow seems reasonable from the API usage perspective.

I don't see a huge difference between Simon's last suggestion and my adjustment to his earlier suggestion in terms of how much special-purpose code needs to get added to the standard library.

Is there another possibility, to use a trait-based fold design?

trait Foldable<T, R> {
    fn initial() -> R;
    fn fold(x: T, r: R) -> Option<R>;
}

This may allow implementations like Sum and Prod to be defined in another library and a generic iterator function fn fold_with<T,R>(self, fold_t: &Foldable<T, R>) -> Option<R>. Usage would be something like:

use fold_impls::Sum;

fn main() {
    let v = vec![1, 2, 3, 4];
    println!("Sum: {}", v.fold_with(Sum).unwrap());
}
benaryorg commented 8 years ago

Sorry, no I meant something different and accidentally hit Ctrl while trying to get a newline somewhere, sorry for that.

First of all, current feature gated iter_arith does not have any methods for Vec<T> but only for Iterator, that means you would have to use v.iter() or v.into_iter().

This also means that your

fn fold_with<T,R>(&mut self, fold_t: &Foldable<T, R>) -> Option<R>

Should actually take ownership of the value.

Edit: current fold looks like this:

fn fold<B, F>(self, init: B, f: F) -> B where F: FnMut(B, Self::Item) -> B { ... }

Furthermore, could you please explain what advantages your fold_with has over just providing some FnMut, as used by fold and an initial value?

Just use-ing it and then plugging it into the Iterator is fast to write but couldn't other crates just simply define their "special arithmetic operations" as traits and implement them for Iterator themselves?

Sorry, it might just be me not seeing the point there.

benaryorg commented 8 years ago

Complete enough for you?

I'd be happy if you added the where clauses to your trait, where applicable, or provided some sort of reference implementation that just/at least covered the signature of such a type.

Also how would that:

v.fold_with(sum)

work if you take fold_t using an immutable borrow?

fn fold_with<T,R>(&mut self, fold_t: &Foldable<T, R>) -> Option<R>

Wouldn't it be more like

extern crate something;

fn main() {
    let v: Vec<usize> = vec![1,2,3];
    // just to have some unified way to get such a structure `new` is used
    // because sometimes there might be an internal state to hold (for caching, or whatever reasons)
    // if this was not desirable one could simply have `let sum = something::sum::Sum;` I think
    let sum = something::sum::Sum::new();
    println!("Sum: {}", v.iter().fold_with(&sum).unwrap());
}

Edit: I just saw you corrected the use fold_impls::Sum; to be uppercase, now it makes more sense (see the comment for the unit-struct), still the immutable borrow is confusing.

dhardy commented 8 years ago

Thanks for your comments.

@benaryorg The reference to sum can be immutable since sum is never in fact used (except to dynamic dispatch to the right methods).

Maybe you're right that an object sum needs to be created for the type Sum implementing the trait Foldable. I've run into that limitation before. Seems silly since none of the trait functions reference the object (self).

The potential advantage of the trait is to specify two related things together (both functions of the same trait impl), but maybe it's not worth it.

alexcrichton commented 8 years ago

Note that at least in the past when the libs team has discussed these APIs the thought is that if we don't have the ergonomics that we have today then the methods probably aren't worth it. The sum/product methods are basically nice conveniences, and you can always write .fold(0, |a, b| a + b) yourself instead of sum if you really want.

Along those lines I don't think we want to introduce new methods like fold_with, and we've also been hesitant to attempt to over-abstract this with something like a Foldable trait or to return an Option to use the Add trait.

The thinking is that using a Sum and Product trait buys us the ergonomics we have today as well as maximal flexibility in terms of how these can be implemented. They may not always be the most ergonomic or most general to implement, but it's tough to find an alternate solution with the same level of ergonomics and flexibility.

benaryorg commented 8 years ago

I'll stick to @SimonSapin's solution then.

One can still check for overflows by invoking fold manually:

v.iter().fold(Some(0),|a,&b|a.and_then(|a: usize|a.checked_add(b)))