rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.13k stars 12.69k forks source link

Tracking issue for clamp RFC #44095

Closed Xaeroxe closed 3 years ago

Xaeroxe commented 7 years ago

Tracking issue for https://github.com/rust-lang/rfcs/pull/1961

PR here: #44097 #58710 Stabilization PR: https://github.com/rust-lang/rust/pull/77872

TODO:

EdorianDark commented 5 years ago

@Xaeroxe Thank you for the implementation.

richard-uk1 commented 5 years ago

Maybe this could go in the next edition, if it will break stable code?

jturner314 commented 5 years ago

One thing that IMO is worth considering in more detail is one-sided clamping of f32/f64. The discussion seems to have touched on this topic briefly but not really considered it in detail.

In most cases, if the input to a one-sided clamp is NAN, it's more useful for the result to be NAN than for the result to be the clamping boundary. So, the existing f32::min and f64::max functions don't work well for this use-case. We need separate function(s) for one-sided clamping. (See rust-num/num-traits#122.)

The reason why I bring this up is that it affects the design of the two-sided clamp, since it would be nice for two-sided and one-sided clamps to have a consistent interface. A few options are:

  1. input.clamp(min, max), input.clamp_min(min), and input.clamp_max(max)
  2. input.clamp(min..=max), input.clamp(min..), input.clamp(..=max)
  3. input.clamp(min, max), input.clamp(min, std::f64::INFINITY), input.clamp(std::f64::NEG_INFINITY, max)

With the current implementation (min and max as separate f32/f64 parameters), we would have to choose option 1, which I think is perfectly reasonable, or option 3, which IMO is too verbose. We just should be aware that the sacrifice is having to add separate clamp_min and clamp_max functions or requiring the user to write out positive/negative infinity.

It's also worth noting that we could provide

impl f32 {
    pub fn clamp<T>(self, bounds: T) -> f32
    where
        T: RangeBounds<f32>,
    {
         // ...
    }
}

// and for f64

since for f32/f64 we actually know how to handle exclusive bounds, unlike for general Ord. Of course, then we'd probably want to change Ord::clamp to take a RangeInclusive argument for consistency. It seems like there was not a strong opinion either way on whether to prefer two arguments or a single RangeInclusive argument for Ord::clamp.

If this is already a settled issue, please feel free to dismiss my comment. I just wanted to bring these things up because I didn't see them in earlier discussion.

SimonSapin commented 5 years ago

Triage: the APIs below currently are unstable and point here. Are there issues to consider other than NaN handling? Is it worth stabilizing Ord::clamp first without blocking it on NaN handling?


pub trait Ord: Eq + PartialOrd<Self> {
    // …
    fn clamp(self, min: Self, max: Self) -> Self where Self: Sized {…}
}
impl f32 {
    pub fn clamp(self, min: f32, max: f32) -> f32 {…}
}
impl f64 {
    pub fn clamp(self, min: f64, max: f64) -> f64 {…}
}
Xaeroxe commented 5 years ago

@SimonSapin I'd be happy to stabilize the whole thing personally

scottmcm commented 5 years ago

+1, this went through a full RFC and I don't think there's anything material that's come up since then. For example, NaN handling came up in detail on IRLO and in the RFC discussion.

SimonSapin commented 5 years ago

Alright, that sounds fair enough.

@rfcbot fcp merge

rfcbot commented 5 years ago

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

m-ou-se commented 5 years ago

Was there a decision made about x.clamp(7..=13) vs x.clamp(7, 13)? https://github.com/rust-lang/rust/issues/44095#issuecomment-533764997 mentions the first one might be better for consistency with a potential future f64::clamp.

kennytm commented 5 years ago

@m-ou-se See the discussion starting from https://github.com/rust-lang/rust/issues/44095#issuecomment-411457313 and also https://github.com/rust-lang/rust/pull/58710#pullrequestreview-207529970.

CryZe commented 5 years ago

I'd say that's quite an unfortunate resolution as .min and .max frequently cause bugs as you use .min(...) to specify the upper bound and .max(...) to specify the lower bound. This is incredibly confusing and I've seen so many bugs with this. .clamp(..1.0) and .clamp(0.0..) are just so much clearer.

richard-uk1 commented 5 years ago

@CryZe makes a very good point: even if you never make a mistake with min = upper bound, max = lower bound, you still have to do mental gymnastics to remember which to use. This cognitive load would be better spent on whatever problem you are trying to solve.

I know x.clamp(y, z) is more expected, but maybe this is an opportunity to innovate ;)

Xaeroxe commented 5 years ago

I experimented quite a bit with ranges in the early stages, and even delayed the RFC by several months so we could experiment with inclusive ranges. (This was started before they were stabilized)

I discovered that it was not possible to implement clamp for exclusive ranges on floating point numbers. Only supporting some range types but not others was too surprising of an outcome, so even though I'd delayed the RFC several months to experiment with it this way I ultimately decided ranges were not the solution.

varkor commented 5 years ago

@m-ou-se See the discussion starting from #44095 (comment) and also #58710 (review).

Edit: as pointed out below, the discussion in the pull request (#58710) contains more discussion of the design decision than the tracking issue. It's unfortunate that this wasn't communicated here, which is where design discussions typically take place, but it was discussed.

Only supporting some range types but not others was too surprising of an outcome

Rust already treats some ranges differently than others (e.g. using them for iteration), so only permitting some ranges as clamp arguments doesn't seem surprising to me at all.

Xaeroxe commented 5 years ago

Here's the most helpful analysis of it: https://github.com/rust-lang/rfcs/pull/1961#issuecomment-302600351

richard-uk1 commented 5 years ago

@Xaeroxe Only supporting some range types but not others was too surprising of an outcome

If you were thinking about this before they were stabilized, has time and general usage changed your opinion, or do you think it is still the case?

I would argue that exclusive ranges should never be implemented for floats anyway, because they would have different behaviour to integers (the range 0..10 includes the lower bound and excludes the upper bound, so why should the hypothetical range 0.0...10.0 exclude both?). I don't think it would be surprising, at least for me.

@varkor But this was then changed after a single comment in the review, without any discussion on the tracking issue.

This might come across as overly confrontational, try something like "when I looked through the conversation I didn't find a convincing argument as to why we shouldn't use ranges, can someone point me to it?".

I suspect the argument you are looking for is here: https://github.com/rust-lang/rfcs/pull/1961#issuecomment-302600351

EDIT @Xaeroxe beat me to it :)

Xaeroxe commented 5 years ago

has time and general usage changed your opinion

Thus far it has not, but ranges are something I use pretty infrequently in my daily coding. I'm open to being persuaded by code examples and existing APIs with partial range support. However, even if we resolve that question there are still several other excellent points scottmcm brings up in the RFC comment that need to be addressed. For example, Step is not implemented on as many types as Ord, is this minor syntactical change worth losing those types? Furthermore, is there a use case for non inclusive range clamps? As far as I can tell, no other language or framework felt the need to support exclusive range clamp, so what benefit do we gain from it? Ranges were much more difficult to implement in a satisfactory way, and came with many downsides and few benefits.

Xaeroxe commented 5 years ago

If I were to implement this using ranges, it'd look like this.

So there's a few reasons I think we shouldn't go with this approach.

  1. The selection of ranges needed is novel enough as to require a new trait, and specifically excludes the most common range, Range.

  2. We're already this far along in the RFC process and the only thing std gains from this is yet another way to write .max() or .min(). I don't really want to set back the RFC to the beginning of the process in order to implement something we can already do in Rust.

  3. It doubles the amount of branching happening in the function in order to accommodate a use case we're still not sure exists. I can't get this to show up in benchmarks.

jturner314 commented 5 years ago

Need for one-sided clamping operations

... the only thing std gains from this is yet another way to write .max() or .min().

The primary point I was trying to make is that I've seen this apparent equivalence between .min()/.max() and one-sided clamps come up multiple times in the discussion, but the operations are not equivalent for floating point numbers in the way they should handle NAN.

For example, consider input.max(0.) as an expression to clamp negative numbers to zero. If input is non-NAN, it works fine. However, when input is NAN, it evaluates to 0.. This is almost never the desired behavior; one-sided clamping should preserve NAN values. (See this comment and this comment.) In conclusion, .max() works well for taking the larger of two numbers, but it does not work well for one-sided clamping.

So, we need one-sided clamping operations (separate from .min()/.max()) for floating-point numbers. Others made good arguments for the utility of one-sided clamping operations for non-floating-point types too. The next question is how we want to express those operations.

How to express one-sided clamping operations

.clamp() with INFINITY

In other words, don't add a one-sided clamping operation; just tell users to use .clamp() with INFINITY or NEG_INFINITY bounds. For example, tell users to write input.clamp(0., std::f64::INFINITY).

This is very verbose, which will push users to use the incorrect .min()/.max() if they aren't aware of the nuances of NAN handling. Additionally, it doesn't help for T: Ord, and IMO it's less clear than the alternatives.

.clamp_min() and .clamp_max()

One reasonable option is to add .clamp_min() and .clamp_max() methods, which wouldn't require any changes to the currently proposed implementation. I think this is a reasonable approach; I just wanted to make sure that we were aware that we'd have to use this approach if we stabilize the currently proposed implementation of clamp.

Range argument

Another option is make clamp take a range argument. @Xaeroxe has shown one way to implement this, but that implementation does have a few disadvantages, as he mentioned. An alternative way to write the implementation is similar to the way slicing is currently implemented (the SliceIndex trait). This resolves all the objections I've seen in the discussion except for the concern about providing implementations for a subset of range types and the additional complexity. I agree that it adds some complexity, but IMO it's not much worse than adding .clamp_min()/.clamp_max(). For Ord, I'd suggest something like this:

pub trait Ord: Eq + PartialOrd<Self> {
    // ...

    fn clamp<B>(self, bounds: B) -> B::Output
    where
        B: Clamp<Self>,
    {
        bounds.clamp(self)
    }
}

pub trait Clamp<T> {
    type Output;
    fn clamp(self, input: T) -> Self::Output;
}

impl<T> Clamp<T> for RangeFull {
    type Output = T;
    fn clamp(self, input: T) -> T {
        input
    }
}

impl<T: Ord> Clamp<T> for RangeFrom<T> {
    type Output = T;
    fn clamp(self, input: T) -> T {
        if input < self.start {
            self.start
        } else {
            input
        }
    }
}

impl<T: Ord> Clamp<T> for RangeToInclusive<T> {
    type Output = T;
    fn clamp(self, input: T) -> T {
        if input > self.end {
            self.end
        } else {
            input
        }
    }
}

impl<T: Ord> Clamp<T> for RangeInclusive<T> {
    type Output = T;
    fn clamp(self, input: T) -> T {
        assert!(self.start <= self.end);
        let mut x = input;
        if x < self.start { x = self.start; }
        if x > self.end { x = self.end; }
        x
    }
}

Some thoughts on this:

Basically, this approach allows us as much flexibility as we want with regards to trait bounds. It also makes it possible to start with a minimally useful set of Clamp implementations, and add more implementations later without breaking changes.

Comparing the options

The "use .clamp() with INFINITY" approach has substantial disadvantages, as mentioned above.

The "existing .clamp" + .clamp_min() + .clamp_max() approach has the following disadvantages:

The range argument approach has the following disadvantages:

I don't have a strong opinion on the "existing .clamp" + .clamp_min() + .clamp_max() approach versus the range argument approach. It's a trade-off.

richard-uk1 commented 5 years ago

@Xaeroxe It doubles the amount of branching happening in the function in order to accommodate a use case we're still not sure exists. I can't get this to show up in benchmarks.

Maybe the extra branch will be optimized away by LLVM?

scottmcm commented 5 years ago

On one-sided clamping

Because clamping is inclusive on both sides, one can just specify the min/max on the left/right to get the one-side-only clamping behaviour. I think that's perfectly acceptable, and arguably nicer than .clamp((Bound::Unbounded, Inclusive(3.2))) where there isn't a Range* type that fits anyway:

x.clamp(i32::MIN, 10);
x.clamp(-f32::INFINITY, 10.0);

There's no perf loss, as LLVM is trivially able to optimize the dead side away: https://rust.godbolt.org/z/l_uBLO

kornelski commented 5 years ago

Range syntax would be cool, but clamp is basic enough that two separate args are fine and easy to understand.

Maybe min/max NaN handling can be fixed by itself, e.g. by changing implementation of f32's inherent methods? Or specializing the PartialOrd::min/max? (with an edition flag, assuming Rust manages to find a way to toggle things in libstd).

Xaeroxe commented 5 years ago

@scottmcm you should check out RangeToInclusive.

After pondering this some more it's occurred to me that stable is forever, so we shouldn't consider "resetting the RFC process" to be a reason not to make a change.

To that end, I want to go back to the mindset I had when implementing this. Clamp conceptually operates over a range, so it makes sense to use the vocabulary Rust already has in place for expressing ranges. That was my knee jerk reaction, and it seems to be the reaction for many other people. So let's re-iterate the arguments for not doing it this way, and see if we can refute them.

Now that we've refuted the reasons not to do this, this comment lacks any kind of motivation to make the change, as the options appear equivalent. Let's find some motivation to make the change. I've only got one reason, which is that the general opinion in this thread seems to be that the range based approach improves semantics of the language. Not just for the inclusive doubled ended range clamp, but also for functions like .min() and .max().

I'm curious if this line of thought has any traction with others who are in favor of stabilizing the RFC as is.

EdorianDark commented 4 years ago

I think it would be better to leave Clamp in the current form, because it is now very similar to other languages. When I worked on my pull request #58710 I tried to use a Range based implementation. But rust-lang/rfcs#1961 (comment)) convinced me, that it is better in the standard form.

OptimisticPeach commented 4 years ago

I think that it would be logical to have a #[must_use] attribute on the function, so as to not confuse people who are not used to how rust numerics work. That is to say, I could easily perceive someone writing the following (incorrect) code:

let mut x: f64 = some_number_source();
x.clamp(0.0, 1.0);
//Proceeds to assume that 0.0 <= x <= 1.0

In general, rust takes a (number).method() approach to numerics (whereas other languages use Math.Method(number)), but even when keeping this in mind, it would be a logical assumption that this could modify number. This is more of a quality of life thing than anything.

EdorianDark commented 4 years ago

The [must_use] attribute was added recently. @ Xaeroxe Did you come up with something for range based clamp? I think that the function as it is right now would best fit to the other numeric functions of rust and would like to start stabilizing it again.

Xaeroxe commented 4 years ago

At this moment I don't see any reason to go with a range based clamp. Yes, let's add the must_use attribute and work towards stabilization.

EdorianDark commented 4 years ago

@SimonSapin @scottmcm Could we restart the stabilization process?

noonien commented 4 years ago

As @jturner314 said, it would be great to have clamp on PartialOrd, instead of Ord, so it can also be used on floats.

kennytm commented 4 years ago

We have the specialized f32::clamp and f64::clamp in this issue already.

noonien commented 4 years ago

This is what I'm trying to do:

use num_traits::float::FloatCore;

struct Foo<T> (T);

impl<T: FloatCore> Foo<T> {
    fn foo(&self) -> T {
        self.0.clamp(1, 10)
    }
}

fn main() {
    let foo = Foo(15.3);
    println!("{}", foo.foo())
}

Link to playground.

kornelski commented 4 years ago

PartialOrd is not a float-only trait. Having float-specific method doesn't make clamp available for custom PartialOrd types.

Current implementation requires Eq, even though it doesn't use it.

Xaeroxe commented 4 years ago

The major concern with PartialOrd was that it provides weaker guarantees, which in turn weakens the guarantees of clamp. Users wanting this to be on PartialOrd may be interested in another function I wrote https://docs.rs/num/0.2.1/num/fn.clamp.html

kornelski commented 4 years ago

What are these guarantees?

jdahlstrom commented 4 years ago

A fairly natural expectation is that iff x.clamp(a, b) == x then a <= x && x <= b. This is not guaranteed with PartialCmp where x may be incomparable with either a or b.

BartMassey commented 4 years ago

Came here today looking for vaguely-remembered clamp() and read the discussion with interest.

I would suggest using the "option trick" as a compromise between allowing arbitrary ranges and having several named functions. I know this is not popular with some, but it seems to capture the desired semantics nicely here:

#![allow(unstable_name_collisions)]

pub trait Clamp: Sized {
    fn clamp<L, U>(self, lower: L, upper: U) -> Self
    where
        L: Into<Option<Self>>,
        U: Into<Option<Self>>;
}

impl Clamp for f32 {
    fn clamp<L, U>(self, lower: L, upper: U) -> Self
    where
        L: Into<Option<Self>>,
        U: Into<Option<Self>>,
    {
        let below = match lower.into() {
            None => self,
            Some(lower) => self.max(lower),
        };
        match upper.into() {
            None => below,
            Some(upper) => below.min(upper),
        }
    }
}

#[test]
fn test_clamp() {
    assert_eq!(1.0, f32::clamp(2.0, -1.0, 1.0));
    assert_eq!(-1.0, f32::clamp(-2.0, -1.0, 1.0));
    assert_eq!(1.0, f32::clamp(2.0, None, 1.0));
    assert_eq!(-1.0, f32::clamp(-2.0, -1.0, None));
    assert_eq!(2.0, f32::clamp(2.0, -1.0, None));
    assert_eq!(-2.0, f32::clamp(-2.0, None, 1.0));
}

If this were included in std a blanket implementation could also be included for T: Ord, which would cover the concerns raised about a general PartialOrd implementation.

Given that defining a clamp() function in user code currently generates a compiler warning about unstable name collisions by default, I think the name "clamp" is fine for this function.

EdorianDark commented 4 years ago

I think, that clamp(a,b,c) should behave the same was as min(max(a,b), c). Since max and min are not implemented for PartialOrd neither should clamp. The issue with NaN was already discussed.

noonien commented 4 years ago

@EdorianDark i agree. min, max should also only require PartialOrd.

EdorianDark commented 4 years ago

@noonien min and max are defined since Rust 1.0 and they require Ord and have a definition for f32 and f64. This is not the right place to discuss these function. Here we can can only take care that min, max and clamp behave comparable and not surprisingly. Edit: I don't like the situation with PartialOrd and would rather have float implement Ord, but this is is not possible to change anymore after 1.0.

Xaeroxe commented 4 years ago

This has been merged and unstable for about a year and a half now. How do we feel about stabilizing this?

EdorianDark commented 4 years ago

I would love stabilizing this!

KamilaBorowska commented 4 years ago

If clamp method name conflict sounds like an issue, I suggested changing name resolution at one point in https://github.com/rust-lang/rust/pull/66852#issuecomment-561667812, and it would help with this too.

matklad commented 4 years ago

@Xaeroxe I think the process is to submit stabilization PR and ask for libs team consensus on that. It seems that t-libs is overloaded and can't keep up with non-fcped things.

Xaeroxe commented 4 years ago

Okay, did that https://github.com/rust-lang/rust/pull/77872

kennytm commented 4 years ago

@matklad actually an FCP proposal already started last year at https://github.com/rust-lang/rust/issues/44095#issuecomment-544393395, but it is stuck because there is one remaining checkbox.

Xaeroxe commented 4 years ago

In that case, I think being pinged about once a year on an issue is pretty tolerable.

@Kimundi @sfackler @withoutboats

https://github.com/rust-lang/rust/issues/44095#issuecomment-544393395 is still awaiting your attention

LukasKalbertodt commented 4 years ago

The libs team has changed quite a bit since the FCP was started. What do you all think about just starting a new FCP in the stabilization PR? Feels like that shouldn't take longer than waiting for the remaining checkboxes here.

Xaeroxe commented 4 years ago

@LukasKalbertodt fine by me, do you mind kicking that off?

m-ou-se commented 3 years ago

Cancelling the FCP here, because that FCP now happened on the stabilization PR: https://github.com/rust-lang/rust/pull/77872#issuecomment-722982535

@fcpbot cancel

m-ou-se commented 3 years ago

uh

@rfcbot cancel