rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Tracking issue for clamp RFC #44095

Closed Xaeroxe closed 3 years ago

Xaeroxe commented 7 years ago

Tracking issue for https://github.com/rust-lang/rfcs/pull/1961

PR here: #44097 #58710

Stabilization PR: https://github.com/rust-lang/rust/pull/77872

TODO:

pcwalton commented 7 years ago

Please note: This broke Servo and Pathfinder.

aturon commented 7 years ago

cc @rust-lang/libs, this is a case similar to min/max, where the ecosystem was already using the clamp name, and hence adding it has caused ambiguity. This is permitted breakage per semver policy, but it's nevertheless causing downstream pain.

Nominating for the triage meeting on Tues.

Any thoughts in the meantime?

BurntSushi commented 7 years ago

I'm kind of with @bluss on this one in that it would be nice not to repeat it. "Clamp" is probably a great name, but could we sidestep this by choosing a different name?

Xaeroxe commented 7 years ago

restrict, clamp_to_range, min_max (because it's kind of like combining min and max). These might work. Can we use crater to determine how bad the impact of clamp actually is? clamp is well recognized across several languages and libraries.

aturon commented 7 years ago

If we think we might need to rename, it's probably best to revert the PR immediately, and then test more carefully with crater etc. @Xaeroxe, up for that?

Xaeroxe commented 7 years ago

Sure. I've never used crater before, but I can learn.

aturon commented 7 years ago

@Xaeroxe ah sorry, I meant getting a revert PR up quickly. (I'm on vacation today so you may need someone else on libs, like @BurntSushi or @alexcrichton, to help land it).

Xaeroxe commented 7 years ago

I'm preparing the PR now. Have fun on your vacation!

Xaeroxe commented 7 years ago

PR ready https://github.com/rust-lang/rust/pull/44438

egilburg commented 7 years ago

Could clamp_to_range(min, max) be composed of clamp_to_min(min) and clamp_to_max(max) (with the additional assertion that min <= max), but those functions could also be called independently?

Xaeroxe commented 7 years ago

I suppose that idea mandates an RFC.

Xaeroxe commented 7 years ago

I gotta say though I've been working on getting a 4 line function into the std library for 6 months now. I'm kind of worn out. The same function got merged into num in 2 days and that's good enough for me. If anyone else really wants this in the std library go ahead, but I'm just not ready for another 6 months of this.

Xaeroxe commented 7 years ago

I'm reopening this so that @aturon 's previous nomination will still be seen.

scottmcm commented 7 years ago

I think that either this should go in as-written or the guidance on what changes can be made should be updated to avoid wasting people's time in future.

It was very clear from early on that this could cause the breakage it did. Personally, I compared it to ord_max_min which broke a bunch of things:

And the response to that was "The function Ord::min was added [...] The libs team decided today that this is accepted breakage". And that was a TMTOWTDI feature with a more-common name, whereas clamp didn't already exist in std under a different form.

It feels, subjectively, to me that if this RFC is reverted, the actual rule is "You basically can't put new methods on traits in std, except maybe Iterator".

CryZe commented 7 years ago

You also can't really put new methods on actual types either. Consider the situation where someone had an "extension trait" for a type in std. Now std implements a method the extension trait provided as an actual method on this type. Then this reaches stable, but this new method is still behind a feature flag. The compiler will then complain that the method is behind a feature flag and can't be used with the stable toolchain, instead of the compiler choosing the extension trait's method like before and thus causing breakage on the stable compiler.

Xaeroxe commented 7 years ago

It's also worth noting: This isn't just a standard library problem. Method call syntax makes it really difficult to avoid introducing breaking changes just about anywhere in the ecosystem.

kennytm commented 7 years ago

(meta) Just copying my comment in irlo here.

If we agree that #44438 is justified,

  1. We may need to reconsider whether guaranteed-type-inference-breakage like can really be disregarded as XIB.

    Currently type inference change is considered acceptable by RFCs 1105 and 1122 as one could always use UFCS or other ways to force a type. But the community doesn't really like the breakage caused by #42496 (Ord::{min, max}). Additionally, #41336 (first try of T += &T) was closed "just" due to 8 type inference regressions.

  2. Whenever we add a method, there should be a crater run to ensure the name is not already existing.

    Note that adding inherent methods can cause inference failure as well — #41793 was caused by adding the inherent methods {f32, f64}::from_bits, which conflicts with the method ieee754::Ieee754::from_bits in the downstream trait.

  3. When downstream crate did not specify #![feature(clamp)], the candidate Ord::clamp should never be considered (a future-compatible warning can still be issued) unless this is the unique solution. This will allow introduction of new trait methods not "insta-breaking", but the problem will still come back when stabilizing.
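
To make the "use UFCS to force a type" escape hatch concrete, here is a minimal sketch. `ClampExt` is a hypothetical downstream extension trait invented for this example. The point is only that naming the trait explicitly leaves method lookup with a single candidate, whereas the plain `x.clamp(0, 10)` form is the one that can become ambiguous once std also offers a `clamp`.

```rust
/// Hypothetical downstream extension trait that predates `Ord::clamp`.
trait ClampExt {
    fn clamp(self, min: Self, max: Self) -> Self;
}

impl ClampExt for i32 {
    fn clamp(self, min: i32, max: i32) -> i32 {
        if self < min {
            min
        } else if self > max {
            max
        } else {
            self
        }
    }
}

/// Fully qualified call: explicitly selects the downstream trait's method.
fn via_extension_trait(x: i32) -> i32 {
    <i32 as ClampExt>::clamp(x, 0, 10)
}

/// Fully qualified call: explicitly selects the std method.
fn via_std(x: i32) -> i32 {
    Ord::clamp(x, 0, 10)
}
```

Both functions compile side by side because neither relies on method-call syntax to pick the trait.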

sfackler commented 7 years ago

It seems like we're in a pretty bad place if any method that people wanted enough to define an extension trait can never be added to the standard library.

bluss commented 7 years ago

Max/min hit a particularly bad spot with regard to using common method names on a common trait. The same doesn't need to apply to clamp.

I still want to say yes, but @sfackler do we really have to add methods on a trait that is so commonly implemented, by diverse types? We have to be careful when we are adding to the api of all types that have bought in to an existing trait.

With specialisation coming we don't lose anything by putting extension methods in an extension trait.

One annoying part is that if the new std method breaks your code: it will appear long before you can actually use it, since it's unstable. Other than that it's not so bad if the conflict is with a method that has the same meaning.

Others commented 7 years ago

I think giving this function a different name to avoid breakage is a bad solution. While it works, it's optimizing for not breaking a few crates (all of which are opting into nightly) instead of optimizing for future readability of any code using this feature.

bluss commented 7 years ago

I have a few concerns of which a few are no worry imo.

scottmcm commented 7 years ago

compound types

I think it makes just as much sense as BTreeSet<BTreeSet<impl Ord>>::range. But there are particular cases that could even be helpful, like Vec<char>.

calling mode by value

When this came up in the RFC, the answer was just use Cow.

Of course, it could be something like this, to reuse storage:

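    fn clamp<T>(mut self, low: &T, high: &T) -> Self
        where T: ?Sized + ToOwned<Owned=Self> + Ord, Self: Borrow<T>
    {
        assert!(low <= high);
        // `low` and `high` are already `&T`, so compare against them directly.
        if self.borrow() < low {
            // Reuse `self`'s existing storage instead of building a new value.
            low.clone_into(&mut self);
        } else if self.borrow() > high {
            high.clone_into(&mut self);
        }
        self
    }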

Which https://github.com/rust-lang/rfcs/pull/2111 might make ergonomic to call.

alexcrichton commented 7 years ago

The libs team discussed this during triage a few days ago and the conclusion was that we should do a crater run to see what the breakage is across the ecosystem for this change. The results of that would determine what action should be taken on precisely this issue.

There are a number of possible future language features we could add to ease adding APIs like this, such as low-priority traits or using extension traits in a more flavorful manner. We don't necessarily want to block this on advancements like those, however.

varkor commented 6 years ago

Did a crater run ever happen for this feature?

kennytm commented 6 years ago

I plan to revive the clamp() method after #48552 is merged. However, RangeInclusive is going to be stabilized before that, meaning the range-based alternative is now viable for consideration (which is actually the original proposal, but retracted because ..= was so unstable 😄):

// Current
trait Ord {
    fn clamp(self, min: Self, max: Self) -> Self { ... }
}
assert_eq!(9.clamp(6, 7), 7);

// Alternative
trait Ord {
    fn clamp(self, range: RangeInclusive<Self>) -> Self { ... }
}
assert_eq!(9.clamp(6..=7), 7);

scottmcm commented 6 years ago

A stable RangeInclusive also opens up other possibilities, like flipping things around (which enables some interesting possibilities with autoref, and avoids the name collisions altogether):

impl<T: Ord + Clone> RangeInclusive<T> {
    fn clamp(&self, mut x: T) -> T {
        if x < self.start { x.clone_from(&self.start); }
        else if x > self.end { x.clone_from(&self.end); }
        x
    }
}

    assert_eq!((1..=10).clamp(11), 10);

    let strings = String::from("aa")..=String::from("b");
    assert_eq!(strings.clamp(String::from("a")), "aa");
    assert_eq!(strings.clamp(String::from("aaa")), "aaa");

https://play.rust-lang.org/?gist=38def79ba2f3f8380197918377dc66f5&version=nightly

I haven't decided whether I think that's better, though...
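For anyone who wants to try the "range first" shape outside of std, here is a minimal sketch as an extension trait. The trait name `RangeClampExt` and the method name `clamp_value` are made up for illustration and are not part of any proposal in this thread. It goes through the `start()` and `end()` accessors rather than the fields, on the assumption that the fields are not reachable from downstream code.

```rust
use std::ops::RangeInclusive;

/// Hypothetical extension trait, not a std API.
trait RangeClampExt<T> {
    fn clamp_value(&self, x: T) -> T;
}

impl<T: Ord + Clone> RangeClampExt<T> for RangeInclusive<T> {
    fn clamp_value(&self, mut x: T) -> T {
        // `clone_from` lets types such as `String` reuse their allocation.
        if x < *self.start() {
            x.clone_from(self.start());
        } else if x > *self.end() {
            x.clone_from(self.end());
        }
        x
    }
}

/// Mirrors the assertions from the comment above.
fn range_first_examples() {
    assert_eq!((1..=10).clamp_value(11), 10);

    let strings = String::from("aa")..=String::from("b");
    assert_eq!(strings.clamp_value(String::from("a")), "aa");
    assert_eq!(strings.clamp_value(String::from("aaa")), "aaa");
}
```

Because the method takes `&self`, the same range can be reused for several values without being rebuilt.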

lu-zero commented 6 years ago

I would use a different name if used as range method.

Surely I would enjoy having the feature sooner rather than later, no matter the shape.

EdorianDark commented 6 years ago

What is the current status? It seems to me that there is consensus that adding clamp to RangeInclusive might be a better alternative. So someone has to write an RFC?

kennytm commented 6 years ago

A full RFC is probably not needed at this point. Just a decision which spelling to choose:

  1. value.clamp(min, max) (follow the RFC as-is)
  2. value.clamp(min..=max)
  3. (min..=max).clamp(value)

egilburg commented 6 years ago

Option 2 or 3 would allow easier partial clamping. You could do value.clamp(min..) or value.clamp(..=max), without need for special clamp_to_start or clamp_to_end methods.

varkor commented 6 years ago

@egilburg: we already have those special methods: clamp_to_start is max and clamp_to_end is min 😉

The consistency is nice though.
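
A small sketch of that equivalence, using the `clamp_to_min` / `clamp_to_max` / `clamp_to_range` names floated earlier in the thread. These are illustrative free functions, not std APIs.

```rust
/// Lower bound only: the same as `Ord::max`.
fn clamp_to_min<T: Ord>(value: T, min: T) -> T {
    value.max(min)
}

/// Upper bound only: the same as `Ord::min`.
fn clamp_to_max<T: Ord>(value: T, max: T) -> T {
    value.min(max)
}

/// Both bounds: the two partial clamps composed, plus the ordering check.
fn clamp_to_range<T: Ord>(value: T, min: T, max: T) -> T {
    assert!(min <= max);
    clamp_to_max(clamp_to_min(value, min), max)
}
```

For types that are `Ord`, the composed version should agree with `value.clamp(min, max)`; the float case differs because of NaN, as discussed further down.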

kennytm commented 6 years ago

@egilburg Rust doesn't support direct overloading. For option 2 to work with your suggestion we'll need a new trait implemented for RangeInclusive, RangeToInclusive and RangeFrom, which feels quite heavyweight.
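
Roughly what that extra trait could look like, as a sketch only. `ClampBounds`, `apply` and the free function `clamp_to` are hypothetical names chosen for this example; nothing like them exists in std.

```rust
use std::ops::{RangeFrom, RangeInclusive, RangeToInclusive};

/// Hypothetical trait: a set of bounds that a value can be clamped to.
trait ClampBounds<T> {
    fn apply(self, value: T) -> T;
}

impl<T: Ord> ClampBounds<T> for RangeInclusive<T> {
    fn apply(self, value: T) -> T {
        let (start, end) = self.into_inner();
        assert!(start <= end);
        if value < start {
            start
        } else if value > end {
            end
        } else {
            value
        }
    }
}

impl<T: Ord> ClampBounds<T> for RangeFrom<T> {
    // `min..` only has a lower bound.
    fn apply(self, value: T) -> T {
        value.max(self.start)
    }
}

impl<T: Ord> ClampBounds<T> for RangeToInclusive<T> {
    // `..=max` only has an upper bound.
    fn apply(self, value: T) -> T {
        value.min(self.end)
    }
}

/// Stand-in for a `value.clamp(range)` method, written as a free function.
fn clamp_to<T, R: ClampBounds<T>>(value: T, range: R) -> T {
    range.apply(value)
}
```

With this in place, `clamp_to(9, 6..=7)`, `clamp_to(3, 6..)` and `clamp_to(9, ..=7)` all type-check, at the cost of one trait and three impls for what is otherwise a four-line function.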

EdorianDark commented 6 years ago

I think that option 3 is the best option.

lu-zero commented 6 years ago

1 or 2 are the least surprising. I'd stay with 1 since lots of code would have less to do to replace the local implementation with the std one.

scottmcm commented 6 years ago

I think we should either plan to use all the range* types or none of them.

Of course, that's harder for things like Range than for RangeInclusive. But there's something nice about (0.0..1.0).clamp(2.0_f32) => 0.99999994_f32.

EdorianDark commented 6 years ago

@kennytm So if I opened a pull request with option 3, do you think it would get merged? Or how do you think we should proceed?

kennytm commented 6 years ago

@EdorianDark For this we'll need to ask @rust-lang/libs 😃

SimonSapin commented 6 years ago

I personally like option 2, with RangeInclusive only. As mentioned, "partial clamping" already exists with min and max.

jminer commented 6 years ago

I agree with @SimonSapin, although I would also be OK with option 1. With option 3, I likely wouldn't use the function because it seems backwards to me. In the other languages/libraries with clamp that @kennytm surveyed earlier, 5 out of 7 (all but Swift and Qt) have the value first, then the range.

EdorianDark commented 5 years ago

Clamp is now in master again!

Xaeroxe commented 5 years ago

I'm pleased, though I'm still trying to figure out what changed that made this acceptable now, whereas it wasn't in #44097.

kennytm commented 5 years ago

We've now got a warning period due to #48552, instead of instantly breaking inference even before stabilizing.

Xaeroxe commented 5 years ago

That's great news, thank you!

Xaeroxe commented 5 years ago

@kennytm I just want to thank you for the legwork you did on making #48552 happen, and @EdorianDark thanks for your interest in this and getting it implemented. It's wonderful to see this finally merged.

kornelski commented 5 years ago

https://rust.godbolt.org/z/JmLWJi

pub fn clamped(a: f32) -> f32 {
   a.clamp(0.,255.)
}

Compiles to:

  vxorps xmm1, xmm1, xmm1
  vmaxss xmm0, xmm1, xmm0
  vmovss xmm1, dword ptr [rip + .LCPI0_0]
  vminss xmm0, xmm1, xmm0

which isn't too bad (vmaxss and vminss are used), but:

pub fn maxmined(a: f32) -> f32 {
   (0f32).max(a).min(255.)
}

uses one instruction less:

  vxorps xmm1, xmm1, xmm1
  vmaxss xmm0, xmm0, xmm1
  vminss xmm0, xmm0, dword ptr [rip + .LCPI1_0]

Is that inherent to the clamp implementation, or just a quirk of LLVM optimization?

scottmcm commented 5 years ago

@kornelski clamping a NAN is supposed to preserve that NAN, which that maxmined doesn't, because max/min preserve the non-NAN.
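
A quick way to see the difference, reusing the two functions from the comment above (written here with the now-stable `f32::clamp`):

```rust
pub fn clamped(a: f32) -> f32 {
    a.clamp(0.0, 255.0)
}

pub fn maxmined(a: f32) -> f32 {
    (0f32).max(a).min(255.0)
}

/// For ordinary inputs the two agree; for NaN they do not.
fn nan_difference() -> (bool, f32) {
    // `clamp` passes NaN through, while `max` returns the non-NaN operand.
    (clamped(f32::NAN).is_nan(), maxmined(f32::NAN))
}
```

So `maxmined` silently turns a NaN input into `0.0`, which is why the two cannot compile to identical code.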

It'd be great to find an implementation that both meets the NAN expectations and is shorter. And it would be good for the doctests to showcase NAN handling. Looks like the original PR had some:

https://github.com/rust-lang/rust/blob/b762283e57ff71f6763effb9cfc7fc0c7967b6b0/src/libstd/f32.rs#L1089-L1094

Xaeroxe commented 5 years ago

@scottmcm https://github.com/rust-lang/rust/pull/59327 done

tspiteri commented 5 years ago

Why does clamping floats panic if min or max is NaN? I would change the assertion from assert!(min <= max) to assert!(!(min > max)), so that a NaN minimum or maximum would have no effect, just like in the max and min methods.

Xaeroxe commented 5 years ago

NAN for min or max in clamp is likely indicative of a programming error, and we figured it was better to panic sooner rather than possibly feeding unclamped data out to IO. If you don't want an upper or lower bound this function isn't for you.
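
To spell out what the two candidate assertions do with NaN bounds, here is a small sketch. The helper names are made up for illustration; only `f32::clamp` itself is the real API.

```rust
/// The check discussed above: false as soon as either bound is NaN.
fn bounds_ok_strict(min: f32, max: f32) -> bool {
    min <= max
}

/// The proposed alternative: true when either bound is NaN.
fn bounds_ok_lenient(min: f32, max: f32) -> bool {
    !(min > max)
}

/// Observes the panic from the real `f32::clamp` when a bound is NaN.
fn clamp_panics_on_nan_bound() -> bool {
    std::panic::catch_unwind(|| 1.0_f32.clamp(f32::NAN, 2.0)).is_err()
}
```

Every comparison involving NaN is false, so `min <= max` rejects NaN bounds while `!(min > max)` accepts them. The two checks only differ when a bound is NaN.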

jdahlstrom commented 5 years ago

You could always use INF and -INF if you don't want an upper or lower bound, right? Which also makes mathematical sense, unlike NaN. But most of the time it's better to use max and min for that.
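
A sketch of the infinity approach, with hypothetical helper names:

```rust
/// Upper bound only, expressed through `clamp` with an infinite lower bound.
fn clamp_upper_only(x: f32, hi: f32) -> f32 {
    x.clamp(f32::NEG_INFINITY, hi)
}

/// Lower bound only, expressed through `clamp` with an infinite upper bound.
fn clamp_lower_only(x: f32, lo: f32) -> f32 {
    x.clamp(lo, f32::INFINITY)
}
```

For non-NaN inputs these give the same results as `x.min(hi)` and `x.max(lo)`. The two spellings differ only in how a NaN `x` is treated.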