Closed Xaeroxe closed 3 years ago
@Xaeroxe Thank you for the implementation.
Maybe this could go in the next edition, if it will break stable code?
One thing that IMO is worth considering in more detail is one-sided clamping of f32
/f64
. The discussion seems to have touched on this topic briefly but not really considered it in detail.
In most cases, if the input to a one-sided clamp is NAN, it's more useful for the result to be NAN than for the result to be the clamping boundary. So, the existing f32::min
and f64::max
functions don't work well for this use-case. We need separate function(s) for one-sided clamping. (See rust-num/num-traits#122.)
The reason why I bring this up is that it affects the design of the two-sided clamp
, since it would be nice for two-sided and one-sided clamps to have a consistent interface. A few options are:
input.clamp(min, max)
, input.clamp_min(min)
, and input.clamp_max(max)
input.clamp(min..=max)
, input.clamp(min..)
, input.clamp(..=max)
input.clamp(min, max)
, input.clamp(min, std::f64::INFINITY)
, input.clamp(std::f64::NEG_INFINITY, max)
With the current implementation (min
and max
as separate f32
/f64
parameters), we would have to choose option 1, which I think is perfectly reasonable, or option 3, which IMO is too verbose. We just should be aware that the sacrifice is having to add separate clamp_min
and clamp_max
functions or requiring the user to write out positive/negative infinity.
It's also worth noting that we could provide
impl f32 {
pub fn clamp<T>(self, bounds: T) -> f32
where
T: RangeBounds<f32>,
{
// ...
}
}
// and for f64
since for f32
/f64
we actually know how to handle exclusive bounds, unlike for general Ord
. Of course, then we'd probably want to change Ord::clamp
to take a RangeInclusive
argument for consistency. It seems like there was not a strong opinion either way on whether to prefer two arguments or a single RangeInclusive
argument for Ord::clamp
.
If this is already a settled issue, please feel free to dismiss my comment. I just wanted to bring these things up because I didn't see them in earlier discussion.
Triage: the APIs below currently are unstable and point here. Are there issues to consider other than NaN handling? Is it worth stabilizing Ord::clamp
first without blocking it on NaN handling?
pub trait Ord: Eq + PartialOrd<Self> {
// …
fn clamp(self, min: Self, max: Self) -> Self where Self: Sized {…}
}
impl f32 {
pub fn clamp(self, min: f32, max: f32) -> f32 {…}
}
impl f64 {
pub fn clamp(self, min: f64, max: f64) -> f64 {…}
}
@SimonSapin I'd be happy to stabilize the whole thing personally
+1, this went through a full RFC and I don't think there's anything material that's come up since then. For example, NaN handling came up in detail on IRLO and in the RFC discussion.
Alright, that sounds fair enough.
@rfcbot fcp merge
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Was there a decision made about x.clamp(7..=13)
vs x.clamp(7, 13)
? https://github.com/rust-lang/rust/issues/44095#issuecomment-533764997 mentions the first one might be better for consistency with a potential future f64::clamp
.
@m-ou-se See the discussion starting from https://github.com/rust-lang/rust/issues/44095#issuecomment-411457313 and also https://github.com/rust-lang/rust/pull/58710#pullrequestreview-207529970.
I'd say that's quite an unfortunate resolution as .min
and .max
frequently cause bugs as you use .min(...)
to specify the upper bound and .max(...)
to specify the lower bound. This is incredibly confusing and I've seen so many bugs with this. .clamp(..1.0)
and .clamp(0.0..)
are just so much clearer.
@CryZe makes a very good point: even if you never make a mistake with min = upper bound, max = lower bound, you still have to do mental gymnastics to remember which to use. This cognitive load would be better spent on whatever problem you are trying to solve.
I know x.clamp(y, z)
is more expected, but maybe this is an opportunity to innovate ;)
I experimented quite a bit with ranges in the early stages, and even delayed the RFC by several months so we could experiment with inclusive ranges. (This was started before they were stabilized)
I discovered that it was not possible to implement clamp for exclusive ranges on floating point numbers. Only supporting some range types but not others was too surprising of an outcome, so even though I'd delayed the RFC several months to experiment with it this way I ultimately decided ranges were not the solution.
@m-ou-se See the discussion starting from #44095 (comment) and also #58710 (review).
Edit: as pointed out below, the discussion in the pull request (#58710) contains more discussion of the design decision than the tracking issue. It's unfortunate that this wasn't communicated here, which is where design discussions typically take place, but it was discussed.
Only supporting some range types but not others was too surprising of an outcome
Rust already treats some ranges differently than others (e.g. using them for iteration), so only permitting some ranges as clamp
arguments doesn't seem surprising to me at all.
Here's the most helpful analysis of it: https://github.com/rust-lang/rfcs/pull/1961#issuecomment-302600351
@Xaeroxe Only supporting some range types but not others was too surprising of an outcome
If you were thinking about this before they were stabilized, has time and general usage changed your opinion, or do you think it is still the case?
I would argue that exclusive ranges should never be implemented for floats anyway, because they would have different behaviour to integers (the range 0..10
includes the lower bound and excludes the upper bound, so why should the hypothetical range 0.0...10.0
exclude both?). I don't think it would be surprising, at least for me.
@varkor But this was then changed after a single comment in the review, without any discussion on the tracking issue.
This might come across as overly confrontational, try something like "when I looked through the conversation I didn't find a convincing argument as to why we shouldn't use ranges, can someone point me to it?".
I suspect the argument you are looking for is here: https://github.com/rust-lang/rfcs/pull/1961#issuecomment-302600351
EDIT @Xaeroxe beat me to it :)
has time and general usage changed your opinion
Thus far it has not, but ranges are something I use pretty infrequently in my daily coding. I'm open to being persuaded by code examples and existing APIs with partial range support. However, even if we resolve that question there are still several other excellent points scottmcm brings up in the RFC comment that need to be addressed. For example, Step
is not implemented on as many types as Ord
, is this minor syntactical change worth losing those types? Furthermore, is there a use case for non inclusive range clamps? As far as I can tell, no other language or framework felt the need to support exclusive range clamp, so what benefit do we gain from it? Ranges were much more difficult to implement in a satisfactory way, and came with many downsides and few benefits.
If I were to implement this using ranges, it'd look like this.
So there's a few reasons I think we shouldn't go with this approach.
The selection of ranges needed is novel enough as to require a new trait, and specifically excludes the most common range, Range
.
We're already this far along in the RFC process and the only thing std gains from this is yet another way to write .max()
or .min()
. I don't really want to set back the RFC to the beginning of the process in order to implement something we can already do in Rust.
It doubles the amount of branching happening in the function in order to accommodate a use case we're still not sure exists. I can't get this to show up in benchmarks.
... the only thing std gains from this is yet another way to write
.max()
or.min()
.
The primary point I was trying to make is that I've seen this apparent equivalence between .min()
/.max()
and one-sided clamps come up multiple times in the discussion, but the operations are not equivalent for floating point numbers in the way they should handle NAN
.
For example, consider input.max(0.)
as an expression to clamp negative numbers to zero. If input
is non-NAN
, it works fine. However, when input
is NAN
, it evaluates to 0.
. This is almost never the desired behavior; one-sided clamping should preserve NAN
values. (See this comment and this comment.) In conclusion, .max()
works well for taking the larger of two numbers, but it does not work well for one-sided clamping.
So, we need one-sided clamping operations (separate from .min()
/.max()
) for floating-point numbers. Others made good arguments for the utility of one-sided clamping operations for non-floating-point types too. The next question is how we want to express those operations.
.clamp()
with INFINITY
In other words, don't add a one-sided clamping operation; just tell users to use .clamp()
with INFINITY
or NEG_INFINITY
bounds. For example, tell users to write input.clamp(0., std::f64::INFINITY)
.
This is very verbose, which will push users to use the incorrect .min()
/.max()
if they aren't aware of the nuances of NAN
handling. Additionally, it doesn't help for T: Ord
, and IMO it's less clear than the alternatives.
.clamp_min()
and .clamp_max()
One reasonable option is to add .clamp_min()
and .clamp_max()
methods, which wouldn't require any changes to the currently proposed implementation. I think this is a reasonable approach; I just wanted to make sure that we were aware that we'd have to use this approach if we stabilize the currently proposed implementation of clamp
.
Another option is make clamp
take a range argument. @Xaeroxe has shown one way to implement this, but that implementation does have a few disadvantages, as he mentioned. An alternative way to write the implementation is similar to the way slicing is currently implemented (the SliceIndex
trait). This resolves all the objections I've seen in the discussion except for the concern about providing implementations for a subset of range types and the additional complexity. I agree that it adds some complexity, but IMO it's not much worse than adding .clamp_min()
/.clamp_max()
. For Ord
, I'd suggest something like this:
pub trait Ord: Eq + PartialOrd<Self> {
// ...
fn clamp<B>(self, bounds: B) -> B::Output
where
B: Clamp<Self>,
{
bounds.clamp(self)
}
}
pub trait Clamp<T> {
type Output;
fn clamp(self, input: T) -> Self::Output;
}
impl<T> Clamp<T> for RangeFull {
type Output = T;
fn clamp(self, input: T) -> T {
input
}
}
impl<T: Ord> Clamp<T> for RangeFrom<T> {
type Output = T;
fn clamp(self, input: T) -> T {
if input < self.start {
self.start
} else {
input
}
}
}
impl<T: Ord> Clamp<T> for RangeToInclusive<T> {
type Output = T;
fn clamp(self, input: T) -> T {
if input > self.end {
self.end
} else {
input
}
}
}
impl<T: Ord> Clamp<T> for RangeInclusive<T> {
type Output = T;
fn clamp(self, input: T) -> T {
assert!(self.start <= self.end);
let mut x = input;
if x < self.start { x = self.start; }
if x > self.end { x = self.end; }
x
}
}
Some thoughts on this:
We could add implementations for exclusive ranges where T: Ord + Step
.
We could keep the Clamp
trait nightly-only, similar to the SliceIndex
trait.
To support f32
/f64
, we could
Relax the implementations to T: PartialOrd
. (I'm not sure why the current implementation of clamp
is on Ord
instead of PartialOrd
. Maybe I missed something in the discussion? It seems like PartialOrd
would be sufficient.)
or write implementations specifically for f32
and f64
. (If desired, we could always switch to option i later without a breaking change.)
and then add
impl f32 {
// ...
fn clamp<B>(self, bounds: B) -> B::Output
where
B: Clamp<Self>,
{
bounds.clamp(self)
}
}
impl f64 {
// ...
fn clamp<B>(self, bounds: B) -> B::Output
where
B: Clamp<Self>,
{
bounds.clamp(self)
}
}
We could implement Clamp
for exclusive ranges with f32
/f64
later if desired. (@scottmcm commented that this isn't straightforward because std
intentionally doesn't have f32
/f64
predecessor operations. I'm not sure why std
doesn't have those operations; maybe issues with denormal numbers? Regardless, that could be addressed later.)
Even if we don't add implementations of Clamp
for exclusive ranges with f32
/f64
, I disagree that this would be too surprising. As @varkor points out, Rust already treats various range types differently for the purpose of Copy
and Iterator
/IntoIterator
. (IMO, this is a wart of std
, but it's at least one instance where range types are treated differently.) Plus, if someone did try to use an exclusive range, the error message would be easy to understand ("the trait bound std::ops::Range<f32>: Clamp<f32>
is not satisfied").
I've made Output
an associated type for maximum flexibility for adding more implementations in the future, but that's not strictly necessary.
Basically, this approach allows us as much flexibility as we want with regards to trait bounds. It also makes it possible to start with a minimally useful set of Clamp
implementations, and add more implementations later without breaking changes.
The "use .clamp()
with INFINITY
" approach has substantial disadvantages, as mentioned above.
The "existing .clamp
" + .clamp_min()
+ .clamp_max()
approach has the following disadvantages:
input.clamp_min(0)
instead of input.clamp(0..)
..clamp()
in the future (without adding yet more methods). For example, we can't support clamping a u32
value with u8
bounds, which is a requested feature from the RFC discussion. That particular example may be better handled with a .saturating_into()
function, but there may be other examples where more clamping implementations would be useful..min()
, .max()
, .clamp_min()
, and .clamp_max()
for one-sided clamping. (Clamping with .clamp_min()
is similar to using .max()
, and clamping with .clamp_max()
is similar to using .min()
.) We could mostly avoid this issue by naming the one-sided clamping operations .clamp_lower()
/.clamp_upper()
or .clamp_to_start()
/.clamp_to_end()
instead of .clamp_min()
/.clamp_max()
, although that's even more verbose (input.clamp_lower(0)
versus input.clamp(0..)
).The range argument approach has the following disadvantages:
.clamp_min()
/.clamp_max()
.Clamp
for the exclusive range types, this may be surprising.I don't have a strong opinion on the "existing .clamp
" + .clamp_min()
+ .clamp_max()
approach versus the range argument approach. It's a trade-off.
@Xaeroxe
It doubles the amount of branching happening in the function in order to accommodate a use case we're still not sure exists.I can't get this to show up in benchmarks.
Maybe the extra branch will be optimized away by LLVM?
On one-sided clamping
Because clamping is inclusive on both sides, one can just specify the min/max on the left/right to get the one-side-only clamping behaviour. I think that's perfectly acceptable, and arguably nicer than .clamp((Bound::Unbounded, Inclusive(3.2)))
where there isn't a Range*
type that fits anyway:
x.clamp(i32::MIN, 10);
x.clamp(-f32::INFINITY, 10.0);
There's no perf loss, as LLVM is trivially able to optimize the dead side away: https://rust.godbolt.org/z/l_uBLO
Range syntax would be cool, but clamp
is basic enough that two separate args are fine and easy to understand.
Maybe min
/max
NaN
handling can be fixed by itself, e.g. by changing implementation of f32
's inherent methods? Or specializing the PartialOrd::min/max
? (with an edition flag, assuming Rust manages to find a way to toggle things in libstd).
@scottmcm you should check out RangeToInclusive.
After pondering this some more it's occurred to me that stable is forever, so we shouldn't consider "resetting the RFC process" to be a reason not to make a change.
To that end, I want to go back to the mindset I had when implementing this. Clamp conceptually operates over a range, so it makes sense to use the vocabulary Rust already has in place for expressing ranges. That was my knee jerk reaction, and it seems to be the reaction for many other people. So let's re-iterate the arguments for not doing it this way, and see if we can refute them.
The selection of ranges needed is novel enough as to require a new trait, and specifically excludes the most common range, Range
.
Range*
types, such Ord + Step
in order to correctly return values for exclusive ranges. So, even though exclusive range clamp is oftentimes not really needed, we can actually accept the whole range of ranges here, without compromising the interface of ranges that don't have these technical limitations.We can just use Infinity/Min/Max for one sided clamping.
Range*
syntax involves fewer characters and fewer imports for this use case.Now that we've refuted the reasons not to do this, this comment lacks any kind of motivation to make the change, as the options appear equivalent. Let's find some motivation to make the change. I've only got one reason, which is that the general opinion in this thread seems to be that the range based approach improves semantics of the language. Not just for the inclusive doubled ended range clamp, but also for functions like .min()
and .max()
.
I'm curious if this line of thought has any traction with others who are in favor of stabilizing the RFC as is.
I think it would be better to leave Clamp in the current form, because it is now very similar to other languages. When I worked on my pull request #58710 I tried to use a Range based implementation. But rust-lang/rfcs#1961 (comment)) convinced me, that it is better in the standard form.
I think that it would be logical to have a #[must_use]
attribute on the function, so as to not confuse people who are not used to how rust numerics work. That is to say, I could easily perceive someone writing the following (incorrect) code:
let mut x: f64 = some_number_source();
x.clamp(0.0, 1.0);
//Proceeds to assume that 0.0 <= x <= 1.0
In general, rust takes a (number).method()
approach to numerics (whereas other languages use Math.Method(number)
), but even when keeping this in mind, it would be a logical assumption that this could modify number
. This is more of a quality of life thing than anything.
The [must_use]
attribute was added recently.
@ Xaeroxe Did you come up with something for range based clamp?
I think that the function as it is right now would best fit to the other numeric functions of rust and would like to start stabilizing it again.
At this moment I don't see any reason to go with a range based clamp. Yes, let's add the must_use attribute and work towards stabilization.
@SimonSapin @scottmcm Could we restart the stabilization process?
As @jturner314 said, it would be great to have clamp on PartialOrd, instead of Ord, so it can also be used on floats.
We have the specialized f32::clamp
and f64::clamp
in this issue already.
This is what I'm trying to do:
use num_traits::float::FloatCore;
struct Foo<T> (T);
impl<T: FloatCore> Foo<T> {
fn foo(&self) -> T {
self.0.clamp(1, 10)
}
}
fn main() {
let foo = Foo(15.3);
println!("{}", foo.foo())
}
PartialOrd
is not a float-only trait. Having float-specific method doesn't make clamp available for custom PartialOrd
types.
Current implementation requires Eq
, even though it doesn't use it.
The major concern with PartialOrd
was that it provides weaker guarantees, which in turn weakens the guarantees of clamp. Users wanting this to be on PartialOrd
may be interested in another function I wrote https://docs.rs/num/0.2.1/num/fn.clamp.html
What are these guarantees?
A fairly natural expectation is that iff x.clamp(a, b) == x
then a <= x && x <= b
. This is not guaranteed with PartialCmp
where x
may be incomparable with either a
or b
.
Came here today looking for vaguely-remembered clamp()
and read the discussion with interest.
I would suggest using the "option trick" as a compromise between allowing arbitrary ranges and having several named functions. I know this is not popular with some, but it seems to capture the desired semantics nicely here:
#![allow(unstable_name_collisions)]
pub trait Clamp: Sized {
fn clamp<L, U>(self, lower: L, upper: U) -> Self
where
L: Into<Option<Self>>,
U: Into<Option<Self>>;
}
impl Clamp for f32 {
fn clamp<L, U>(self, lower: L, upper: U) -> Self
where
L: Into<Option<Self>>,
U: Into<Option<Self>>,
{
let below = match lower.into() {
None => self,
Some(lower) => self.max(lower),
};
match upper.into() {
None => below,
Some(upper) => below.min(upper),
}
}
}
#[test]
fn test_clamp() {
assert_eq!(1.0, f32::clamp(2.0, -1.0, 1.0));
assert_eq!(-1.0, f32::clamp(-2.0, -1.0, 1.0));
assert_eq!(1.0, f32::clamp(2.0, None, 1.0));
assert_eq!(-1.0, f32::clamp(-2.0, -1.0, None));
assert_eq!(2.0, f32::clamp(2.0, -1.0, None));
assert_eq!(-2.0, f32::clamp(-2.0, None, 1.0));
}
If this were included in std
a blanket implementation could also be included for T: Ord
, which would cover the concerns raised about a general PartialOrd
implementation.
Given that defining a clamp()
function in user code currently generates a compiler warning about unstable name collisions by default, I think the name "clamp" is fine for this function.
I think, that clamp(a,b,c)
should behave the same was as min(max(a,b), c)
.
Since max
and min
are not implemented for PartialOrd
neither should clamp
.
The issue with NaN
was already discussed.
@EdorianDark i agree. min, max should also only require PartialOrd.
@noonien min
and max
are defined since Rust 1.0 and they require Ord
and have a definition for f32
and f64
.
This is not the right place to discuss these function.
Here we can can only take care that min
, max
and clamp
behave comparable and not surprisingly.
Edit: I don't like the situation with PartialOrd
and would rather have float
implement Ord
, but this is is not possible to change anymore after 1.0.
This has been merged and unstable for about a year and a half now. How do we feel about stabilizing this?
I would love stabilizing this!
If clamp
method name conflict sounds like an issue, I suggested changing name resolution at one point in https://github.com/rust-lang/rust/pull/66852#issuecomment-561667812, and it would help with this too.
@Xaeroxe I think the process is to submit stabilization PR and ask for libs team consensus on that. It seems that t-libs is overloaded and can't keep up with non-fcped things.
Okay, did that https://github.com/rust-lang/rust/pull/77872
@matklad actually an FCP proposal already started last year at https://github.com/rust-lang/rust/issues/44095#issuecomment-544393395, but it is stuck because there is one remaining checkbox.
In that case, I think being pinged about once a year on an issue is pretty tolerable.
@Kimundi @sfackler @withoutboats
https://github.com/rust-lang/rust/issues/44095#issuecomment-544393395 is still awaiting your attention
The libs team has changed quite a bit since the FCP was started. What do you all think about just starting a new FCP in the stabilization PR? Feels like that shouldn't take longer than waiting for the remaining checkboxes here.
@LukasKalbertodt fine by me, do you mind kicking that off?
Cancelling the FCP here, because that FCP now happened on the stabilization PR: https://github.com/rust-lang/rust/pull/77872#issuecomment-722982535
@fcpbot cancel
uh
@rfcbot cancel
Tracking issue for https://github.com/rust-lang/rfcs/pull/1961
PR here:
#44097#58710 Stabilization PR: https://github.com/rust-lang/rust/pull/77872TODO: