Closed ianlancetaylor closed 1 year ago
These seem like good utilities to have in the standard library, and I think sort
is an intuitive place to put them.
Did you consider making Min
and Max
be variadic instead, accepting an arbitrary number of arguments of the same type?
func Min[T Ordered](vals ...T) T
func Max[T Ordered](vals ...T) T
It would of course be possible to implement the variadic versions in terms of the two-argument versions in my own code if the standard library didn't offer a variadic version, so this is perhaps superflous, but since the variadic version could still be used with two arguments it doesn't seem like it loses any functionality to make it variadic.
(Perhaps there is a performance cost compared to just regular arguments? Is it significant?)
It sounds like Ordered
represents a partial order rather than a total order. Part of me wants that explicit in the name, but at the same time it doesn't seem super likely that there would ever be a "total order" interface alongside this -- at least not in package sort
-- so probably just fine to leave it with a short name and just mention the expectation in the docs (as the example above does, albeit by reference to specific operators rather than using the term "partial order").
When T is a float, we will need to describe behavior in all the special cases (NaN, +/- 0).
When T is a float, we will need to describe behavior in all the special cases (NaN, +/- 0).
Is there any case for the sort
ordering constraint to just not accept floats? The math.Min/Max
functions have defined behavior that I'd assume sort
would want to match anyways.
Is there any case for the sort ordering constraint to just not accept floats?
Floats are certainly tricky. But most users probably don't hit and/or care about those weird cases. So from an ergonomic API view, allowing floats makes sense. It just complicates the implementation a bit, and requires a bit of extra doc for those who care.
I lean towards putting these in a separate cmp
package, for several reasons. For starters, min and max are comparators, not really sorters. As you said, too, sort
is already expansive, while the details of ordering can be neatly separated, implying they should be. And since floats are messy, with no total order, having their ordering delineated in a separate package provides a location on which focus on the difficulty, isolating the problem. It might even be there are variants of Ordered
to deal with this.
Also,cmp.Min
reads better than sort.Min
; at least I think it does.
There is no harm in putting these things in a separate package, and the possibility of appreciable good accruing.
I suppose this question about floats boils down to specifying how Min
and Max
will treat a situation where they are asked to compare a
and b
where a
is neither less than or equal b
nor greater than nor equal b
.
It will presumably then need to make an essentially-arbitrary decision, but that decision ought to be subject to Go's compatibility promises. Is there a typical answer to this that users might expect from other languages, or is this literally a case of "just pick a behavior and document it"?
@apparentlymart In the general case variadic functions will be more expensive, because they require constructing a slice. Of course in general the functions will likely be inlined and the cost will disappear. But, especially since almost all existing examples I found take just two arguments, I think it's preferable to have the two argument version. We can consider adding variadic versions separately. I'm open to counter-argument.
Ordered
is a constraint that describes the set of types that support the <
operator and friends. This is exactly the set of types that the language spec describes as "ordered". So I think that Ordered
is the right name even though arguably for floating-point types we only describe a partial ordering.
@randall77 I think the suggested documentation is clear about the behavior for ±0. I suppose we may have to document what happens for NaN. Sigh. I think my preference would be to say that if either value is a NaN
, we will return either a
or b
but which one we return is unspecified. People who need more clarity can use math.Min
.
@robpike I am fine with using a new cmp package if the preference leans that way. Anybody who prefer cmp to sort, please thumbs-up @robpike 's note. Thanks.
I don't feel strongly about them being variadic; I was considering it only because it seemed like a more general form that could therefore be more broadly useful without losing the convenience of the two-argument case.
However, the fact that the other examples you found are not variadic does indeed seem to suggest that it's not necessary, and combining with the possible performance cost does seem to make the two-argument form seem like the better choice. A variadic version would be pretty simple to implement elsewhere when needed.
@randall77
It just complicates the implementation a bit, and requires a bit of extra doc for those who care.
As far as I know, the only way to implement this is via reflect
(normal type-assertions/type-switches won't cover things like type F float64
), which seems excessive. I'd assume most people would prefer a well-performing if a < b { return a } return b
over something that might use reflect
just to compare two values - even if it isn't correct in all cases.
If we had something like #45380 it would undoubtedly be simple, but that's still on ice.
Personally, I neither like the idea of adding a constraint that excludes floats, nor do I like the idea of adding an imprecise implementation for floats to the stdlib, nor do I like the idea of using reflect
to add a precise version. So, unless we'd use some compiler/runtime shenanigans (which I'd resent a bit as well, because IMO user code should be able to solve this kind of problem as well) I'd be in favor of putting this off until we get something like #45380. For 3rd party code, the decision of using the simple-if-not-strictly-correct code is less problematic than for the standard library.
Extrapolating a bit, if a builtin Ordered constraint also accepted types with some Compare
method (name to be bikeshedded), could that be a path to supporting those same comparison operators on them ? I know this is not the original goal, but it would fit nicely along.
@fgm The standard library is converging on exporting two APIs to support "native" and "custom" operators. e.g. having slices.Equal and slices.EqualFunc. By that approach, we'd have to have Max
and MaxFunc
(etc). Which seems overkill to me.
For what it's worth, in my opinion there just is no unambiguously good way to support both "native" and "custom" operators. There are alternatives to the "two APIs" approach, but they have their own problems.
One question, is it possible make Ordered
an ordinary interface? Ordinary interface implemented by any user defined type would be more useful and general.
And, as rob mentioned, cmp
package is more reasonable.
@merovius I think we can make the implementation fast and general. For example:
func Min[T Ordered](a, b T) T {
if a != a || b != b { ... some special code ... }
... normal min code ...
}
The first line there will completely compile away for non-float types. For floats, the first line will capture any NaN values, which we can then handle at our leisure. It would require the normal min code to handle +/-inf and +/-0 correctly, but it probably would.
I'd rather like to see the cmp
package name for something like https://github.com/golang/go/issues/45200, which is more about equality for testing. I don't know if Ordered/Min/Max would fit into such a package.
So, I have conflicting desires:
slices.Min
?
Note that a new cmp package was previously proposed here: https://github.com/golang/go/issues/58559#issuecomment-1444117979
I lean towards having cmp.TotalOrdered which excludes floats and having a comment telling users to use math.Min/Max/Compare if they need floats. If someone really needs a generic sort that includes fast float missorting they can write it themselves trivially.
@fzipp
Yes, github.com/google/go-cmp/cmp
is a popular package in the Go ecosystem and it is recommended by Google's Go style guide, but if you need both in a test file you can always use an import alias:
import (
"cmp"
testcmp "github.com/google/go-cmp/cmp"
)
Proposal #45200 for "testing/cmp" could be changed to "testing/equal" or "testing/diff" or something else. It would require reflection, so it should be kept separate from the "cmp" package that is discussed here.
variadic Min/Max has some weird corners.
You want to keep the property that Min(x, Min(y, z))
= Min(x, y, Min(z))
= Min(x, y, z, Min[T]())
where T
is the type of x
, y
, z
so Min[T]()
needs to return the greatest element of T
. Likewise, Max[T]()
must return the least element of T
.
That probably seems academic but consider this innocuous line of code y < Min(xs...)
when len(xs)
= 0.
This has two consequences. There needs to be some way for these to get the greatest/least element of an arbitrary T
which currently cannot be done without reflection and there is no greatest string so these can only be used for numbers.
You could have them panic when they receive an empty slice or make the signature something like Min(x T, xs ...T) T
so that it cannot receive 0 values but then you have to do Min(xs[0], xs[1:]...)
neither of which are especially ergonomic.
Binary Min/Max is simpler. Per @neild's suggestion, a slices.Min
/slices.Max
could also be added later once the language has the ability to statically inspect the type argument in some fashion.
@fzipp
I'd rather like to see the
cmp
package name for something like #45200, which is more about equality for testing. I don't know if Ordered/Min/Max would fit into such a package.
There is some implicit conventional usage of cmp
as an identifier for variables and functions in the standard library, for a function of two args returning -1/0/1, or the result of such a function. cmp
is used as a parameter name for this style of function in sort.Find
, slices.BinarySearchFunc
, and slices.CompareFunc
. Another abbrv. would be ord
.
edit: From the example of slices.BinarySearchFunc
, ord.Cmp[E,T] func(E,T) int
seems like a fit as a component of algorithms that really need a cmp
; BinarySearchFunc
seems like evidence that these algos exist and should be written in this style in Go. An ord.Min
or ord.Max
employ the builtin definition of <
... I'm not sure there's much use for constraints like ord.Total
, ord.Partial
, ord.Numeric
but they at least read naturally to me.
I like ord, but it should be ord.Compare (not Cmp) to match strings.Compare and math.Compare.
If Go will supported some ordered types which are hard to be described in type constraints, then it would better to add a builtin ordered
type IMO.
@go101 Yes, if Go ever supports ordered struct or array types, then it will be necessary to do something like define ordered
as a predeclared constraint, similar to comparable
. But there is no particular reason to think that that will happen. If it does happen, we can always write type Ordered ordered
and existing programs will continue to work.
I'm not personally excited about a package name "ord", but I could see order.Ordered
, order.Min
, order.Max
. I do tend to think that "sort" or "cmp" would be preferable, though. Unfortunate that there is already a popular package named "cmp" (of course it's not impossible to have two packages named "cmp", it's just unfortunate).
Sounds good to me, but is there any way to avoid the stutter of order.Ordered
? Maybe order.Interface
, order.Constraint
, or even order.Contract
or order.I
or something?
Sounds good to me, but is there any way to avoid the stutter of order.Ordered?
cmp.Ordered
:)
While I'd prefer to avoid having two common packages with the name cmp
, given that the current one is intended for tests, and the one described here is more likely to be used in non-test code, it would seem to be reasonable to introduce an alias in the rare case they're both used in the same file.
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
Thanks, but I see #58559 as somewhat different. I see that one as mainly about cmp.Lt
. I see this one as mainly about sort.Ordered
and sort.Min
. There is overlap certainly but a different focus.
I suspect that at least cmp.Min
and cmp.Max
, once available, will be rather pervasive in code. We may also want to consider just adding them (min
and max
) to the built-in functions, together with ordered
.
We do have precedent: append
and copy
, which now could be written in generic form (but for their special string argument handling), are present as built-ins because a) we didn't have generics at the time, but also b) because they are so widely used (append
anyway).
A built-in approach would also permit a more flexible implementation of min
and max
(for floats; or possibly accepting more than 2 arguments while permitting very efficient code w/o resorting to a loop in a generic implementation).
I am aware that this may not be a popular opinion, but I thought it should at least be documented.
I guess there are a couple of interlocking decisions to be made:
It seems like that proposal should be put on hold then pending the resolution of this one plus maybe a cycle or two for experience with whatever is chosen. The other ones are too strongly linked to pull apart.
Conditioned on it going into a new package, should that package also have operators like LT, LTE etc.? https://github.com/golang/go/issues/58559
@carlmjohnson You mention this a second time, but a decision on that still not a necessary precondition for this proposal.
Edit: Sorry, I misunderstood. You meant to put that one on hold, not this proposal here.
Ironically now that generics are in, I find myself favoring putting min
and max
into the language as builtins. It makes them easier to use and sidesteps most of the issues these issues (ha) are bringing up.
@griesemer Before I would add predeclared min
/max
functions and an ordered
constraint, I'd rather go back to the drawing board and reconsider the way we express constraints on type parameters to begin with.
When that was discussed at the time, I suggested to split the design up into two stages: First, to only allow constraining on method sets and then, after gathering experience with that, consider adding a way to use operators in a second stage. At the time, the response was (I'm paraphrasing, as I don't think I'll find the actual comment) "if we can't write a generic Min
/Max
function that works with predeclared types with our design, we'd get laughed out of the room". So, to me, adding these specific functions as builtins is essentially an admittance that the generics design failed what was at the time perceived a necessary condition for them.
But also, I don't actually see the benefit, to be honest. It seems to me that the question we are struggling with is one of semantics? A predeclared min
would still have to define how it acts on floats and I don't see (anymore) why a library Min
could not do the same thing. I'm not sure what you mean by a builtin being able to be more efficient than a loop, but a) I don't think that benefit justifies adding predeclared identifiers, personally and b) couldn't that also be addressed with better optimization?
I really don't like the idea of adding builtins for this. I feel that would go back on several design principles Go kept to in the past.
FWIW my (personal) opinion on what we should do is: Define a library Ordered
constraint and include ~float64 | ~float32
with it. Document that it will do the wrong thing in the presence of NaN
and -0
. Let the programmer use *Func
variants if they want to do something else instead.
Yes, it invites people doing the wrong thing accidentally. But I never viewed Go as a language which tried to address every possible way to do a thing badly by adding more typeing.
(As an aside: I find the name "partial order", established as it may be, confusing when applied to floats. To me, a partial order is a reflexive, transitive, anti-symmetric relationship and reserved for things like "ordering vectors element-wise" or "directed graph reachability". i.e. it expresses that some elements might be incomparable, not that it might be unreflexive. OTOH <
on floats seems to be a "strict partial order", which is a confusing term in its own rights. But maybe that's fair enough. Anyways, stepping off my soap box)
If min/max are not built-in, then I don't have any interest or motivation to use them, since it is easier to write min/max myself.
I don't see min/max/ordered as rising to the level of usefulness needed for a built in. They're useful sure, but the main reason to use them is for sorting, and there's no proposal (I hope) to make sort a built in.
I'm a little confused why this proposal got derailed by the Special handling of floats. I would assume that just matching what math.Min/math.Max does now would be the correct way to handle it. If it doesn't match it people will be confused. Yes this means that the implementation will need to specialcase the handling of floats but this does not take anything out from the utility of having these generic functions and in fact is why it is important to have these in the standard library as opposed to a 3rd party one; handling the corner cases in a consistent way.
I don't see min/max/ordered as rising to the level of usefulness needed for a built in
I think at least having a standard definition of "orderable" is valuable, like comparable. It can just be defined as compatible with the built-in operations like <
, which are already well-defined for floats.
But I also think Max and Min are useful in general, and debating the utility of such functions in the stdlib is...I don't know, stingy? It's a common operation, like uppercasing strings. I think not having them would also get us "laughed out of the room."
@willfaught The comment you are quoting is about builtin functions/constraints (i.e. predeclared identifiers). Not about standard library declarations. I think the heuristic is that lower-case is predeclared and upper-case is library.
I know. My point was that it would have the same judgment/reaction.
I think the heuristic is that lower-case is predeclared and upper-case is library.
I'm not sure what you're referring to? I was referring to strings.ToLower/Upper in Go, String.toUpperCase in JS, etc. It's a common stdlib thing to have because it's a common operation.
@willfaught
I know. My point was that it would have the same judgment/reaction.
To clarify: You mean if the canonical way to refer to the minimum/maximum or ordered constraint was via a standard library import instead of a predeclared identifier, people would ridicule Go? I strongly disagree with that.
Otherwise, I don't understand your comments. Because, again, the argument was against making them predeclared identifiers. Not against adding them to the standard library. I don't think doing the latter is very controversial.
Ah, I think I see the problem: I was interpreting "built in" in "I don't see min/max/ordered as rising to the level of usefulness needed for a built in" as meaning the stdlib, not predeclared identifiers. My mistake. I agree with you @Merovius that they shouldn't be predeclared identifiers.
@Merovius You make good points, and I don't really have a strong technical argument against your position. If anything, I just expressed a personal preference.
Personally, I'm not looking forward to seeing a lot of cmp.Min(x, y)
when I could see min(x, y)
which feels just so much more natural. Of course this is also true for, say math.Sin(x)
vs sin(x)
(if it were available), but there's a huge difference in frequency of use and complexity of operation. Putting min
and max
into built-ins feels like a pragmatic choice to me, no matter the fact that we can write them just fine as generic library functions.
I assume this would be a no-go, but a prelude-like stdlib package like in Haskell would be a way to enable having Max and Min available by default without being qualified, while allowing them to be normal declarations too.
As far as I can tell, the original motivation for considering these functions as being predeclared was because that would allow their behavior to be something that wouldn't be possible to implement in today's Go, such as generating specialized implementations for each type rather than relying exclusively on the <
(and similar) operators.
I didn't understand that as being primarily motivated by it being possible to use them without importing a package, even though that is another characteristic of a predeclared identifier. That's an implication of that decision, not a goal, I think?
For the sake of this proposal (which is about new library functionality) I think better to focus on how it might be implemented in the library and whether the new capability is worth its maintenance and teaching costs. If the conclusion is that it isn't possible to achieve the desired functionality in just the library today then a different proposal issue could discuss either a language change to make the library function feasible or, indeed, "just" making these be builtins.
Update, May 17 2023: Current proposal at https://github.com/golang/go/issues/59488#issuecomment-1548505279.
(Edited to change the package from sort to cmp).
I propose adding a new package cmp, which will initially define the following:
The
Min
andMax
functions are trivial but are widely used. The standard library contains 14 versions ofMin
and 7 versions ofMax
(not countingmath.Min
andmath.Max
). With such wide usage it's worth providing versions in the standard library.The
Ordered
constraint is useful forMin
andMax
and also for sorting functions such as those in the x/exp/slices package. This would replaceconstraints.Ordered
in the x/exp/constraints package. GitHub code search reports thatconstraints.Ordered
appears in 2300 files in GitHub, so people do find it useful.If we adopt this proposal, we would not add a constraints package to the standard library.
Ordered
,Min
, andMax
are related to comparing values. Therefore, a package cmp is a reasonable location for them. We could also add them to the existing package sort.Possible other additions to a cmp package would be names for the values returned by
bytes.Compare
and similar functions, such ascmp.Less
(-1
),cmp.Eq
(0
), andcmp.Greater
(1
). We could also add a function wrappers that reverses the sense of a less-than comparator, and one that reverses the sense of abyte.Compare
style comparator.We should define how
Min
andMax
treat NaN values when instantiated with a floating-point type. I propose that if the first argument is a NaN then it will be returned, otherwise if the second argument is a NaN then it will be returned, otherwise compare with<
. When comparing a negative zero and a positive zero, I propose that the first argument will be returned, as with equal values; that is bothMin(0.0, math.Copysign(0.0, -1))
andMin(math.Copysign(0.0, -1), 0.0)
will return0
.Other languages
C++ puts
std::min
andstd::max
in<algorithm>
. Go's standard library doesn't have anything comparable. The C++<algorithm>
header is a bit of a catch-all, including functions that Go has insort
andcontainer/heap
, and functions that Go might put into a possibleiter
package in the future.Java has
Math.min
andMath.max
, which are overloaded functions that takedouble
,float
,int
, orlong
arguments. Go already hasmath.Min
andmath.Max
, but since Go doesn't have overloading they only takefloat64
arguments. Most functions in Go's match package are most naturally defined for floating-point values, so even generic versions of them might not take all ordered values. Andmath.Ordered
doesn't seem right.Python has
min
andmax
builtin functions. There's no particular reasons to make these functions predeclared in Go, since we can write them directly in Go.Rust has
std::cmp::min
andstd::cmp::max
.std::cmp
also has the Rust equivalents ofcomparable
andOrdered
, and some support for comparator functions. This offers some support for using a cmp package in Go. On the other hand Rust's standard library has slice/vector sorting, and has various sorted container types, but as far as I know (which isn't far) doesn't have a general purpose sorting algorithm like the one in Go's sort package.