golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.07k stars 17.68k forks source link

constraints: new package to define standard type parameter constraints #45458

Closed ianlancetaylor closed 3 years ago

ianlancetaylor commented 3 years ago

Note: Discussion is now at #47319

This proposal is for use with #43651. The proposal document for that issue mentions adding a constraints package to define some standard useful constraints. Here we propose a specific API to define. If this proposal is accepted, the new package will be included with the first release of Go that implements #43651 (we currently expect that that will be Go 1.18).

This description below is focused on the API, not the implementation. In general the implementation will be straightforward.

The intent is that if we add any new predeclared types, such as int128 and uint128, those types will be supported by these constraints as appropriate based on the constraint descriptions. To make this work, code should follow these guidelines: 1) given a type parameter whose constraint is taken from the constraints package, don't use that type parameter to instantiate a type/function whose constraint is not any and is not taken from the constraints package; 2) don't write a type switch to handle all types when using a constraint taken from the constraints package. We can add vet checks to detect cases where these guidelines are not followed.

// Package constraints defines a set of useful constraints to be used with type parameters.
package constraints

// Signed is a constraint that permits any signed integer type.
type Signed interface { ... }

// Unsigned is a constraint that permits any unsigned integer type.
type Unsigned interface { ... }

// Integer is a constraint that permits any integer type.
type Integer interface { ... }

// Float is a constraint that permits any floating-point type.
type Float interface { ... }

// Complex is a constraint that permits any complex numeric type.
type Complex interface { ... }

// Ordered is a constraint that permits any ordered type: any type that supports the operators < <= >= >.
type Ordered interface { ... }

// Slice is a constraint that matches slices of any element type.
type Slice[Elem any] interface { ~[]Elem }

// Map is a constraint that matches maps of any element and value type.
type Map[Key comparable, Val any] interface { ~map[Key]Val }

// Chan is a constraint that matches channels of any element type.
type Chan[Elem any] interface { ~chan Elem }
gudvinr commented 3 years ago

I'd like to just quote this blogpost:

Avoid meaningless package names. Packages named util, common, or misc provide clients with no sense of what the package contains. This makes it harder for clients to use the package and makes it harder for maintainers to keep the package focused. Over time, they accumulate dependencies that can make compilation significantly and unnecessarily slower, especially in large programs. And since such package names are generic, they are more likely to collide with other packages imported by client code, forcing clients to invent names to distinguish them.
... Don't use a single package for all your APIs. ...

ianlancetaylor commented 3 years ago

@gudvinr I don't think this proposal fails on either of those points. This package is named "constraints", and it contains type constraints. Nothing else. It doesn't contain a bunch of different APIs.

But I would be happy to hear an alternative suggestion. Thanks.

smasher164 commented 3 years ago

Given that these constraints seem to be centered around describing builtin types, is there a reason something like comparable can't be declared here as well? I'm assuming the reason is something like "comparable requires a language spec change, while these constraints are just definitions." However the unsafe package is also described in the spec and is its own separate package.

zephyrtronium commented 3 years ago

What factors decide which constraints are defined in this package? E.g., why are there all of Signed, Unsigned, and Integer, but not, say, "types convertible to string" (string, []byte, []rune, rune) or "types appendable to []byte" (string, []byte)?

ianlancetaylor commented 3 years ago

@smasher164 Yes, as you say, comparable requires a language spec change. This proposed package isn't comparable to the unsafe package. The unsafe package exists only in the language spec. The source code found in the standard library is only documentation. The proposed constraints package would be an ordinary package like any other in the standard library.

@zephyrtronium I picked the constraints that seemed most likely to be useful. If you want to make the case for additional constraints to be defined by the package, this is the place to do it.

zephyrtronium commented 3 years ago

@ianlancetaylor

@zephyrtronium I picked the constraints that seemed most likely to be useful. If you want to make the case for additional constraints to be defined by the package, this is the place to do it.

I do think that the constraints I mentioned would be useful, but they might be better suited to living in packages strings and bytes. Similarly, I think the proposed Float, Complex, and Ordered might be more intuitive in math, math/cmplx, and sort, respectively, for reasons along the lines of @gudvinr's comment.

This would leave the integer constraints in an awkward spot. They'll certainly be useful, but were we to try to place these constraints in relevant standard library packages, there doesn't seem to be an existing home for them.

smasher164 commented 3 years ago

@zephyrtronium Given that math already holds information like MaxInt, I could see the integral constraints being placed there.

And I agree we should explore putting constraints in the stdlib differently from the way C++ provides generic algorithms and concepts in std::algorithms and <concepts>. Go is really good about putting definitions where they are most suitable.

Although one might distinguish types and metatypes as being on different levels of some hierarchy, a user might expect the constraint to be in a package with related types and operations. A type that implements sort.Interface's Less method and an Ordered type are conceptually similar.

gudvinr commented 3 years ago

This package is named "constraints", and it contains type constraints. Nothing else.
I picked the constraints that seemed most likely to be useful.

Your suggested interfaces only related to numeric types. So, when some developer will come up with another likely useful constraint for string manipulation he will probably put it into strings package.
But since it also likely useful and also constraint it might end up in constraints.

So it will either end up as ioutil with bunch of unrelated stuff or will contain only interfaces related to numeric types. But if it only contains numeric constrains maybe math and its subpackages is a better place for such constraints.

komuw commented 3 years ago

I think we should add Comparable to the constraints package even it is just a type alias to the builtin comparable.
This is so that you do not have to explain to newbies why they have to import some constraints from package constraints but they do not have to do the same for comparable

The same logic can be extended to the predeclared any constraint.

DeedleFake commented 3 years ago

@komuw I feel like they'll become very annoying to use if the standard is to import it, especially from a package with a long name like constraints:

type Map[K comparable, V any] struct { ... }
// vs.
type Map[K constraints.Comparable, V constraints.Any] struct { ... }

I also think that it'll be more confusing for new people if you have to explain why both constraints.Comparable and comparable exist and are completely identical, not to mention they could run into some code that uses one and some that uses the other and if they've never seen it before they'll probably get confused trying to figure out the difference.

Edit: Since the new idea is to think of them as sets of types as per #45346, maybe it would make more sense to name it something type set related to try to get it shortened a bit? I'm not sure what to name it, though. sets seems too generic, especially if container/set ever winds up being a thing, and tsets is just not right.

DeedleFake commented 3 years ago

@gudvinr As proposed it already contains non-numeric types. Ordered should have ~string in it.

meling commented 3 years ago

Having the constraints in one location promotes discoverability, but I agree with the other commenters that it feels more Go-like to keep them in relevant packages like math and strings etc.

ianlancetaylor commented 3 years ago

Personally I continue to think that it makes more sense for to have a single constraints package that contains common constraints. I think it is simpler to write

func F[T constraints.Integer]() { ... }

then to write

func F[T math.IntegerConstraint]() { ... }

I don't think it can just be math.Integer; that doesn't convey anything to the reader.

meling commented 3 years ago

Good point. I agree. But constraints is a bit long, especially with multiple constraints. I suspect many people will import them with . to avoid the long package name.

gudvinr commented 3 years ago

I don't think it can just be math.Integer; that doesn't convey anything to the reader.

Why though? You can only use constraints in type parameters. So it does convey that it is a constraint.

meling commented 3 years ago

I don't think it can just be math.Integer; that doesn't convey anything to the reader.

Why though? You can only use constraints in type parameters. So it does convey that it is a constraint.

Not when looking at the doc; there they are interfaces.

gudvinr commented 3 years ago

Not when looking at the doc; there they are interfaces.

Their location doesn't restrict you from using constraints.Integer as an interface if language allows it.
There's no package interfaces with common interfaces after all.

zephyrtronium commented 3 years ago

Their location doesn't restrict you from using constraints.Integer as an interface if language allows it.

Related to this, I think it is important to consider future language changes. If Go gets sum types that use type constraint syntax, then it no longer strictly makes sense to call the types in the proposed package just constraints. Contrarily, names like sort.TypeSet are future-proof in this sense.

DeedleFake commented 3 years ago

Maybe types makes more sense as a package name? It fits with the naming (types.Signed, types.Ordered, etc.), fits with the idea of type sets, and works with type set interfaces potentially becoming available as sum types later.

cespare commented 3 years ago

I think that if people are referring to constraints so frequently that typing constraints. becomes a hassle, there are other, deeper problems than the naming of the package.

We should expect that defining parameterized types and functions ought to be a lot less common than using them in real-world code that does useful things, and thus we need not optimize for code which excessively references the constraints package.

tooolbox commented 3 years ago

I don't support this proposal as given; my suggestion would be to revisit it some time after generics are added to the language. A couple of releases later, there may be enough constraints in the wild to analyze which are the most useful and canonicalize them by adding them to the stdlib. Conjecture about what people will use or not use seems premature, because once in the stdlib it can't change. Wait for 5 packages to have a Signed constraint before we standardize it.

ianlancetaylor commented 3 years ago

my suggestion would be to revisit it some time after generics are added to the language.

Certain kinds of generics are going to be annoying to write without this proposal. We can remove constraints if they don't seem useful, but I believe we at least need constraints.Ordered if nothing else.

tooolbox commented 3 years ago

Ordered looks like this, in the old syntax:

// Ordered is a type constraint that matches any ordered type.
// An ordered type is one that supports the <, <=, >, and >= operators.
type Ordered interface {
    type int, int8, int16, int32, int64,
        uint, uint8, uint16, uint32, uint64, uintptr,
        float32, float64,
        string
} 

That doesn’t seem atrocious to write in a package that needs it, does it? As an exercise, I’m curious generally how many functions you predict will need it? The example of Smallest is given in the proposal but you must be thinking of others.

fzipp commented 3 years ago

I once suggested the name is for a constraints package on the mailing list.

// Smallest returns the smallest element in a slice.
func Smallest[T is.Ordered](s []T) T

// Double returns a new slice that contains all the elements of s, doubled. 
func Double[E is.Number](s []E) []E
fzipp commented 3 years ago

I think that if people are referring to constraints so frequently that typing constraints. becomes a hassle, there are other, deeper problems than the naming of the package.

We should expect that defining parameterized types and functions ought to be a lot less common than using them in real-world code that does useful things, and thus we need not optimize for code which excessively references the constraints package.

To me it's about readability. The parameterized function signatures will be shown in the documentation. The typographic bulkiness of the word constraints might visually drown the rest of the signature and distract from the actual relevant parts.

gudvinr commented 3 years ago

I once suggested the name is for a constraints package on the mailing list.

What about poor souls with internal methods as (Whatever[T]).is() boolor variables is := IsDouble(another)?
is.XXX visually just isn't very distinctive from IsXXX or isXXX which might be a problem too.

fzipp commented 3 years ago

I once suggested the name is for a constraints package on the mailing list.

What about poor souls with internal methods as (Whatever[T]).is() boolor variables is := IsDouble(another)? is.XXX visually just isn't very distinctive from IsXXX or isXXX which might be a problem too.

Technically it's not a problem because of scopes and shadowing. Constraints are only used in type parameter lists or embedded in other constraints, so there's little room for mistaking the package name for something else.

jmank88 commented 3 years ago

Some 4 letter package name options:

Or a 6 letter, based on the description language:

:shrug:

urandom commented 3 years ago

I'm not really happy about the name. I'm not really sure why it's plural, for one. It's also too lengthy. Consider the various packages under container. They are separate, not defined under container itself, and you don't end up with container.List or container.Heap. With such a precedent, you could create a constraint/is package, and put any necessary constraints there. You'll end up with a type constraint that's a readable is.Ordered

nemith commented 3 years ago

Bike-shedding of the name aside, I think having some well defined constraints in the standard library will greatly improve the readability of type-paramaterized functions/structs, etc.

On the bike-shedding I would rather have a longer descriptive name than a "clever" name like is, only or anys. Coming in new to a languague you shouldn't need to explain why there is a non-descriptive name for a specific use.

uluyol commented 3 years ago

If typesets eventually become useable as unions, I don't think the name "constraints" will age well if you want to pass around any integer value.

Considering that "builtintypesets" is too long, what about calling the package "typesets"?

fzipp commented 3 years ago

Another point regarding naming (if not now, then when): Singular form makes more sense when reading, because we refer to one constraint at a time: constraint.Ordered

Package names in the standard library are usually not pluralized unless there is a name clash with a predeclared type to be avoided (bytes, errors, strings): flag, builtin, list, ring, context, expvar, image, color, palette, path, plugin, template.

jimmyfrasche commented 3 years ago

The only issue that I see is using constraints.Ordered for an n-ary generic Min: Min[T]() for any Ordered T should be defined as the greatest element of T but there is no such element for string. (There is no issue with Max[string]() which would return the least element which is simply the empty string).

Min could define its own constraints similar to Ordered but without containing string but then that constraint would need to be updated if int128 or somesuch type is added.

There would need to be some mechanism for getting the greatest element of T that would itself need to be updated if something like int128 were added, of course.

If n-ary generic Min is added to std presumably it could take care of defining the appropriate constraint (but then where does this go and what is it called?)

I'm not sure if this is really a problem, but, if it is, one option is to have another constraint that's "Ordered but without string" or a constraint of all numbers so that its intersection with Ordered removes string and the complex numbers.

A stdlib Min[string]() could also just be defined to panic in that one weird case but then it couldn't be used as the mechanism to easily grab the greatest element of a type parameter unless the code calling it had its own "Ordered but not string" constraint to guard against the potential panic.

nemith commented 3 years ago

The only issue that I see is using constraints.Ordered for an n-ary generic Min: Min[T]() for any Ordered T should be defined as the greatest element of T but there is no such element for string. (There is no issue with Max[string]() which would return the least element which is simply the empty string).

Strings still have a < operator that works on them. It may not be all that useful, but I am not sure why that should be an issue here. There could be a constraints.Numeric that would include all numeric types but leave out strings. I don't think this proposal is showing all potential contraints, but and example of what a constraints package would have.

There would need to be some mechanism for getting the greatest element of T that would itself need to be updated if something like int128 were added, of course.

You should be able to embed contraints's interfaces into your own custom interfaces which would serve the same purpose. (However I am not sure of the rules of how they would combined)

"Ordered but not string" constraint to guard against the potential panic.

Why would it panic. You can call < on a string and it's valid code https://play.golang.org/p/-Qsj433mR_R

Again calling min against a string may not make sense, but it still works and engineering a solution around not allowing it seems like a waste of time.

Merovius commented 3 years ago

@jimmyfrasche Another solution is to make the signature Min[T constraints.Ordered](a T, b ...T) T. That way there is always a well-defined minimum in the list, though it's more awkward to call with a slice. A third solution might be to just say that you have to have at least one argument and panic if you pass an empty slice - less type-safe, but maybe okay.

Merovius commented 3 years ago

@nemith

Strings still have a < operator that works on them. It may not be all that useful, but I am not sure why that should be an issue here.

@jimmyfrasche's point is that Min and Max conventionally return the maximum and minimum possible element when passed an empty list. Just like an empty product is conventionally one and an empty sum is conventionally zero. An operation evaluating to its neutral element, when used with an empty list, makes many somewhat common expressions easier to write. And the neutral element of Min is the maximum element (Min(a, ∞) = a) and similar for Max.

The issue with string is that there is no "largest string".

jimmyfrasche commented 3 years ago

@Merovius that would work though then you'd have to do Min(slice[0], slice[1:]...) and you lose the ability to call Min[T]() to get the greatest element which can be used as infinity in your code.

Also there are probably plenty of cases where you want to use more arithmetic operations than just + on Ordered elements and you run into the same issue with having to frankenstein your own OrderedNumeric together if there's not an easy way to do that.

Merovius commented 3 years ago

Also there are probably plenty of cases where you want to use more arithmetic operations than just + on Ordered elements and you run into the same issue with having to frankenstein your own OrderedNumeric together if there's not an easy way to do that.

I think for functions that want to do arithmetic, we should have a Numeric constraint (I'm surprised that @ianlancetaylor didn't include one). "Ordered without string" would then be interface { Ordered; Numeric }, which seems reasonably convenient and to express clearly what you want. We could consider adding that as well, though (someone suggested calling that Real, which might be the best option, though I might object on philosophical grounds :) ).

Merovius commented 3 years ago

And FWIW:

you lose the ability to call Min[T]() to get the greatest element which can be used as infinity in your code.

That seems obscure to me anyways - the name of the function would say the opposite of what it does. I understand the logic behind it, but I think it's significantly less clear than having a Largest[T Real]() T and Smallest[T Ordered]() T or whatever. Which could be separately constrained from Min/Max, which would also make the differences clear.

rsc commented 3 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

bcmills commented 3 years ago

I'm a bit concerned about the Ordered constraint. If a user writes the Ordered constraint, I doubt they'll be thinking about NaNs — especially if nothing in their constraints mentions float explicitly, and especially if they only see one Ordered type in the constraints package.

“Ordered”, in the mathematical sense of a partially ordered set, is reflexive: x <= x for all x. However, for floating-point types, it is not the case that x <= x when x is a NaN value, which may be surprising in packages that (for example) want to implement a stable sorting algorithm.

So I wonder if it would make sense to leave out the Ordered constraint, especially since it can already be written fairly concisely (as interface { constraints.Integer | constraints.Float | ~string }, using the syntax in #45346).

ianlancetaylor commented 3 years ago

My sense, based on sample code I've seen so far, is that Ordered will be the most commonly used of the constraints defined in this package. I think it would be a mistake to leave it out. But perhaps we could omit the floating point types from Ordered, to force people to consider NaNs. Then we could consider adding another constraint that is Ordered | Float, though I'm having a hard time thinking of the right name for that: OrderedOrNaN?

OneOfOne commented 3 years ago

As I mentioned in #45039, I'm in favor of making them just global types, I understand that they don't need to be implemented in the compiler but it's just a lot cleaner from a programming-standpoint.

Specially for Ordered.

ordered, numeric, signed, unsigned, float, complex

func Smallest[T ordered](s []T) T // vs
func Smallest[T constraints.ordered](s []T) T

type Something[T numeric] struct {....}
func (s *Something[T numeric]) Method() {} // vs

type Something[T constraints.Numeric] struct {....}
func (s *Something[T constraints.Numeric]) Method() {} 
twmb commented 3 years ago

For prior art, Rust has PartialOrd, which I've been partial to, rather than OrderedOrNaN.

Rust implements PartialOrd for f64 by returning None if left <= right and left >= right both return false.

Merovius commented 3 years ago

float64 is not a partial order though. Partial orders are still reflexive.

komuw commented 3 years ago

I would be okay with an Ordered constraint that includes floats, if its behaviour when NaN's are involved is clearly documented on the docstring.

Analogous to the documented behaviour of NaN in sort.Float64 : https://golang.org/pkg/sort/#Float64s

bcmills commented 3 years ago

It's unfortunate that the current spec already uses the term “ordered” for types that support <=, but maybe that's enough of a reason to name the constraint Ordered.

I suppose that the built-in comparable constraint in the approved proposal is analogous: it includes interface types, which have the similar problem that == is defined for the type but may panic at run-time for certain values.

benjaminjkraft commented 3 years ago

Is adding types in the future going to cause compatibility problems? For example:

type MySigned interface { type int8, int16, int32, int64 }
func MyAbs[T MySigned](v T) T { ... }
func Abs[T constraints.Signed](v T) T { return MyAbs(v) }

(Of course presumably Abs is in some other package, and perhaps the type lists involved are more complex.) That compiles today, but not in a future Go version where constraints.Signed contains int128. It doesn't seem like this is enough to block the addition of constraints, and I don't see an obvious way to avoid it, but it's at least a bit awkward.

Merovius commented 3 years ago

@benjaminjkraft I don't think it's fully clear under what circumstances it will be allowed to instantiate a generic function using a type parameter. Especially if #45346 is accepted. I could well imagine that we'll have to allow your example, as long as int128 supports the same operators as the other int* do.

Either way, I think the ability to do this kind of extension seems important enough that we should make a stability exception in the form of a disclaimer for it. But it's definitely an interesting question.

deanveloper commented 3 years ago

I propose one more type to be added:

type Arithmetic interface { ... } // supports +, -, *, and /

This is useful for functions which can take integers, floats, and complex numbers as inputs, such as the quadratic formula:

func Quadratic[T constraints.Arithmetic](a, b, c T) (T, T) {
    return (-b + sqrt(b*b - 4*a*c)) / (2*a), (-b - sqrt(b*b - 4*a*c)) / (2*a)
}

(where sqrt has signature func sqrt[T constraints.Arithmetic](in T) T and does what you expect)

Another spelling could be Number, but I figured Arithmetic was less easily mistakable, one might assume that Number is orderable.

EDIT - quadratic formula is a bit of a bad example for integers because of strange interactions with division and square roots... some simple examples that work well with all 3 “kinds” of numbers may be Sum, Product, etc