proposal: Go 2: sum types using interface type lists

ianlancetaylor commented 4 years ago

This is a speculative issue for discussion about an aspect of the current generics design draft. This is not part of the design draft, but is instead a further language change we could make if the design draft winds up being adopted into the language.

The design draft describes adding type lists to interface types. In the design draft, an interface type with a type list may only be used as a type constraint. This proposal is to discuss removing that restriction.

We would permit interface types with type lists to be used just as any other interface type may be used. A value of type T implements an interface type I with a type list if

the method set of T includes all of the methods in I (if any); and
either T or the underlying type of T is identical to one of the types in the type list of I.

(The latter requirement is intentionally identical to the requirement in the design draft when a type list is used in a type constraint.)

For example, consider:

type MyInt int
type MyOtherInt int
type MyFloat float64
type I1 interface {
    type MyInt, MyFloat
}
type I2 interface {
    type int, float64
}

The types MyInt and MyFloat implement I1. The type MyOtherInt does not implement I1. All three types, MyInt, MyOtherInt, and MyFloat implement I2.
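For comparison, here is a minimal sketch of the same distinction written in the union syntax that Go 1.18 later shipped for constraints, not the draft `type ...` list syntax used in this issue. It assumes that a bare `MyInt` term admits only that exact type while `~int` admits every type whose underlying type is `int`. The helper names `doubleI1` and `doubleI2` are made up for illustration, and these interfaces are usable only as constraints, which is the restriction this issue proposes lifting.

```go
package main

import "fmt"

type MyInt int
type MyOtherInt int
type MyFloat float64

// I1 lists defined types: only MyInt and MyFloat themselves qualify.
type I1 interface {
	MyInt | MyFloat
}

// I2 lists underlying types: any type whose underlying type is int or
// float64 qualifies.
type I2 interface {
	~int | ~float64
}

// doubleI1 and doubleI2 are hypothetical helpers. They compile only for
// type arguments that satisfy the respective constraint.
func doubleI1[T I1](v T) T { return v + v }
func doubleI2[T I2](v T) T { return v + v }

func main() {
	fmt.Println(doubleI1(MyInt(2)), doubleI1(MyFloat(1.5)))
	fmt.Println(doubleI2(MyInt(2)), doubleI2(MyOtherInt(3)), doubleI2(MyFloat(1.5)))
	// doubleI1(MyOtherInt(3)) would not compile: MyOtherInt is not listed in I1.
}
```

The `+` inside the helpers is legal because the type argument is fixed at instantiation, which is the property the proposal says is lost once such an interface holds arbitrary values.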

The rules permit an interface type with a type list to permit either exact types (by listing non-builtin defined types) or types with a particular structure (by listing builtin defined types or type literals). There would be no way to permit the type int without also permitting all defined types whose underlying type is int. While this may not be the ideal rule for a sum type, it is the right rule for a type constraint, and it seems like a good idea to use the same rule in both cases.

Edit: This paragraph is withdrawn. We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

In all other ways an interface type with a type list would act exactly like an interface type. There would be no support for using operators with values of the interface type, even though that is permitted when using such a type as a type constraint. This is because in generic code we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type list, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)

In particular, the zero value of an interface type with a type list would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

As I said above, this is a speculative issue, opened here because it is an obvious extension of the generics design draft. In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412. Thanks.

Merovius commented 4 years ago

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

I don't understand this. If all types in the type list appear as cases, the default case would never trigger, correct? Why require both?

Personally, I'm opposed to requiring to mention all types as cases. It makes it impossible to change the list. ISTM at least adding new types to a type-list should be possible. For example, if go/ast used these proposed sum types, we could never add new node-types, because doing so would break any third-party package using ast.Node. That seems counterproductive.

I think requiring a default case is a good idea, but I don't like requiring to mention all types as cases.

There is another related question. It is possible for such a sum-value to satisfy two or more cases simultaneously. Consider

type A int

type X interface {
    type A, int
}

func main() {
    var x X = A(0)
    switch x.(type) {
    case int: // matches, underlying type is int
    case A: // matches, type is A
    }
}

I assume that the rules are the same as for type-switches today, which is that the syntactically first case is matched? I do see some potential for confusion here, though.
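As a point of reference, this sketch shows how an ordinary type switch on `interface{}` behaves in Go today; whether a type-list interface would behave the same way is the open question here. It assumes nothing beyond the current language: a `case int` compares the dynamic type for identity, and when several cases could match, the first one listed wins. The `String` method on `A` and the `classify` helper are additions for illustration only.

```go
package main

import "fmt"

type A int

// String makes A satisfy fmt.Stringer so that two cases can match it.
func (A) String() string { return "A" }

// classify reports which case of an ordinary type switch is selected.
func classify(v interface{}) string {
	switch v.(type) {
	case int:
		return "int" // selected only when the dynamic type is exactly int
	case fmt.Stringer:
		return "Stringer" // A matches here first
	case A:
		return "A" // not reached for A: an earlier case already matched
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify(0))    // prints "int"
	fmt.Println(classify(A(0))) // prints "Stringer"
	fmt.Println(classify("x"))  // prints "other"
}
```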

griesemer commented 4 years ago

[edited]

@Merovius It does say "...and if there are any types in the type list that do not appear as cases in the type switch." Specifically, there is no comma between "default case" and "and". Perhaps that is the cause for the confusion?

Regarding the multiple cases scenario: I think this would be possible, and it's not obvious (to me) what the right answer here would be. One could argue that since the actual type stored in x is A that perhaps that case takes precedence.

tooolbox commented 4 years ago

It does say "...and if there are any types in the type list that do not appear as cases in the type switch."

Makes sense. I can see how the language is a little ambiguous; the point is that it's a compile error if both of those conditions exist.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

It occurred to me that tooling could spot when a type switch branch was invalid, i.e. the interface type list only contains A and B and your switch checks for C, but it seems best to not make that a compiler error. A linter could warn about it, but being overly restrictive here might harm backwards-compatibility.

Regarding the multiple cases scenario: I think this would be possible, and it's not obvious (to me) what the right answer here would be. One could argue that since the actual type stored in x is A that perhaps that case takes precedence.

I think it makes the most sense for the type switch to behave consistently. It's not clear to me how the type switch would be any different except that the interface being switched on has a type list. You can know at compile-time what branches should be in the switch, but that's it.

Overall I'm in favor; I think the proposal is right on the money. They function like any other interface (no operators), and the zero value is nil. Simple, consistent, unifies semantics with the Generics proposal. 👍

Merovius commented 4 years ago

@griesemer Ah, I think I understand now. I actually misparsed the sentence. So AIUI now, the proposal is to require either a default case or to mention all types, correct?

In that case, the proposal makes more sense to me and I'm no longer confused :) I still would prefer to require a default case, though, to get open sums. If it is even allowed to not have a default case, it's impossible to add new types to the type-list (I can't know if any of my reverse dependencies does that for one of my exported types, so if I don't want to break their compilation, I can't add new types). I understand that open sums seem less useful to people who want sum types, though (and I guess that's at the core of why I consider sum types to be less useful than many people think). But IMO open sums are more adherent to Go's general philosophy of large-scale engineering and the whole gradual repair mechanism - and also more useful for almost all use-cases I see sum types suggested for. But that's just my 2¢.

griesemer commented 4 years ago

@Merovius Yes, your new reading is correct.

mvdan commented 4 years ago

In all other ways an interface type with a type list would act exactly like an interface type. [...] So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

Could you clarify why nil should always be an option in such sum types? I understand this makes them more like a regular interface, but I'm not sure if that consistency benefit outweighs how it makes them less useful.

For example, they could be left out by default, or included by writing nil or untyped nil as one of the elements in the type list.

I understand that the zero value gets trickier if we remove the possibility of nil, which might be the reason behind always including nil. What do other languages do here? Do they simply not allow creating a "zero value" of a sum type?

mvdan commented 4 years ago

To add to my comment above - @rogpeppe's older proposal in https://github.com/golang/go/issues/19412#issuecomment-288485048 does indeed make nil opt-in, and the zero value of the sum type becomes the zero value of the first listed type. I quite like that idea.

jimmyfrasche commented 4 years ago

@mvdan as far as I'm aware other languages with sum types do not have the notion of a zero value and either require a constructor or leave it undefined. It's not ideal to have a nil value but getting something that works both as a type and a metatype for generics is worth the tradeoff, imo.

Merovius commented 4 years ago

I guess (as a nit) nil should also be a required case if no default case is given, if we make nil a valid value.

jimmyfrasche commented 4 years ago

So this https://go2goplay.golang.org/p/5L7T8G9rfLD would print "something else" under the current proposal, correct? The only way to get that value is reflect?

ianlancetaylor commented 4 years ago

@jimmyfrasche Correct. This proposal doesn't change the way that type switches operate, except for the suggested error if there are omitted cases.
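The linked playground snippet is not reproduced in this thread, so the following is only a guess at its shape: a value of a defined type with underlying type `int` is switched on with a `case int`. Under today's type switch rules, which the proposal leaves unchanged, the value falls through to `default`, and reflection is one way to read it anyway. The names `MyInt`, `describe`, and `intValue` are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyInt int

// describe mimics a type switch that lists only the predeclared type.
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	default:
		return "something else"
	}
}

// intValue uses reflection to read any value whose underlying type is int.
func intValue(v interface{}) (int64, bool) {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Int {
		return 0, false
	}
	return rv.Int(), true
}

func main() {
	fmt.Println(describe(MyInt(7))) // prints "something else"
	fmt.Println(intValue(MyInt(7))) // prints "7 true"
}
```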

Merovius commented 4 years ago

So that means this code would panic? https://go2goplay.golang.org/p/vPC-qtKb7VO That seems strange and as if it makes these sum types significantly less useful.

jimmyfrasche commented 4 years ago

I'd like to reiterate my earlier suggestion: https://groups.google.com/g/golang-nuts/c/y-EzmJhW0q8/m/XICtS-Z8BwAJ

tooolbox commented 4 years ago

I'd like to reiterate my earlier suggestion: https://groups.google.com/g/golang-nuts/c/y-EzmJhW0q8/m/XICtS-Z8BwAJ

Not terribly excited about the new syntax.

That seems strange and as if it makes these sum types significantly less useful.

Well, it panics without the type list. But I get your point, the interface then allows a value for which the compiler won't enforce a branch in a type switch.

Could we perform implicit conversion to one of the listed types when you assign into the interface? The only case I can think of where that's weird is when the interface has methods that those underlying types don't have, i.e. you have type Foo interface{ type int; String() string }, so implicit conversion to int itself violates the interface.

While I really like the idea of unifying interface semantics by allowing type lists in interfaces used as values, rather than just as constraints, perhaps the two use cases are different enough that the interfaces you'd use for each vary significantly. Maybe this problem we're discussing isn't one that we'd encounter in real code? It might be time to break out some concrete examples.

jimmyfrasche commented 4 years ago

Any explicit syntax would work. I just had to choose something semi-reasonable to write the idea down. At any rate, it wouldn't need to be used very often but having the choice lets everything work reasonably without either use being hindered by the existence of the other.

ianlancetaylor commented 4 years ago

@Merovius Correct: that code would panic.

I think it's worth discussing whether that would in fact make these sum types significantly less useful. It's not obvious to me, because it's not obvious to me that sum types are often used to store values of types like int. I agree that if that is a common case, then this form of sum types is not all that useful, but when does that come up in practice, and why?
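The second playground link is also not reproduced here. Assuming it asserts a value of a defined type to its predeclared underlying type, the behavior under discussion can be sketched with today's `interface{}`: the single-value assertion panics, while the comma-ok form reports the mismatch. `MyInt`, `asInt`, and `mustInt` are illustrative names.

```go
package main

import "fmt"

type MyInt int

// asInt uses the comma-ok form, which reports failure instead of panicking.
func asInt(v interface{}) (int, bool) {
	n, ok := v.(int)
	return n, ok
}

// mustInt uses the single-value form and recovers the resulting panic so
// that the caller can inspect it as an error.
func mustInt(v interface{}) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("type assertion failed: %v", r)
		}
	}()
	return v.(int), nil
}

func main() {
	fmt.Println(asInt(MyInt(1))) // prints "0 false"
	_, err := mustInt(MyInt(1))
	fmt.Println(err != nil) // prints "true"
}
```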

jimmyfrasche commented 4 years ago

You could always work around it by using type LabeledInt int in the sum but that means having to create additional types. fwiw json.Token is a sum type in the standard library that takes bool and float64 and string.

Merovius commented 4 years ago

@ianlancetaylor Point taken. I can't personally really provide any evidence or make any strong arguments, because I'm not convinced sum types in and of themselves are actually very useful :) I was trying to extrapolate. Either way, I also find it objectionable on an aesthetic level, to single out predeclared types in this way - but that's just subjective, of course.

neild commented 4 years ago

Regarding changing a type list being a breaking change: If the type list contains an unexported type, then the rule in @ianlancetaylor's proposal effectively requires that all type switches outside the package containing the sum type contain a default case.

For example,

package p
type mustIncludeDefaultCase struct{}
type MySum interface {
  type int, float64, mustIncludeDefaultCase
}

Regarding nil-ness of sum types: I find it strange that the proposed rules require type switches to exhaustively cover the possible types in the sum or include a default case, but don't require covering the nil case.

type T interface { type int16, int32 }
func main() {
  var x T

  // None of these cases will execute, because x is nil.
  switch x.(type) {
  case int16:
  case int32:
  } 
}
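The same situation can be reproduced with an ordinary interface in Go today, which is a rough stand-in for the proposed type-list interface. In this sketch a nil interface value skips every type case unless the switch includes an explicit `case nil`; the `which` helper is an illustrative name.

```go
package main

import "fmt"

// which reports the case an ordinary type switch selects, including nil.
func which(v interface{}) string {
	switch v.(type) {
	case int16:
		return "int16"
	case int32:
		return "int32"
	case nil:
		return "nil" // selected when v holds no value at all
	}
	return "unmatched"
}

func main() {
	var x interface{} // zero value of an interface type is nil
	fmt.Println(which(x))        // prints "nil"
	fmt.Println(which(int16(1))) // prints "int16"
	fmt.Println(which(int64(1))) // prints "unmatched"
}
```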

I personally would prefer a design in which the zero value of a sum is the zero value of the first type in the sum. It is easy to add an additional "nothing here" case when desired, but impossible to remove a mandatory nil case when not.

rogpeppe commented 4 years ago

In general I'm in favour of this proposal, but I think there are some issues that need to be solved first.

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

If type switches aren't changed at all, then I don't see how this rule is useful. It feels like it's attempting to define that switches on type-list interfaces are complete, but it's clear that they can never be complete when the type list contains builtin types, because there are any number of other non-builtin types there could be.

In general, even putting the above aside, I don't think the requirement for a type switch statement to fully enumerate the cases or the requirement to have a default fits well with the rest of the language. It's common to just "fall off the bottom" of a switch statement if something doesn't match, and that seems just as apt to me for a type-list type switch as with any other switch or type switch. In general, the rule doesn't feel very "Go-like" to me.

What about type assertions involving type-list interfaces? Can I do this?

type I1 interface {
    type string, []byte
}
var x I1 = "hello"
y := x.(string)

If not, why not? If so, why is this so different from a type switch with a single case and no default branch?

What about this (type asserting to a type list interface)?

x := y.(I1)

If that works, presumably this could be used to test the underlying type of the dynamic type of an interface, which is something that's not easy to do even with reflect currently.
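For contrast, this is roughly what testing the underlying type of a dynamic type looks like with `reflect` today, restricted to the `string` and `[]byte` case from the example above. It is a sketch rather than a general solution: each kind of type literal would need its own structural check. `Name`, `Blob`, and `isStringOrBytes` are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
)

type Name string
type Blob []byte

var byteType = reflect.TypeOf(byte(0))

// isStringOrBytes reports whether the dynamic type of v has underlying type
// string or []byte.
func isStringOrBytes(v interface{}) bool {
	t := reflect.TypeOf(v)
	if t == nil { // v is a nil interface value
		return false
	}
	switch t.Kind() {
	case reflect.String:
		return true
	case reflect.Slice:
		// The element type must be exactly byte for the underlying
		// type to be identical to []byte.
		return t.Elem() == byteType
	}
	return false
}

func main() {
	fmt.Println(isStringOrBytes(Name("a")))  // prints "true"
	fmt.Println(isStringOrBytes(Blob{1, 2})) // prints "true"
	fmt.Println(isStringOrBytes([]int{1}))   // prints "false"
}
```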

The rules permit an interface type with a type list to permit either exact types (by listing non-builtin defined types) or types with a particular structure (by listing builtin defined types or type literals). There would be no way to permit the type int without also permitting all defined types whose underlying type is int. While this may not be the ideal rule for a sum type, it is the right rule for a type constraint, and it seems like a good idea to use the same rule in both cases.

I understand why this rule is proposed - using the same rule in both cases is important for consistency and lack of surprises in the language. However, ISTM that this rule gives rise to almost all the discomfort I have with this proposal:

we can switch on all the types in the type list without having a guarantee of getting a match
there's a many-to-one correspondence between the types named in the interface type and the dynamic types that the interface can take on

If we don't allow an interface type with a type list to match underlying types too, then you end up with surprises with assignments in generic functions. For example, this wouldn't be allowed, because F might be instantiated with a type that isn't int64 or int:

type I interface {
    type int64, int
}

func F[T I](x T) I {
    return x
}

How about working around this by adding the following restriction:

Only generic type parameter constraints can use type-list interfaces that contain builtin types.

So the above example would give a compile error because I, which contains a builtin type, is being used as a normal interface type.

The above restriction would make almost all the issues go away, I think - albeit at the cost of some generality.

There would be no support for using operators with values of the interface type, even though that is permitted when using such a type as a type constraint. This is because in generic code we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type list, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)

Allowing operators is only a problem for binary operators AFAICS. One thing that might be interesting to allow is operators that do not involve more than one instance of the type. For example, given:

type StringOrBytes interface {
     type string, []byte
}

I don't think that there would be any technical problem with allowing:

var s StringOrBytes = "hello"
s1 := s[2:4]
n := len(s)
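For reference, the Go 1.18 constraint syntax already permits this kind of single-operand operation on a type parameter, though not on an interface value as suggested here. The sketch below assumes that `len` and indexing are accepted for a type set of `~string | ~[]byte`; it avoids slice expressions to keep the assumptions small, and `firstAndLen` is a made-up helper.

```go
package main

import "fmt"

// StringOrBytes is written as a Go 1.18 constraint, not as the draft
// type-list interface from this issue.
type StringOrBytes interface {
	~string | ~[]byte
}

// firstAndLen applies len and indexing, both of which are defined for every
// type in the constraint's type set.
func firstAndLen[T StringOrBytes](s T) (byte, int) {
	if len(s) == 0 {
		return 0, 0
	}
	return s[0], len(s)
}

func main() {
	fmt.Println(firstAndLen("hello"))      // prints "104 5"
	fmt.Println(firstAndLen([]byte("go"))) // prints "103 2"
}
```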

In particular, the zero value of an interface type with a type list would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

I think that always having nil as a possibility is a bit of a shame and I don't think it's absolutely necessary as @mvdan pointed out, but I'd still support this proposal even with nil, for the record.

bcmills commented 4 years ago

As far as I can tell, this proposal nearly parallels the defined-sum interface types in my previous writeup.

There is one key difference that I would like to explore: assignability. To me, assignability is what makes interface types coherent at all: it is what causes interface types to have regular subtyping relationships, which other Go types lack.

This proposal does not mention assignability at all. That seems like an oversight. In particular:

If all of the types in the sum type S are defined (not underlying) types, and all of those types implement the methods of an ordinary interface type I, should a variable of type S be assignable to I?
If all of the types in the sum type S1 are also in the sum type S2, should a variable of type S1 be assignable to S2? (This is especially important when I is itself a type-list interface: it seems clear to me that a variable of type interface { int8, int16 } should be assignable to a variable of type interface { int8, int16, int32 }.)
Should a variable of any sum type be assignable to interface{}?

@mvdan: for me, the assignability properties are what lead to the conclusion that all sum types should admit nil values. It would be strange for the zero-value of type interface { int8, int16 } to be different from the zero-value of type interface{} if the former is assignable to the latter, and it would be even stranger for the zero-value of type interface { *A, *B } to be assignable to an interface implemented by both *A and *B but to have a non-nil zero-value.

bcmills commented 4 years ago

I believe that this proposal is compatible with (in the sense that it would not preclude later addition of) the sum interface types from my previous writeup, which I think are a closer fit to what most users who request “sum types” have in mind.

bcmills commented 4 years ago

We propose further that in a type switch on an interface type with a type list, it would be a compilation error if the switch does not include a default case and if there are any types in the type list that do not appear as cases in the type switch.

This constraint seems like a mistake to me. Due to the underlying-type expansion, even a switch that enumerates all of the types in the type list could be non-exhaustive.

I think that either the default constraint should be dropped, or a default case should also be required for any type-list interface that includes any type that could be an underlying type (that is, any literal composite type and any predeclared primitive type).

deanveloper commented 4 years ago

Would unary operators be defined on the types if possible? i.e.:

type Int interface {
    type int, int32, int64
}
func Neg(i Int) Int {
    return -i
}

I would assume not, since unary operators are defined to expand to binary operators; however, the unary operators are valid to use since the other operand is untyped. This could, though, result in unexpected behavior for sum types that include uint types.
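The unsigned concern can be seen with a type parameter in the Go 1.18 constraint syntax, where negation is permitted because the operand type is fixed at instantiation. This is a sketch of the constraint-side behavior only; it says nothing about what the proposal would allow on interface values. The `Integer` constraint and its inclusion of `~uint8` are choices made for illustration.

```go
package main

import "fmt"

// Integer deliberately mixes signed and unsigned types.
type Integer interface {
	~int | ~int32 | ~int64 | ~uint8
}

// Neg negates its argument; for integers -x is defined as 0 - x, so
// unsigned values wrap around.
func Neg[T Integer](i T) T {
	return -i
}

func main() {
	fmt.Println(Neg(int32(5))) // prints "-5"
	fmt.Println(Neg(uint8(1))) // prints "255"
}
```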

bcmills commented 4 years ago

Finally, I would like to note that this proposal on its own would expose an (existing) inconsistency in the generics design draft: the semantics of a type-list interface used as a constraint would be markedly different from the semantics of any other interface type used as a type constraint.

In particular, unlike other interface types, a type-list interface would not satisfy itself as a type constraint. If it did, then the type constraint interface { int8, int16 } would not be sufficient to allow the use of mathematical operators on the constrained type, because the type interface { int8, int16 } itself does not have defined mathematical operators.

(See https://github.com/bcmills/go2go/blob/master/typelist.md#type-lists-are-not-coherent-with-interface-types for more detail.)

ianlancetaylor commented 4 years ago

Clearly people do not like the type switch part of this proposal, so let's consider that to be withdrawn. It's not the important part.

ianlancetaylor commented 4 years ago

@rogpeppe

How about working around this by adding the following restriction:

Only generic type parameter constraints can use type-list interfaces that contain builtin types.

We could definitely do that, but I think it's an awkward requirement. The point of this proposal, as I see it, is to keep the rules of the design draft and extend them for sum types. I think that if we start tweaking the rules, we lose the benefit of this proposal, and it would probably be better to consider other approaches for sum types.

That is, I think it's fine if we say "this proposal doesn't give us what we want for sum types, so let's not adopt it." But I would argue that if we say "let's adopt this proposal but make interfaces-with-type-lists behave differently when used as type constraints and when used as ordinary types," then this proposal is no longer providing a benefit that is worth the cost. Better than that would be to adopt a different version of sum types that is not easily confused with type constraints. (Or, of course, not adopt sum types at all.)

ianlancetaylor commented 4 years ago

@bcmills

If all of the types in the sum type S are defined (not underlying) types, and all of those types implement the methods of an ordinary interface type I, should a variable of type S be assignable to I?

We could certainly discuss that possibility, but, in the current proposal, no.

If all of the types in the sum type S1 are also in the sum type S2, should a variable of type S1 be assignable to S2? (This is especially important when I is itself a type-list interface: it seems clear to me that a variable of type interface { int8, int16 } should be assignable to a variable of type interface { int8, int16, int32 }.)

Yes.

Should a variable of any sum type be assignable to interface{}?

Yes.

(The other interesting cases are what happens with embedding, which is outlined at https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md#type-lists-in-embedded-constraints)

ianlancetaylor commented 4 years ago

@bcmills

I believe that this proposal is compatible with (in the sense that it would not preclude later addition of) the sum interface types from my previous writeup, which I think are a closer fit to what most users who request “sum types” have in mind.

I think that idea is fine, but I want to make clear that in my opinion, if we think we are going to adopt that idea, then we should not adopt this one. We don't need both. Saying that type lists in interface types are only permitted in type constraints is definitely a wart, but I think that wart would be better than having two similar but slightly different ways of defining sum types.

Merovius commented 4 years ago

I tend to dislike the idea of having both type-lists for constraints and a separate mechanism for sum-types. That is, I agree that if we add sum-types, they should re-use the mechanism of type-lists for constraints. And if we don't like the semantics that gives us for sum-types, it might be worth it to reconsider the mechanics of type-lists for constraints as well? I know that this isn't an attractive idea, though, because the generics proposal has been discussed at length already.

ianlancetaylor commented 4 years ago

It's not too late to change the generics design draft, if anybody has any specific suggestions that are clearly better than what we are doing now.

(The current semantics of type lists were in fact chosen to make this proposal possible, but that doesn't mean that there aren't better semantics.)

jimmyfrasche commented 4 years ago

I don't think that there will be one rule that can satisfy both uses. Type lists could always use identical types when used as sum types but then there are two rules. If there's an explicit syntax to annotate which kind of matching to use for items in a type list, you could satisfy both uses. Individual interfaces might not always make sense as both a sum type and a constraint but that's fine and will likely be true in practice regardless.

bcmills commented 4 years ago

@Merovius, type constraints are necessarily concerned with the space of allowed operations, whereas sum-types are concerned with the space of allowed values. The two are related but — especially given the existence of binary operations — I think they cannot be unified.

(Or, to put it another way: many of the use cases for sum types require a sum type to be a finite set, whereas many of the use cases for generics require the constraint to match an infinite set.)

rogpeppe commented 4 years ago

It's not too late to change the generics design draft, if anybody has any specific suggestions that are clearly better than what we are doing now.

If I were to suggest anything, it would be to avoid the "underlying type matching" semantics completely from type list interfaces. I think they're the source of a lot of the harder problems with the proposals, and I'm not convinced they really provide that much added value.

griesemer commented 4 years ago

@rogpeppe Interesting point. I will add that we could always add the "underlying type matching" rule later (at least for constraints) if it turned out to be important. But we couldn't take it away later.

On the other hand, consider the case of a generic min function: It would be a bit sad if we couldn't use it to compute the minimum of say two temperatures; e.g., defined as type Kelvin float32. More generally, anytime people defined a special type to express a unit, this would break down.
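The temperature example can be sketched in the Go 1.18 constraint syntax, where underlying-type matching is spelled with `~`. The `Float` constraint and the `Min` function are illustrative names; dropping the `~` from the constraint would make `Min(Kelvin(...), Kelvin(...))` fail to compile, which is the loss described above.

```go
package main

import "fmt"

// Kelvin is a unit type whose underlying type is float32.
type Kelvin float32

// Float admits float32, float64, and any defined type built on them.
type Float interface {
	~float32 | ~float64
}

// Min returns the smaller of a and b.
func Min[T Float](a, b T) T {
	if b < a {
		return b
	}
	return a
}

func main() {
	fmt.Println(Min(Kelvin(250), Kelvin(300))) // prints "250"
	fmt.Println(Min(2.5, 1.5))                 // prints "1.5"
}
```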

jimmyfrasche commented 4 years ago

If the underlying type matching rule is introduced after sum types that would mean they follow different rules. I can't imagine it would be backwards compatible to change the matching rules for sum types like it would be for generics.

You could introduce an annotation for types in a type list that should follow the "underlying type matching" rule later but then the constraints, slices, etc. packages could face backwards compatibility issues trying to integrate those.

deanveloper commented 4 years ago

type constraints are necessarily concerned with the space of allowed operations, whereas sum-types are concerned with the space of allowed values. The two are related but — especially given the existence of binary operations — I think they cannot be unified.

In #27605, the primary syntax proposal was to use operator (T + T) AdderFuncName to define operator functions. I'm not advocating for operator functions here, but I think this syntax (or a similar one) would also be good to define operators in interfaces. For instance:

type Ordered[T] interface {
    operator(T < T)
}

This could be a good way to define constraints on operators, and then the interface { type t1, t2, ... } or something similar could be used for sum types and use exact type matching.

The constraint in practice (with type parameter inference) could look something like:

func Min[T Ordered](slice []T) T { ... }

tooolbox commented 4 years ago

Regarding changing a type list being a breaking change: If the type list contains an unexported type, then the rule in @ianlancetaylor's proposal effectively requires that all type switches outside the package containing the sum type contain a default case.

Makes sense to me.

Clearly people do not like the type switch part of this proposal, so let's consider that to be withdrawn. It's not the important part.

Without this, I'm not sure what the benefit is of these sum types. The proposal then boils down to "you can use an interface with a type list outside of type constraints". That's fine and improves consistency within the language, but if we don't even attempt to give the developer tools to ensure he's handling the appropriate set of values, I don't know that we should call them sum types.

If there's an explicit syntax to annotate which kind of matching to use for items in a type list, you could satisfy both uses.

With all the above discussion highlighting the tradeoffs and problems, I think I've come around to like at least the spirit of this from @jimmyfrasche and his earlier suggestion. If type switches had a way to match on "all types with the underlying type of X" then we could guarantee the completeness of a switch even when a type list contains predeclared types. Something like this, which is basically an inversion of the syntax @jimmyfrasche proposed:

switch v.(type) {
case float32, float64: // exact matches
case string...: // matches the first type in the type list with an underlying type of string
case string: // exact match
}

Note that it's backwards-compatible, and could also perhaps be used for type assertions.

griesemer commented 4 years ago

@deanveloper The problem is not that we couldn't introduce operators in constraints, the problem is that operators alone don't address all problems. We would also need to invent notation for which conversions are permitted, and notation to express permissible constant values: how would we specify that one can assign a string or an integer constant to a value of type parameter type? How do we express that we need to be able to assign values of up to 1234 to a variable of type parameter type? Etc. Type lists elegantly solve this problem, which is why we eventually zoomed in on them.

griesemer commented 4 years ago

Looking from the opposite point of view, if we already had (somehow sensibly defined) sum types of sorts, how would they be different from interface types? Would they be sufficiently different from interfaces further constrained by type lists?

jimmyfrasche commented 4 years ago

@griesemer they could be like discriminated unions: structs that only allow one field to be set at a time. That's very different. In a vacuum I'd prefer that, but if type lists can be reused that wins out.

@tooolbox the difference between an interface with a type list and without is additional compile time safety and tools will be able to read the type list and tell you if you missed a case (you can do this now but you need some way outside Go to say to check the interface and what types are permissible).

tooolbox commented 4 years ago

@tooolbox the difference between an interface with a type list and without is additional compile time safety and tools will be able to read the type list and tell you if you missed a case (you can do this now but you need some way outside Go to say to check the interface and what types are permissible).

I think you misunderstood; I understand that and agree with that. It seems to me that read(ing) the type list and tell(ing) you if you missed a case would be done in a type switch, and @ianlancetaylor seemed to be withdrawing any effort to achieve safety/completeness in that case. I took that to mean that type lists would then only be enforced when assigning a value to an interface, which is fine, but it seems like any sum types worthy of the name could offer some kind of completeness SLA at the site of "pattern matching" a.k.a. disambiguation a.k.a. type switching.

Stated another way, I'm supporting your earlier suggestion for a syntax to differentiate between matching exact types and matching underlying types in type switches, type assertions, etc. I know my initial reaction was negative, but it seems like a good way to make these sum types useful, while keeping unified semantics between the two different uses of interfaces, and preserving the flexibility we get by allowing "underlying type matching" for type lists.

griesemer commented 4 years ago

@jimmyfrasche How much is a discriminated union different from an interface constrained with a type list? I suspect a straight-forward implementation would be the same for both (sum types would be represented like interfaces internally). We'd expect in both cases that we can do type switches and asserts. Maybe discriminated unions wouldn't have a nil value, and perhaps the underlying rule would be gone. Is there more?

(I am not saying that these two differences aren't crucial - perhaps they are - I'm just trying to understand if there's something else.)

ianlancetaylor commented 4 years ago

Note that even if the language does not check that all possible types appear as a type switch case, it would be straightforward for a static checker to do so.
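As a rough illustration of such a checker, here is a minimal sketch built on the standard `go/parser` and `go/ast` packages. Since type lists are not something the parser knows about, the permitted types are passed in as a plain slice of names. That slice, the function name `missingCases`, and the sample source are all assumptions made for the example. A real tool would resolve types properly rather than compare strings.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"sort"
)

// src is a sample input with a type switch that omits one permitted type.
const src = `package p

func f(v interface{}) {
	switch v.(type) {
	case int, string:
	case float64:
	}
}
`

// missingCases returns the names in allowed that are not mentioned by
// any case of the type switches in src. A switch with a default clause
// is treated as complete.
func missingCases(src string, allowed []string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var missing []string
	ast.Inspect(file, func(n ast.Node) bool {
		sw, ok := n.(*ast.TypeSwitchStmt)
		if !ok {
			return true
		}
		seen := map[string]bool{}
		hasDefault := false
		for _, stmt := range sw.Body.List {
			clause, ok := stmt.(*ast.CaseClause)
			if !ok {
				continue
			}
			if clause.List == nil {
				hasDefault = true // "default:" has an empty expression list
			}
			for _, expr := range clause.List {
				seen[types.ExprString(expr)] = true
			}
		}
		if !hasDefault {
			for _, name := range allowed {
				if !seen[name] {
					missing = append(missing, name)
				}
			}
		}
		return true
	})
	sort.Strings(missing)
	return missing, nil
}

func main() {
	got, err := missingCases(src, []string{"int", "string", "float64", "bool"})
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // [bool]
}
```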

jimmyfrasche commented 4 years ago

@griesemer I gave a thorough description in the other thread at https://github.com/golang/go/issues/19412#issuecomment-289588569 which references https://github.com/golang/go/issues/19412#issuecomment-289246888 and there is some good discussion surrounding those posts that github has decided to fold. I don't want to derail this thread with a counterproposal, but there are some significant differences between that and an interface-backed solution.

beoran commented 4 years ago

I like this proposal because it will solve several problems in Go in a very practical way.

Namely, it matches current best practice well: to make a sum type now, we define an interface and a limited list of types that implement that interface.
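For readers unfamiliar with that practice, a minimal sketch of the usual pattern follows. The `Shape`, `Circle`, and `Rect` names are invented for illustration. The unexported marker method keeps types outside the package from declaring the method themselves, but the compiler does not check that the type switch covers every implementation.

```go
package main

import (
	"fmt"
	"math"
)

// Shape is intended to be a closed set: the unexported method can only
// be declared by types in this package.
type Shape interface {
	isShape()
}

type Circle struct{ Radius float64 }
type Rect struct{ W, H float64 }

func (Circle) isShape() {}
func (Rect) isShape()   {}

// area handles each known implementation. The default case is still
// needed because the switch is not checked for exhaustiveness.
func area(s Shape) float64 {
	switch v := s.(type) {
	case Circle:
		return math.Pi * v.Radius * v.Radius
	case Rect:
		return v.W * v.H
	default:
		panic(fmt.Sprintf("unhandled Shape %T", s))
	}
}

func main() {
	fmt.Println(area(Rect{W: 2, H: 3})) // 6
}
```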

I have an example in my scripting language MUESLI (https://gitlab.com/beoran/muesli) that would benefit from this proposal.

Consider this function that converts a muesli FloatValue to built-in Go types:

func (from FloatValue) Convert(to interface{}) error {
    switch toPtr := to.(type) {
    case *string:
        *toPtr = from.String()
    case *int8:
        *toPtr = int8(from)
    case *int16:
        *toPtr = int16(from)
    case *int32:
        *toPtr = int32(from)
    case *int64:
        *toPtr = int64(from)
    case *int:
        *toPtr = int(from)
    case *bool:
        *toPtr = (from != 0)
    case *float32:
        *toPtr = float32(from)
    case *float64:
        *toPtr = float64(from)
    case *FloatValue:
        *toPtr = from
    case *Value:
        *toPtr = from
    default:
        return NewErrorValuef("Cannot convert FloatValue value %v to %v", from, to)
    }
    return nil
}

I could get rid of interface{} here and change this to interface { type string, int8, int16, int32, int64, int, bool, float32, float64, FloatValue, Value }, to make it more clear to the caller that only certain types are allowed. EDIT: I assume the compiler will also type check the passed to variable then and error if it isn't one of the mentioned types? The great thing about this proposal is that it allows me to get rid of many interface{}, which are a constant source of problems, much like the void * pointer in C.
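As a point of comparison, something close to this can be sketched with the generics that shipped in Go 1.18, where a union of types may appear in a constraint (though not as the type of an ordinary variable, which is what this proposal would add). The sketch below is not the MUESLI code: `FloatValue` is a stand-in, `Target` is a hypothetical constraint, and the type list is shortened and uses pointer types to match the switch cases.

```go
package main

import (
	"fmt"
	"strconv"
)

// FloatValue is a stand-in for the MUESLI type of the same name.
type FloatValue float64

// Target is a hypothetical constraint listing the only pointer types
// that Convert accepts.
type Target interface {
	*string | *int | *int64 | *bool | *float64
}

// Convert stores from in the variable that to points at. Passing a
// pointer type outside Target is rejected at compile time.
func Convert[T Target](from FloatValue, to T) {
	switch p := any(to).(type) {
	case *string:
		*p = strconv.FormatFloat(float64(from), 'g', -1, 64)
	case *int:
		*p = int(from)
	case *int64:
		*p = int64(from)
	case *bool:
		*p = from != 0
	case *float64:
		*p = float64(from)
	}
}

func main() {
	var n int
	var s string
	Convert(FloatValue(2.5), &n)
	Convert(FloatValue(2.5), &s)
	fmt.Println(n, s) // 2 2.5
	// Convert(FloatValue(1), new(uint8)) // would not compile: *uint8 is not in Target
}
```

The caller-side check happens at compile time, but the switch inside still goes through `any` and is not verified to be exhaustive.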

It would be even better if there was an option to make the type switch above exhaustive, so the default case is not needed. Maybe we could take advantage of the range keyword as new syntax for exhaustive type switches, like this:

switch range toPtr := to.(type) {
// EDIT: or maybe like this:
switch toPtr := range to.(type) {

In which case the type switch must mention all types mentioned in the interface, as well as nil, but a default case is not needed.

urandom commented 4 years ago

@griesemer

wouldn't the memory layout of a sum type be totally different from an interface? Presumably, it would be like a C union, where the largest member defines the total memory of the type, plus the tag indicating which type is actually in the sum. Of course, that would not be a straight-forward implementation. In terms of usage however, it's probably not that different.

rogpeppe commented 4 years ago

Presumably, it would be like a C union, where the largest member defines the total memory of the type, plus the tag indicating which type is actually in the sum.

I suspect you couldn't quite do that because then the GC would need to know about the tags, which would slow it down. You'd need to align the components so that pointers were in the same place. That still might be worth doing (consider a type list that mentions only types without pointers), and certainly the spec would want to leave the possibility open, even if the implementation was only "straight-forward" initially.
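To make the pointer-free case concrete, here is a small hand-written sketch of a tagged union over `int64` and `float64` that shares one 8-byte slot. The type and function names are invented for the example, and this only shows what such a layout could look like when written by hand, not how a compiler would implement sum types.

```go
package main

import (
	"fmt"
	"math"
)

type kind uint8

const (
	kindInt kind = iota
	kindFloat
)

// IntOrFloat holds either an int64 or a float64. Both payloads share
// the same 8 bytes, and tag records which one is stored. There are no
// pointers, so the garbage collector never needs to look at the tag.
type IntOrFloat struct {
	tag  kind
	bits uint64
}

func FromInt(i int64) IntOrFloat {
	return IntOrFloat{tag: kindInt, bits: uint64(i)}
}

func FromFloat(f float64) IntOrFloat {
	return IntOrFloat{tag: kindFloat, bits: math.Float64bits(f)}
}

// Int returns the stored int64 and true, or 0 and false if a float64
// is stored.
func (u IntOrFloat) Int() (int64, bool) {
	if u.tag != kindInt {
		return 0, false
	}
	return int64(u.bits), true
}

// Float returns the stored float64 and true, or 0 and false if an
// int64 is stored.
func (u IntOrFloat) Float() (float64, bool) {
	if u.tag != kindFloat {
		return 0, false
	}
	return math.Float64frombits(u.bits), true
}

func main() {
	u := FromFloat(1.5)
	f, ok := u.Float()
	fmt.Println(f, ok) // 1.5 true
	_, ok = u.Int()
	fmt.Println(ok) // false
}
```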

rogpeppe commented 4 years ago

@griesemer

On the other hand, consider the case of a generic min function: It would be a bit sad if we couldn't use it to compute the minimum of say two temperatures; e.g., defined as type Kelvin float32. More generally, anytime people defined a special type to express a unit, this would break down.

That's true to an extent, but it wouldn't be hard to provide an interface that lets such a special type opt in to ordering. For example: https://go2goplay.golang.org/p/axg6RrzLzbb

// Under can be implemented by a type to return itself
// as its underlying type.
type Under[T any] interface {
    Underlying() T
}

func MinU[T Under[U], U constraints.Ordered](a, b T) T {
    if a.Underlying() < b.Underlying() {
        return a
    }
    return b
}

We only have this problem with named types, and such a method can be added to a named type in a backward-compatible way. To me a workaround like this seems preferable to making the entire generics proposal significantly more complex.

Of course, this could also be used to allow the MinU function to work on types that would otherwise not be comparable.

A downside of this approach is that it doesn't work for arithmetic operations such as addition. It's possible to work around that too though: https://go2goplay.golang.org/p/FXJ_ZfgAE3Z.

Another issue is that type inference doesn't work, but I suspect that could be worked around by allowing slightly more sophisticated type inference rules.
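For anyone who wants to try the opt-in idea with released Go (1.18 or later), here is a self-contained sketch. It is an adaptation, not the playground code: `Ordered` is defined locally and deliberately lists exact types without `~`, so a defined type such as `Kelvin` can only take part through its `Underlying` method. The type arguments are written out explicitly at the call site.

```go
package main

import "fmt"

// Ordered lists exact types only. A defined type like Kelvin does not
// satisfy it directly. (This local definition is an assumption for the
// sketch, not a standard library constraint.)
type Ordered interface {
	int | int64 | float32 | float64 | string
}

// Under can be implemented by a type to return itself as its
// underlying type.
type Under[U any] interface {
	Underlying() U
}

// MinU compares two values through their Underlying methods.
func MinU[T Under[U], U Ordered](a, b T) T {
	if a.Underlying() < b.Underlying() {
		return a
	}
	return b
}

// Kelvin opts in to ordering by providing Underlying.
type Kelvin float32

func (k Kelvin) Underlying() float32 { return float32(k) }

func main() {
	// Type arguments are given explicitly here.
	fmt.Println(MinU[Kelvin, float32](300, 250)) // 250
}
```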

It could also be argued that it's also cleaner for non-builtin types to have to opt into arithmetic operations (does it make sense to use Min on an arbitrary enum-like type?) rather than automatically satisfying all the built-in operators.

bcmills commented 4 years ago

It could also be argued that it's also cleaner for non-builtin types to have to opt into arithmetic operations ….

Ooh, that's an excellent point! (Compare #30209, in which I would like to remove unchecked arithmetic operations from integer types.)

(I suppose you could also opt out of arithmetic operations by wrapping the type in a struct type, but if you do that then the type can no longer be initialized from constants, and can no longer be used for constants.)
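A small sketch of that trade-off, using invented names (`Celsius`, `Opaque`): the defined numeric type keeps operators and constants, while the struct wrapper loses both and has to expose methods instead.

```go
package main

import "fmt"

// Celsius is a defined numeric type: operators and constants work.
type Celsius float64

const boiling Celsius = 100

// Opaque wraps the value in a struct, which opts out of the built-in
// operators. It also cannot be used as a constant type.
type Opaque struct{ v float64 }

// const limit Opaque = 100 // would not compile: struct types cannot be constants

func NewOpaque(v float64) Opaque      { return Opaque{v: v} }
func (o Opaque) Add(p Opaque) Opaque  { return Opaque{v: o.v + p.v} }
func (o Opaque) Value() float64       { return o.v }

func main() {
	fmt.Println(boiling + 1) // 101

	a, b := NewOpaque(1), NewOpaque(2)
	// a + b would not compile: operator + is not defined on struct types.
	fmt.Println(a.Add(b).Value()) // 3
}
```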

proposal: Go 2: sum types using interface type lists #41716