proposal: spec: sum types based on general interfaces

ianlancetaylor commented 1 year ago

This is a speculative issue based on the way that type parameter constraints are implemented. This is a discussion of a possible future language change, not one that will be adopted in the near future. This is a version of #41716 updated for the final implementation of generics in Go.

We currently permit type parameter constraints to embed a union of types (see https://go.dev/ref/spec#Interface_types). We propose that we permit an ordinary interface type to embed a union of terms, where each term is itself a type. (This proposal does not permit the underlying type syntax ~T to be used in an ordinary interface type, though of course that syntax is still valid for a type parameter constraint.)

That's really the entire proposal.

Embedding a union in an interface affects the interface's type set. As always, a variable of interface type may store a value of any type that is in its type set, or, equivalently, a value of any type in its type set implements the interface type. Inversely, a variable of interface type may not store a value of any type that is not in its type set. Embedding a union means that the interface is something akin to a sum type that permits values of any type listed in the union.

For example:

type MyInt int
type MyOtherInt int
type MyFloat float64
type I1 interface {
    MyInt | MyFloat
}
type I2 interface {
    int | float64
}

The types MyInt and MyFloat implement I1. The type MyOtherInt does not implement I1. None of MyInt, MyFloat, or MyOtherInt implement I2.
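For comparison, the same type sets can already be exercised in current Go when I1 and I2 are used as type parameter constraints. This is a minimal sketch of that existing behavior, not of the proposal itself; AsFloat and Double are hypothetical helpers introduced only for illustration.

```go
package main

import "fmt"

type MyInt int
type MyOtherInt int
type MyFloat float64

// I1 and I2 are copied from the example above. In current Go they may only
// be used as constraints; the proposal would also allow "var v I1".
type I1 interface {
	MyInt | MyFloat
}

type I2 interface {
	int | float64
}

// AsFloat (hypothetical) accepts exactly the types in I1's type set.
// The conversion is valid because every type in the set converts to float64.
func AsFloat[T I1](v T) float64 {
	return float64(v)
}

// Double (hypothetical) accepts only int and float64, since I2 has no ~ terms.
func Double[T I2](v T) T {
	return v + v
}

func main() {
	fmt.Println(AsFloat(MyInt(1)))
	fmt.Println(AsFloat(MyFloat(2.5)))
	fmt.Println(Double(21))

	// The following lines do not compile in current Go:
	// AsFloat(MyOtherInt(1)) // MyOtherInt is not in I1's type set
	// Double(MyInt(1))       // MyInt is not in I2's type set
	// var v I1               // union interfaces are constraint-only today
}
```

Under the proposal, the last commented line would become legal, and v could hold a MyInt, a MyFloat, or nil.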

In all other ways an interface type with an embedded union would act exactly like an interface type. There would be no support for using operators with values of the interface type, even though that is permitted for type parameters when using such a type as a type parameter constraint. This is because in a generic function we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type set, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)
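The difference between the two situations can be sketched in current Go. Add uses a type parameter, so both operands are known to share one type. AddDynamic is a hypothetical stand-in for what + on two interface values would have to do: inspect both dynamic types and fail when they differ. It uses any because union interfaces cannot yet be used as ordinary types.

```go
package main

import (
	"errors"
	"fmt"
)

// Add compiles because a and b are both of the single type T.
func Add[T int | float64](a, b T) T {
	return a + b
}

var errMismatch = errors.New("operands have different dynamic types")

// AddDynamic (hypothetical) shows the extra work needed when only the
// dynamic types are known: each operand may hold a different type.
func AddDynamic(a, b any) (any, error) {
	switch x := a.(type) {
	case int:
		if y, ok := b.(int); ok {
			return x + y, nil
		}
	case float64:
		if y, ok := b.(float64); ok {
			return x + y, nil
		}
	}
	return nil, errMismatch
}

func main() {
	fmt.Println(Add(1, 2))
	fmt.Println(Add(1.5, 2.0))
	fmt.Println(AddDynamic(1, 2))
	fmt.Println(AddDynamic(1, 2.0)) // int and float64: reported as a mismatch
}
```

Returning an error here plays the role of the panic mentioned in the parenthetical above; either way, the mismatch is only detectable at run time.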

In particular, the zero value of an interface type with an embedded union would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.
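The extra nil option can be seen today with the common emulation of a closed set of variants via an unexported marker method. This is only an approximation under that assumption, not the proposed feature; Shape, Circle, Square, and Name are hypothetical names.

```go
package main

import "fmt"

// Shape approximates a closed sum: only types in this package can
// implement the unexported method.
type Shape interface{ isShape() }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (Circle) isShape() {}
func (Square) isShape() {}

// Name must handle nil in addition to the listed variants, because the
// zero value of any interface type is nil.
func Name(s Shape) string {
	switch s.(type) {
	case Circle:
		return "circle"
	case Square:
		return "square"
	case nil:
		return "nil"
	}
	return "unknown"
}

func main() {
	var s Shape // zero value: holds no variant at all
	fmt.Println(Name(s))
	fmt.Println(Name(Circle{R: 1}))
}
```

An interface with an embedded union would behave the same way in this respect: a switch over its variants would still need to consider nil.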

As an implementation note, we could in some cases use a different implementation for interfaces with an embedded union type. We could use a small code, typically a single byte, to indicate the type stored in the interface, with a zero indicating nil. We could store the values directly, rather than boxed. For example, I1 above could be stored as the equivalent of struct { code byte; value [8]byte } with the value field holding either an int or a float64 depending on the value of code. The advantage of this would be reducing memory allocations. It would only be possible when all the values stored do not include any pointers, or at least when all the pointers are in the same location relative to the start of the value. None of this would affect anything at the language level, though it might have some consequences for the reflect package.
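The tagged layout described in that note can be imitated by hand in current Go. This is a rough sketch, assuming a 64-bit int and an arbitrary code assignment (0 for nil, 1 for MyInt, 2 for MyFloat); pack and unpack are hypothetical helpers, and a compiler would do this internally rather than through any.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

type MyInt int
type MyFloat float64

// packed mirrors struct { code byte; value [8]byte } from the note above.
type packed struct {
	code  byte // assumed codes: 0 = nil, 1 = MyInt, 2 = MyFloat
	value [8]byte
}

// pack stores the value's bits inline instead of boxing it.
func pack(v any) (packed, bool) {
	var p packed
	switch x := v.(type) {
	case nil:
		// code stays 0
	case MyInt:
		p.code = 1
		binary.LittleEndian.PutUint64(p.value[:], uint64(x))
	case MyFloat:
		p.code = 2
		binary.LittleEndian.PutUint64(p.value[:], math.Float64bits(float64(x)))
	default:
		return p, false // not in the type set
	}
	return p, true
}

// unpack reverses pack, using code to decide how to read the bytes.
func unpack(p packed) any {
	switch p.code {
	case 1:
		return MyInt(binary.LittleEndian.Uint64(p.value[:]))
	case 2:
		return MyFloat(math.Float64frombits(binary.LittleEndian.Uint64(p.value[:])))
	}
	return nil
}

func main() {
	p, _ := pack(MyFloat(2.5))
	fmt.Println(p.code, unpack(p))
}
```

Neither variant contains a pointer, which is the condition the note gives for this representation to be possible.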

As I said above, this is a speculative issue, opened here because it is an obvious extension of the generics implementation. In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412 or newer variants such as #54685. Thanks.

Merovius commented 7 months ago

And FWIW

Respecting the existing nature of the Go language, I am arguing that since it is impossible to find a perfect solution, maybe instead we could be pragmatic and accept a really good one?

I assume that the pragmatic solution we will eventually adopt (if any) is to use union-element interfaces as variants and make their zero value nil. To be clear, that's not my favorite solution, just the one that seems most pragmatic, given where we are. But it requires accepting the bad that comes with it and I don't resent the fact that we don't do that lightly.

ianlancetaylor commented 7 months ago

@mrwonko Thanks. In these kinds of discussions, it is always possible to find a solution for any given problem. But it is also necessary to step back and consider the overall picture. Go is intended to be a reasonably simple, reasonably orthogonal language. When we add special cases we weaken those properties.

This proposal is, I think, a somewhat simple, reasonably orthogonal, change that we could make. The question here is not how to complicate it to make it better. We're almost certainly not going to do that. Rather than make it more complicated, we will choose to make no change at all. The question here is whether to make this change at all--that is, whether the benefits of the change are worth adding more complexity to the language. Or perhaps we can find a way to make it more simple and more orthogonal.

ngortheone commented 7 months ago

Odin lang is in many ways inspired by golang.

Odin has an enum type that expresses the sum type idea. To instantiate an enum variable, one has to spell out the concrete type:

Foo :: enum {
    A,
    B,
    C,
    D,
}

f := Foo.A

https://odin-lang.org/docs/overview/#partial-switch

It is true that if golang tries to implement sum types by extending/overloading interface, complications are guaranteed. But what about creating a separate keyword, enum? This helps to sidestep a lot of the complications that come from interface:

type Foo enum {
    A string
    B int
}

f := Foo.A // OK
b := Foo   // Compile failure, unspecified concrete type

Merovius commented 7 months ago

@ngortheone Note that this issue is specifically about using union-elements in interfaces as variants. There are other issues (my personal favorite is #54685) to discuss other ideas and #19412 as an umbrella issue for the general idea of variant types.

I'll note that the vague notion of adding a new syntactical construct and type kind has been suggested a lot of times so far, so your suggestion isn't really novel.

ngortheone commented 7 months ago

so your suggestion isn't really novel

That probably means that the solution space to the sum type problem is small and the search has already exhausted all good options. The main question now is:

Knowing all the pros and cons of each solution, will golang decide to go for any solution at all?

mikeschinkel commented 7 months ago

I feel like that should have made clear that this was just one example.

Logically-speaking, why? I addressed the one example, and was looking forward to considering others.

As far as I can tell, to name a few others, you have not yet talked about channel-receives, map-accesses, reflect.New, extra capacity allocated by append, the statement var x T when T is a type-parameter (and any other statically disallowed code for these specific types), named returns (in particular in the presence of panic) or clear on a slice.

I would tackle each of these, but from the tone of your arguments I don't feel like continuing what is evidently a contentious debate with you here on this issue.

I'll also note that the suggestion to disallow uninitialized values came up in this discussion before and most of this list has been posted there as well.

I had searched for "disallow" and "initialized" on this page prior to my posting and they appeared nowhere.

I searched again just now, but this time I opened up all the posts marked "off-topic" and found it was you telling @atdiar it wasn't possible, that culminated in his frustrated (your word) post before Robert Griesemer called for respectful discussion.

However, nowhere in that dialog did anyone other than you — i.e. no one from the Go team — argue against the idea.

So my takeaway is that you are asserting that if you already expressed an opinion against something that no one else should be able to discuss it? Just wanting to make sure I understand correctly.

And while I appreciate that it is frustrating to be told that something you see as an easy solution is unworkable,

No, it is absolutely not frustrating to be told that what I presented as a strawman proposal is unworkable when objective and specific arguments against it are given. That is entirely the point of such a proposal: to flesh out its feasibility.

"I'd also ask for a little bit of trust... We wouldn't say that, if we saw a realistic way to make it work. "

What is frustrating instead is to be told, effectively, "We have already considered every conceivable option and so you should just trust me that you have no value to offer here."

when people like Ian or I say things like "Zero values are built into the language too deeply", it's not just an off-the-cuff remark.

As George Bernard Shaw said, "The single biggest problem in communication is the illusion that it has taken place." You assume the statement "Zero values are built into the language too deeply" is interpreted exactly as you understand it, in a binary form, without recognizing that others don't interpret that statement the same way.

From my perspective my proposal absolutely respected that statement; why else would I have included the concept of zero to be applied to sum types if I was disrespecting Ian's comment? I was explicitly trying to address how useful sum types and zero values could coexist.

BTW, I really like how Ian engages in discussions on this forum. He always replies in a respectful manner, makes a statement when he needs to, but evidently doesn't feel the need to debate everyone who has a proposal, even if it is not one they will pursue. The Go team then ultimately makes their decisions and we all move forward. His approach makes everyone feel as if they can contribute, but he is tactful when a discussion gets out of control and reins it in with a statement of intent. It would be a lot nicer in these forums if everyone were able to participate without any self-appointed gatekeepers.

Merovius commented 7 months ago

@mikeschinkel FWIW there is also #19412, which is more general, so contains more discussions of broader proposals than this one. Searching that casually brings up more discussion about these specific problems, involving a lot more people than me, including people on the Go team.

So, apologies for writing "this discussion". It was imprecise. The general discourse about variants has been going on for a while and I don't always remember where all parts of it happened.

thepudds commented 7 months ago

Just to briefly underscore one point @Merovius has made a few times, I thought it might be helpful to re-post this snippet of Ian's original proposal text (from top comment above):

In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412 or newer variants such as #54685. Thanks.

(And of course, given how bad GitHub issues are for long conversations, it's worth keeping in mind the benefits of scaling with many conversations in places outside of GitHub issues, such as the #generics channel of Gopher Slack, which has friendly & thoughtful discussions, or elsewhere like golang-nuts, r/golang on Reddit, or by sharing a Gist you wrote in one of those places, etc.).

mikeschinkel commented 7 months ago

@thepudds — Given your comment it is worth noting that while some people may see discussion as being a different proposal, others making suggestions see it as addressing ways to make the original proposal viable.

Also, given the concept of scaling with many conversations, it would be respectful of, and incumbent on, those who have the time to seek out and follow many different discussions in many different places that not everyone is fully aware of, not to tamp down comments by others without at least first linking to the specific points from those other discussions, and especially before admonishing people for discussing things "that have already been addressed and resolved," but elsewhere. #fwiw

perj commented 6 months ago

I think it would be very helpful if the resolved concrete types were also possible to list using the reflect package. If they are, it would be possible to add functionality to json.Unmarshal to write to these interfaces.

That is, this example would work, if json.Unmarshal would get the types [int, string] from the passed pointer and try to decode each in turn.

func main() {
    var v interface{ string | int }
    err := json.Unmarshal([]byte(`42`), &v)
    fmt.Println(v, err)
}

would print 42, and v would have the underlying type int. For var v any the underlying type would be float64.

I don't think the json package functionality would have to be part of this exact proposal. There would have to be decisions made about error handling, for example. But I would expect some support in the reflect package. Presumably reflect.Type.Implements will also need this list, regardless.
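The "try to decode each in turn" behavior can be approximated in current Go with a fixed, hand-written list of candidate types. This is only a sketch of the idea, with the candidates hard-coded as int then string; unmarshalIntOrString is a hypothetical helper, and a real implementation would obtain the list from reflect rather than spell it out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// unmarshalIntOrString (hypothetical) tries each candidate type in order
// and returns the first value that decodes without error.
func unmarshalIntOrString(data []byte) (any, error) {
	var i int
	if err := json.Unmarshal(data, &i); err == nil {
		return i, nil
	}
	var s string
	if err := json.Unmarshal(data, &s); err == nil {
		return s, nil
	}
	return nil, errors.New("value is neither int nor string")
}

func main() {
	v, err := unmarshalIntOrString([]byte(`42`))
	fmt.Printf("%T %v %v\n", v, v, err)

	v, err = unmarshalIntOrString([]byte(`"forty-two"`))
	fmt.Printf("%T %v %v\n", v, v, err)

	v, err = unmarshalIntOrString([]byte(`true`))
	fmt.Printf("%T %v %v\n", v, v, err)
}
```

The sketch also hints at the error-handling decisions mentioned above: the order of candidates matters, and inputs such as JSON null, which encoding/json accepts for an int target without error, would need a deliberate rule.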

// SumTypes returns the concrete types implementing t.
// It panics if t is not an interface type.
// It returns nil if t does not have any type constraints set.
// The returned types are sorted in lexicographic order, including package path.
func SumTypes(t Type) []Type

With that documentation, I suppose methods would also be checked, otherwise Implements(t) might still return false on some of the returned types.

mateusz834 commented 3 months ago

Also see https://github.com/golang/go/issues/68710#issuecomment-2265850918

gonzojive commented 2 weeks ago

In particular, the zero value of an interface type with an embedded union would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.

Is there any active proposal about nil safety? If so, how might it interact with the concern raised above?

atdiar commented 2 weeks ago

@gonzojive not yet to my knowledge.

That would require thinking about the usual type constructors in terms of whether or not they accept zeroable arguments. nil being the zero of interface types, it would become a compile-time error to create slices of such union/sum types, for instance.

Easily resolved with a notation such as A | nil.

In fact, everything can be done. The question is whether that would be a good use of the complexity budget.

Even type assertions could be handled properly. In

w, ok := v.(A)

where A would not have untyped nil in its typeset meaning nil wouldn't be assignable, we could still decide that such interface zero value is nil. Just assignment of nil would not be possible.

So w, ok = v.(A) might be forbidden unless w is a newly unassigned variable declared as var w A.

Then it's about what to do in the branch if !ok {...}. w is zeroed there and thus cannot be passed as an argument where an A is expected until after it has been assigned a value (which we know has to be non-nil). This requires some form of typestate analysis.
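For reference, this is how the comma-ok assertion behaves in current Go, where w is simply the zero value (nil) when the assertion fails. The sketch assumes A is an ordinary method interface; A, impl, and assertA are hypothetical names.

```go
package main

import "fmt"

// A is an ordinary interface standing in for the A of the comment above.
type A interface{ M() }

type impl struct{}

func (impl) M() {}

// assertA wraps the comma-ok assertion so both results can be inspected.
func assertA(v any) (A, bool) {
	w, ok := v.(A)
	return w, ok
}

func main() {
	w, ok := assertA(impl{})
	fmt.Println(w != nil, ok) // assertion succeeded: w holds impl{}

	w, ok = assertA(42)
	fmt.Println(w == nil, ok) // assertion failed: w is the zero value, nil
}
```

Today nothing stops the nil w from being passed on in the !ok branch; preventing that is what the typestate analysis mentioned above would have to do.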

Overall nil is not an issue. The issue is nil in interface values. Because interfaces don't differentiate between value and pointer types when we call a method and because checking an interface value for the nilness of its content is not very practical yet.
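The situation described here, the one covered by the FAQ entry on nil comparison, can be reproduced in a few lines. isNilContent is a hypothetical helper showing the reflect-based check that is currently needed to look at the content of an interface value.

```go
package main

import (
	"fmt"
	"reflect"
)

type myErr struct{}

// Error has a pointer receiver and does not dereference it, so it can be
// called on a nil *myErr.
func (*myErr) Error() string { return "myErr" }

// mayFail returns a nil *myErr wrapped in an error interface value.
func mayFail() error {
	var p *myErr
	return p
}

// isNilContent (hypothetical) reports whether v is a nil interface or
// holds a nil value of a nilable kind.
func isNilContent(v any) bool {
	if v == nil {
		return true
	}
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Chan, reflect.Func, reflect.Interface, reflect.Map,
		reflect.Ptr, reflect.Slice, reflect.UnsafePointer:
		return rv.IsNil()
	}
	return false
}

func main() {
	err := mayFail()
	fmt.Println(err == nil)        // the interface value itself is not nil
	fmt.Println(isNilContent(err)) // but the pointer stored in it is
}
```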

Hindsight is always 20/20, unfortunately. I wish == applied to interface content, including when it's a nil comparison, and that we had another way to check for an interface being empty (whether ok := a.(nil) or the shorter a! as in if err! {...}).

That would get rid of the faq entry on nil comparison, perhaps shorten error handling a tiny bit.

But anyway, this is all related. That's why I still like this proposal because it seems that it's just a piece of the overall puzzle.

golang / go

proposal: spec: sum types based on general interfaces #57644