golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
120.42k stars 17.28k forks source link

proposal: spec: generics: type switch on parametric types #45380

Open rogpeppe opened 3 years ago

rogpeppe commented 3 years ago

In the discussion for https://github.com/golang/go/issues/45346, this comment mentioned the possibility of adding a type switch on parametric types. Given that it's threatening to derail the discussion there, I'm creating this proposal to talk about it.

Proposal: type switching on type parameters

I propose that a new statement be added to the language which allows a function with type parameters to further constrain its type parameters:

switch type T {
case A1:
case A2, A3:
   ...
}

The switched type (T above) must be the name of a type parameter. The type in a case arm can be any type, including constraint types.

The first case is chosen that has a constraint that matches the type parameter.

Within a chosen arm of the case statement, the type parameter that's being switched on becomes further restricted by the type selected, just as if it had originally been constrained by that type too. Precisely, given a switched type T constrained by C, and a type in the switch arm A, within the code of the switch arm, T takes on the type:

    interface{
        C
        A
    }

If there are multiple cases mentioned in the arm (A1, A2, ... An), then T is constrained by using a union element:

    interface{
        C
        A1 | A2 | ... | An
    }

Example

type Stringish interface {
    string | fmt.Stringer
}

func Concat[S Stringish](x []S) string {
    switch type S {
    case string:
        // S is constrained by interface {Stringish; string} which is the same as
        // interface{string} which is the same as string, so we can use x
        // as a normal []string slice.
        return strings.Join(x, "")
    case fmt.Stringer:
        // S is constrained by interface {Stringish; Stringer}
        // which is the same as `Stringer`, so we can call 
        // its `String` method but we can't use x directly as
        // []fmt.Stringer because it might have any layout
        // or size.
        var buf strings.Builder
        for _, s := range x {
             buf.WriteString(s.String())
        }
        return buf.String()
    }
 }

The above assumes that the constraint:

interface {
     X | Y | Z
     X
}

would allow all operations allowed on X, but I believe that's true under #45346 proposal.

Note: I don't think there's any need to allow ~ to be used directly in these type switch cases - case interface {~T}: is sufficient if necessary.

rogpeppe commented 3 years ago

Replying to @candlerb https://github.com/golang/go/issues/45346#issuecomment-813000803

func Sort[T Lessable[T]](list []T) {
  var less func(a, b T) bool
  switch type T {
  case Ordered:
      less = func(a, b T) bool { return a < b }
  case Lesser:
      less = func(a, b T) bool { return a.Less(b) }
  }

  // sort using less
}

I can't see how the two branches of the "switch" can be compiled in the same function instance. For a given T, only one branch or the other is semantically valid.

Isn't this very similar to what happens with a switch x := x.(type) statement currently?

Let's assume for the sake of argument that two versions of Sort with different type parameters can be compiled into the same code (it might do this when both types have the same GC layout).

Then the type switch statement could look at the metadata for the type and "unlock" other operations on the type, just as a type switch does for interface values in current Go - there's no particular reason that Sort would need to be split into two versions just because of the type switch. Splitting is of course still a valid implementation strategy and would result in more efficient code, at the usual cost of potential code bloat.

Merovius commented 3 years ago

Note: I don't think there's any need to allow ~ to be used directly in these type switch cases - case interface {~T}: is sufficient if necessary.

I assume you would apply the same to union-elements?

FWIW, while it doesn't add expressive power, I'd prefer if we'd allow both approximation- and union-elements directly, if we can make it work without syntactical ambiguities. It's more convenient and IMO still clear. And I would prefer if the cases can more directly reflect what's in the constraint, that is, I think

type Constraint interface {
    ~int | ~int8 | ~string
}

func ThisSyntax[T Constraint]() {
    switch type T {
    case ~int | ~int8:
        // …
    case ~string:
        // …
    }
}

func IsClearerThanThisSyntax[T Constraint]() {
    switch type T {
    case interface{~int | ~int8 }:
        // …
    case interface{ ~string }:
        // …
    }
}

But it's a weakly held opinion.

I think this would be good, but one concern I have is that I think it would have unfortunate overlap with simply allowing approximation-elements in type switches and assertions (let's call it "approximate type switch"). I think the form of type switch you propose here (let's call it "parameter type switch") is clearly better for type parameters. But the approximate type switch is clearly better for "sum-types", as you'd need to unpack the value at some point - so you'd want to type-switch on a value, not a type.

So, IMO

  1. It is slightly unfortunate to have both
  2. If we never add "sum-types", the parameter type switch would be better
  3. If we do add "sum-types" and only want one, the approximate type switch would be better

Of course, we can always eat the cost of "slightly unfortunate" and add both, when the time comes.

Merovius commented 3 years ago

Isn't this very similar to what happens with a switch x := x.(type) statement currently?

One significant difference is that switch x := x.(type) visibly declares a new variable (shadowing the old one), scoped to the switch case block.

Arguably, the parameter type switch could "declare a new type" (shadowing the old one), scoped to the switch case block. But it's still arguably more implicit and "magic".

rogpeppe commented 3 years ago

I think this would be good, but one concern I have is that I think it would have unfortunate overlap with simply allowing approximation-elements in type switches and assertions (let's call it "approximate type switch"). I think the form of type switch you propose here (let's call it "parameter type switch") is clearly better for type parameters. But the approximate type switch is clearly better for "sum-types", as you'd need to unpack the value at some point - so you'd want to type-switch on a value, not atype.

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal. In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value, but that's not the case here, where once we know about the type, we know the type of all values that use that type.

That's why I chose to restrict this proposal to allow only exactly type parameter names to be specified in the switch, because then there's no ambiguity - we're further constraining the already-known type and thus we know that all arguments, variables and return parameters that use that type are constrained likewise.

rogpeppe commented 3 years ago

To try to be clearer, this proposed statement is a "switch on a type" rather than a "switch on the type of a value". That's why it's very different from the current type switch statement, and also why it's specifically useful in the context of parametric types.

Merovius commented 3 years ago

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal.

I think they are different, but I don't think they are orthogonal/independent. In particular, the "approximate type switch" would in terms of what you can express subsume the "parameter type switch" (or at least most of it). That is, you can write code doing the same, with the same static guarantees, using either - even if less convenient.

In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value

That is true. But this proposal doesn't talk about regular interface values, it talks about type parameters. So, if you are in a situation where you can use the "parameter type switch", you do know that two values have the same type. That is, you could write

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return math.Max(a, b.(~float64))
    default:
        if a > b {
            return a
        }
        return b
    }
}

and this is statically known to be correct - even if inconvenient, by requiring an extra type-assertion, you know this type-assertion can't fail.

That's why I chose to restrict this proposal to allow only exactly type parameter names to be specified in the switch, because then there's no ambiguity - we're further constraining the already-known type and thus we know that all arguments, variables and return parameters that use that type are constrained likewise.

I don't see how case ~int supposedly adds ambiguity, where interface{ ~int } doesn't. One is simply syntactic sugar for the other. Sorry, I think I misunderstood what you where saying, disregard this.

To try to be clearer, this proposed statement is a "switch on a type" rather than a "switch on the type of a value". That's why it's very different from the current type switch statement, and also why it's specifically useful in the context of parametric types.

Yupp :) That's what I meant when I said this is clearly better for type parameters :)

rogpeppe commented 3 years ago

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal.

I think they are different, but I don't think they are orthogonal/independent. In particular, the "approximate type switch" would in terms of what you can express subsume the "parameter type switch" (or at least most of it). That is, you can write code doing the same, with the same static guarantees, using either - even if less convenient.

In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value

That is true. But this proposal doesn't talk about regular interface values, it talks about type parameters. So, if you are in a situation where you can use the "parameter type switch", you do know that two values have the same type. That is, you could write

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return math.Max(a, b.(~float64))
    default:
        if a > b {
            return a
        }
        return b
    }
}

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion, and neither could the float64 returned by math.Max be assigned to the return parameter.

You'd need something like this instead:

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return interface{}(math.Max(float64(a), float64(interface{}(b).(~float64)))).(T)
    default:
        if a > b {
            return a
        }
        return b
    }
}

That would become less verbose if value type assertions could be done directly on non-interface values, but still not great.

and this is statically known to be correct - even if inconvenient, by requiring an extra type-assertion, you know this type-assertion can't fail.

Although as a programmer, you might know that those dynamic type assertions are OK, they seem problematic to me - they're verbose and easy to get wrong (with a nasty panic if you do).

Merovius commented 3 years ago

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion

Who knows, we are talking about a speculative design with no proposal filed :) IMO, x.(~float64) should evaluate to float64 exactly and if ~float64 is used as a case, a should have type float64. But either way, that doesn't matter a lot.

neither could the float64 returned by math.Max be assigned to the return parameter.

Probably true. This is harder to handwave away :)

Although as a programmer, you might know that those dynamic type assertions are OK, they seem problematic to me - they're verbose and easy to get wrong (with a nasty panic if you do).

That's why I said this proposal is better, as long a we're only concerned with type parameters.

zephyrtronium commented 3 years ago

As I see it, this proposal is to add a way to say, "given a thing that might be one of possibly infinitely many types, add special handling for cases where it is one of a chosen set of types." This is already exactly what the current syntax for type switches does, for where "thing" means "value" rather than "type." Considering how similar they are, why is the proposed syntax so different?

In particular, with the current generics proposal, it would be possible to implement parsing with the only new AST nodes being those needed to describe constraints (type lists for the proposal as accepted, or type set operations for #45346, both only on interface declarations). Depending on how the implementation of type parameters is handled, static analysis of type-checked programs could be done with only the same new node; tools that only analyze function bodies might not need to be updated at all. How is the cost of an entirely new syntactic construct to every Go code parser and to every static analysis tool using them justified?

I would prefer if the spelling of these type switches were also switch T.(type).

thepudds commented 3 years ago

I would prefer if the spelling of these type switches were also switch T.(type).

One other option — in an earlier conversation (still in the context of type lists), Ian had floated this syntax:

func F[T constraints.Integer]() { 
    switch T { 
    case int: 
    case int8: 
    } 
} 

The proposed semantics at that time though were not as nice as the semantics proposed here under type sets, I think.

zephyrtronium commented 3 years ago

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion, and neither could the float64 returned by math.Max be assigned to the return parameter.

You'd need something like this instead:

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return interface{}(math.Max(float64(a), float64(interface{}(b).(~float64)))).(T)
    default:
        if a > b {
            return a
        }
        return b
    }
}

That would become less verbose if value type assertions could be done directly on non-interface values, but still not great.

I agree with @Merovius that I would expect case ~float64: to cause T to become float64 within the case, but my reason is specifically that, as proposed in #45346, ~float64 is not a type. There are no values of type ~float64, only of types whose underlying type is float64. Aside from your comment, none of the suggestions in this thread seem to suggest that behavior should change.

(I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other, so it should be fine under even the most conservative type checking of this proposal to return T(math.Max(float64(a), float64(b))).)

urandom commented 3 years ago

I would think that if a case were to include an approximation type (e.g. ~float64), then the compiler should be able to deduce that within the branch T would be a type whose underlying type is float64, and should thus be able to convert a float64 back to T. If that is really the case, then restricting such a switch to only non-approximation types sounds like an unnecessary restriction.

Merovius commented 3 years ago

I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other

Yes, but to allow it, the compiler would have to know about it. It is easy to use v.(~float64) as a float64, because neither v itself, nor it's type, actually change - you look at a new variable with a new type. Meanwhile, if you'd want to convert T(v) after that, the compiler would have to take into account that the type-assertion before succeeded. That's not how Go works so far - for example, you can't do

var r io.Reader = new(bytes.Buffer)
_ = r.(*bytes.Buffer) // succeeds
var br *bytes.Buffer = (*bytes.Buffer)(r) // type-assertion succeded, so we know that we can convert r to *bytes.Buffer

It's a little different, because we are talking type parameters, but it still destroys the elegant simplicity.

So, I do agree with @rogpeppe that type switching on the type parameter is useful and enables you to do new things, because there is inherent logic in saying "inside the switch case, T is interpreted as being constrained to the type set in the case".

I still don't think they are orthogonal though. They still both have significant overlap.

Merovius commented 3 years ago

@zephyrtronium

This is already exactly what the current syntax for type switches does, for where "thing" means "value" rather than "type." Considering how similar they are, why is the proposed syntax so different?

I believe it is because of what you mention - we aren't switching on a value, but on a type. But FTR, I don't think it really matters for the substance of the proposal whether it's switch T, switch type T or switch T.(type) - so maybe we shouldn't worry about the color of our bikeshed just yet :)

Merovius commented 3 years ago

FWIW, after the discussion so far, I've come around to this. I think it's a good addition :)

I still would like to allow union- and approximation elements directly (without wrapping them in interface) though and I also think that this shouldn't prevent us from adding type switches/assertions using approximation elements either :)

rogpeppe commented 3 years ago

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion

Who knows, we are talking about a speculative design with no proposal filed :) IMO, x.(~float64) should evaluate to float64 exactly and if ~float64 is used as a case, a should have type float64.

FWIW the semantics I'd expect from an approximate type switch on a value would be that the value inside the case would allow just the same operations as if the type had been constrained by the switched type in a generic function parameter.

That is, in this example, case ~float64 would allow only operations allowed by every type with underlying type float64, and since you can't assign a value with a defined float64 type to a float64 without an explicit type conversion you wouldn't be able to call Max, hence the need for an explicit type conversion in my code snippet.

Note that being able to assert on ~float64 doesn't help when you're wanting to operate on a slice or other higher level type though, because the constraint syntax doesn't allow interface{[]~float64} AFAICS.

However, switching on the type parameter itself does potentially help that case, because then the slice can be used as an argument to any function with suitable constraints.

For example:

func MaxVec[F comparable](xs []F) F {
    switch F {
    case interface{~float64 | ~float32}:
        // This is OK because F is here constrained by interface{comparable; ~float64 | ~float32}.
        return MaxFloats(xs)
    default:
        ...
    }
}

func MaxFloatVec[F interface {~float64 | ~float32}](xs []F} F {
    ...
}

This makes me realise an interesting point here: even if one has an approximate type switch, unless you significantly add to the power of the approximation and union element features (a big ask, given their current elegance), you will still not be able to do something like assert on the underlying type of a type component such a slice element.

That is, something like this would probably not be allowed, which arguably makes approximate type switches on values considerably less interesting as a candidate feature for the language.

func X(x interface{}) {
    switch x := x.(type) {
    case []~float64:
        ...
    }
}
rogpeppe commented 3 years ago

@zephyrtronium The concrete syntax I've proposed here is only one of many possible. A syntax like switch T { would work just as well; in fact I think I might prefer that, although it arguably gives less clues to the reader of the code about what's going on.

rogpeppe commented 3 years ago

I still would like to allow union- and approximation elements directly

Note that you could add union elements for completeness, but they're technically redundant with respect to this proposal because a case with multiple comma-separated elements is the same as a union element.

Come to think of that: you can use multiple elements in a type switch case currently, but the result is the original interface type. That behaviour probably can't change for backward compatibility reasons, but if one allowed a union element directly, one could constrain the available operations to the intersection of operations available on all the selected types, just as with generics.

That doesn't affect this proposal though.

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

Merovius commented 3 years ago

FWIW the semantics I'd expect from an approximate type switch on a value would be that the value inside the case would allow just the same operations as if the type had been constrained by the switched type in a generic function parameter.

That would require ~float64 to be a type though. That is, with v := v.(~float64), v needs to have some type. IMO float64 is the most natural and most useful type here. Why would I not want it to be a float64? Except avoiding having to learn "v.(~T) evaluates to a T"?

Note that this is completely different when we are talking about this proposal. When you change the constraints put on T inside the case block, you can obviously do more things with it, because you can have many values of that type and be sure they are the same type.

Agreed, to the rest of the comment.

That is, something like this would probably not be allowed, which arguably makes approximate type switches on values considerably less interesting as a candidate feature for the language.

FWIW, I think approximate type switches will be a must if we ever allow to use all interfaces as types. Up until then, they are a nice-to-have, at best (if something like this proposal gets accepted - I do strongly feel that we need some way to specialize on the type argument eventually).

Note that you could add union elements for completeness, but they're technically redundant with respect to this proposal because a case with multiple comma-separated elements is the same as a union element.

This is a drawback to me. Personally, I think I would consider to disallow multiple cases in this type switch construct. It seems we need to choose whether we'd rather be consistent with other switch constructs or be consistent with the constraint syntax. Unfortunate.

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

Can you give an example? I'm not sure what would make it harder. Conceptually, each case block is just an anonymous generic function with a more constrained type parameter. ISTM that if we can type-check a generic function, we should already be able to type-check this construct as well?

zephyrtronium commented 3 years ago

@Merovius

I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other

Yes, but to allow it, the compiler would have to know about it. It is easy to use v.(~float64) as a float64, because neither v itself, nor it's type, actually change - you look at a new variable with a new type. Meanwhile, if you'd want to convert T(v) after that, the compiler would have to take into account that the type-assertion before succeeded. That's not how Go works so far - for example, you can't do

var r io.Reader = new(bytes.Buffer)
_ = r.(*bytes.Buffer) // succeeds
var br *bytes.Buffer = (*bytes.Buffer)(r) // type-assertion succeded, so we know that we can convert r to *bytes.Buffer

Getting a bit off-topic over an example, but I don't follow your argument here. You can do this:

var r io.Reader = new(bytes.Buffer)
switch r := r.(type) {
case *bytes.Buffer:
    var br *bytes.Buffer = r
}

which much closer resembles the example under discussion. The question is the operations (specifically conversions) legal on a parameterized type within a case of a type switch that effectively redefines the type parameter's constraint.

Now, there is a bit of a difference here in that I use r := r.(type) whereas the switch on a parameterized type does not. You can't assign var br *bytes.Buffer = r without shadowing, of course, because the r remains type io.Reader. However, the difference is that r is a value with an interface type, whereas T in the original example is a constraint – not even a type. Per the proposal, within the switch case, the latter is defined to operate as if the function's constraint type set were originally the intersection of the actual constraint and the constraint used in the case. The only types in that intersection are convertible to float64 and float64 is convertible to any type in it, so by the behavior specified in the accepted proposal, conversions between those types are allowed.

Perhaps this does relate to @rogpeppe's comment while I was typing this:

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

And, perhaps this line leads me to conclude that switch T.(type) is not quite the correct syntax, either, because this sort of type constraint switch effectively includes a redeclaration, which arguably should not be implicit.

zephyrtronium commented 3 years ago

@rogpeppe

@zephyrtronium The concrete syntax I've proposed here is only one of many possible. A syntax like switch T { would work just as well; in fact I think I might prefer that, although it arguably gives less clues to the reader of the code about what's going on.

Noted. To me, there is a major difference between proposing an extension of semantics for existing syntax to analogous behavior, and proposing new syntax when an analogous one already exists. But perhaps I am a bit too early to this complaint.

randall77 commented 3 years ago

I can definitely see some trickiness with this idea.

func f[T any](a T) {
    switch T {
        case float64:
            var x T // x is a float64
            x = a // both are type "T", why can't I assign them?
    }
}

I think it is cleaner if you introduce a new type-parameter-like-identifier, like:

func f[T any](a T) {
    switch U specializing T { // or some other syntax
        case float64:
            var x U // x is a float64
            x = a // now it is clear this is not allowed, as it is assigning a T to a U.
    }
}
rogpeppe commented 3 years ago

That would require ~float64 to be a type though. That is, with v := v.(~float64), v needs to have some type. IMO float64 is the most natural and most useful type here. Why would I not want it to be a float64?

Because the original type still carries semantic weight and could be useful. A type switch tells you what the dynamic type of the value is; it doesn't convert it to some other type.

For example, I'd expect the following code to print "MyFloat", not "float64":

type MyFloat float64

func P[F any](f F) {
    switch type F {
    case ~float64:
        fmt.Printf("%T\n", f)
    default:
        fmt.Printf("%T\n", f)
    }
}

func main() {
    P(MyFloat(64))
}
rogpeppe commented 3 years ago

I can definitely see some trickiness with this idea.


func f[T any](a T) {
    switch T {
        case float64:
            var x T // x is a float64
            x = a // both are type "T", why can't I assign them?

In this proposal, you can. Within that case, T is known to be exactly float64 as if with a type alias. So both x and a are of both type float64 and T.

The important point is that the specialisation affects all variables in scope that use type T. It's OK to do that because, unlike regular interface values, there's no special GC shape associated with T, so we can specialise without needing to create a new value.

That is, we're not creating a new T scoped to the case - we are genuinely specialising the original T for the extent of that case.

randall77 commented 3 years ago

Ah, ok, so we could think of these switches as happening at the top level. We put them syntactically lower down in the AST just to share all the code outside the switch.

rogpeppe commented 3 years ago

Ah, ok, so we could think of these switches as happening at the top level. We put them syntactically lower down in the AST just to share all the code outside the switch.

Yes, you could look at it that way, although I'm not sure how helpful that is.

The way I look at it is that within the case statement, you get a more precise view of what T happens to be, so you can do more specific operations on values defined in terms of T. If the generated code isn't fully expanded out for every possible type, the switch operation may well involve some runtime cost (not at the top level) to look up the relevant dictionary of available operations, much as the usual type switch does with the method dictionary.

Merovius commented 3 years ago

@zephyrtronium

ISTM that the confusion is that you are talking about this proposal (reasonably so) while I was discussing a different idea - namely allowing type switches/assertions on values to assert on the underlying type. And I was discussing that idea to compare its expressive power to the one proposed by @rogpeppe. Everything you say is true, under this proposal.

zephyrtronium commented 3 years ago

@Merovius

ISTM that the confusion is that you are talking about this proposal (reasonably so) while I was discussing a different idea - namely allowing type switches/assertions on values to assert on the underlying type. And I was discussing that idea to compare its expressive power to the one proposed by @rogpeppe. Everything you say is true, under this proposal.

I completely missed that! My mistake. I've reread those comments now and everything makes much more sense.

Merovius commented 3 years ago

@rogpeppe

Because the original type still carries semantic weight and could be useful.

You can still type assert on a specific type. The dynamic type of v is not lost by type asserting it to a different one - only the new value loses it. That is, if you know the original type and want to use it, you can still write v.(MyFloat) to get a MyFloat. We just add the capability of using an interface value even if you don't know the type (but only know the underlying type).

A type switch tells you what the dynamic type of the value is; it doesn't convert it to some other type.

Potato, tomato. Note that a type-assertion on an interface does convert, though (it converts the dynamic value to the interface type being asserted). And that conversion is often far less trivial than converting to the underlying type.

For example, I'd expect the following code to print "MyFloat", not "float64":

I agree. But (and that is my fault for encroaching on this issue) you are using your proposal, not the idea of purely adding type assertions on approximation elements. I agree that for a type switch on a type parameter, the parameter should get refined constraints and not be implicitly assumed to be float64 (or whatever). But that's separate from the question of what type v.(~float64) should have where we to introduce that (which is not this proposal).

So the analogous question would be "what does fmt.Printf("%T", v.(~float64)) print, if v contains a type MyFloat float64"? Personally, I find the answer "it should print MyFloat reasonable, but ultimately pointless for little benefit and making the design more complicated than needed. "v.(~float64) evaluates to a float64" is IMO easy to understand, easy to implement and easy to explain. But if the above prints MyFloat, it requires a bunch of extra changes to the language - we need to introduce a new type ~float64, which behaves like an interface (in that it carries a dynamic value), but also allows most operations of float64, but is neither float64 nor MyFloat (as MyFloat isn't even known, statically).

Again, to be clear: With a type switch on a type parameter these questions are easily answered, because var v T still has a specific, concrete and known (theoretically) static type and we are not actually changing that, we just assert that the operations on it are allowed.

Anyway, mayhaps it's just wrong to discuss this here :) But I'm resolved to not file any proposals for extending generics myself, until we actually get generics (and if I would, I would start elsewhere) :)

rogpeppe commented 3 years ago

@rogpeppe

Because the original type still carries semantic weight and could be useful.

You can still type assert on a specific type. The dynamic type of v is not lost by type asserting it to a different one - only the new value loses it. That is, if you know the original type and want to use it, you can still write v.(MyFloat) to get a MyFloat. We just add the capability of using an interface value even if you don't know the type (but only know the underlying type).

Fair point.

A type switch tells you what the dynamic type of the value is; it doesn't convert it to some other type.

Potato, tomato. Note that a type-assertion on an interface does convert, though (it converts the dynamic value to the interface type being asserted). And that conversion is often far less trivial than converting to the underlying type.

Yeah, type asserting on an interface type is somewhat different. In fact, it's already a kind of approximate type switch - what you get has the same dynamic type, but a different static type that allows for a different set of operations.

That's very similar to what I'm imagining for the other approximate case situations.

For example, I'd expect the following code to print "MyFloat", not "float64":

I agree. But (and that is my fault for encroaching on this issue) you are using your proposal, not the idea of purely adding type assertions on approximation elements. I agree that for a type switch on a type parameter, the parameter should get refined constraints and not be implicitly assumed to be float64 (or whatever). But that's separate from the question of what type v.(~float64) should have where we to introduce that (which is not this proposal).

So the analogous question would be "what does fmt.Printf("%T", v.(~float64)) print, if v contains a type MyFloat float64"? Personally, I find the answer "it should print MyFloat reasonable, but ultimately pointless for little benefit and making the design more complicated than needed. "v.(~float64) evaluates to a float64" is IMO easy to understand, easy to implement and easy to explain.

In that particular trivial case, perhaps. But ~float64 is an example taken from a much richer set of possibilities. For example, what if the case was comparable, or ~int | ~string or ... ? What type would the value take on then?

When the full generality of constraints are taken into account, I don't think your rule easy to explain, because it seems to me that it would be an additional special case, whereas "keeps its original dynamic type and permits all operations allowed by the constraint" is easy to explain IMHO and requires no special casing.

But if the above prints MyFloat, it requires a bunch of extra changes to the language - we need to introduce a new type ~float64, which behaves like an interface (in that it carries a dynamic value), but also allows most operations of float64, but is neither float64 nor MyFloat (as MyFloat isn't even known, statically).

Yup, that's what I believe approximate cases in type switches would entail. It seems like quite a hard problem to me and I'm not convinced it would be worth it. If you didn't implement this, then you can find out something about the value, but you can't actually do much with the value without reflection. I can't see that being worthwhile tbh.

Merovius commented 3 years ago

For example, what if the case was comparable, or ~int | ~string or ... ? What type would the value take on then?

These are not approximation elements. They are out of scope for what I was suggesting. Again, the suggestion I was talking about (which is what I was comparing your proposal to in terms of expressive power) is

  1. It is allowed to write a type-assertion v.(~T) using approximation elements (given that T is its own underlying type). Such a type assertion checks if vs dynamic type has underlying type T and if so, evaluates to vs dynamic value converted to T.
  2. Similar for ,ok and type switches.

This is enough to de-structure "sum types" if they are built based on #45346 - the value might have type interface{ ~string | ~int } or comparable, but you would still type switch with separate ~string and ~int (and/or string and int) cases.

It would also give a way to specialize generic variables, via switch interface{}(v).(type) - in that sense it has overlap with this proposal. It would be a bit less powerful for that usecase (it wouldn't allow the Max function from above, for the reasons discussed), but it would have applications in other usecases, not addressed by this proposal (in particular use cases that don't involve generics).

The reason I brought it up here is that a) I think we'd definitely want something like this if we ever allow to use constraint interfaces as types, because we need to be able to de-structure them, even in the absence of generics and b) given the overlap, it might be confusing to have both. But, ultimately, as I said I don't think that needs to keep us from doing this and eventually both, if it comes to it. So it really is a minor-ish concern.

If you didn't implement this, then you can find out something about the value, but you can't actually do much with the value without reflection. I can't see that being worthwhile tbh.

This seems a very strange turnaround to me :) I was suggesting for v.(~T) to evaluate to a T. You can do anything to that which you can do with a T, without any reflection. Which, I think, is generally more than you can do to a hypothetical ~T value.

rogpeppe commented 3 years ago

For example, what if the case was comparable, or ~int | ~string or ... ? What type would the value take on then?

These are not approximation elements. They are out of scope for what I was suggesting. Again, the suggestion I was talking about (which is what I was comparing your proposal to in terms of expressive power) is

  1. It is allowed to write a type-assertion v.(~T) using approximation elements (given that T is its own underlying type). Such a type assertion checks if vs dynamic type has underlying type T and if so, evaluates to vs dynamic value converted to T.
  2. Similar for ,ok and type switches.

OK, fair enough - I'd been erroneously thinking of all the new constraint semantics as "approximation elements" and that's not accurate.

This is enough to de-structure "sum types" if they are built based on #45346 - the value might have type interface{ ~string | ~int } or comparable, but you would still type switch with separate ~string and ~int (and/or string and int) cases.

It was this kind of consideration that led me to suggest the limitations that I mentioned here, namely restricting these "sum type" values to those that can be easily destructured without the need to add support for approximation elements to type switches.

If you didn't implement this, then you can find out something about the value, but you can't actually do much with the value without reflection. I can't see that being worthwhile tbh.

This seems a very strange turnaround to me :) I was suggesting for v.(~T) to evaluate to a T. You can do anything to that which you can do with a T, without any reflection. Which, I think, is generally more than you can do to a hypothetical ~T value.

As before, I was mis-thinking of "approximate matching" as "constraint matching", sorry. With approximate matching (only), your suggestion would work OK. I'm not sure how useful it would be in practice though. Have you got some use cases in mind?

ohir commented 3 years ago

I ponder about switching over signature types.

switch type T {
case func(int) int:
case func(byte) byte:
   ...
}
Merovius commented 3 years ago

It was this kind of consideration that led me to suggest the limitations that I mentioned here, namely restricting these "sum type" values to those that can be easily destructured without the need to add support for approximation elements to type switches.

Yes. I drafted a response to that and discarded it in favor to the "this is off-topic" comment :)

IMO, the goal should be to remove the distinction between "constraint interfaces" and "type interfaces". Replacing one restriction with another doesn't achieve that. I would even argue that the distinction becomes harder to understand, relying on finite type sets - I don't think it's super obvious what that means to many people.

And given that I see an IMO clear and simple way out of having that distinction while allowing de-structuring of arbitrary type sets - namely to allow type switches/assertions to use approximation elements - I don't see the point in keeping it :)

Have you got some use cases in mind?

The ones I can think of mainly revolve around de/serialization. They still need reflection for composite types though, if you want to support that. It's not an incredibly strong case TBQH - I'm mainly thinking about the eventuality of removing the constraint/type interface dichotomy.

ianlancetaylor commented 3 years ago

I'm troubled by cases like FromStrings2 in the section https://go.googlesource.com/proposal/+/refs/heads/master/design/43651-type-parameters.md#pointer-method-example of the proposal. In that function, if I know T, I also know PT. But if I do a type switch on T as suggested here, in a specific case, T will take on a specific value, but PT will not. I'm concerned that that could be confusing.

urandom commented 3 years ago

@ianlancetaylor As a user and writer of FromStrings2, I would see that the switch is over type T and not PT. If I were to also want PT to take a specific value, I'd also create another switch within whatever case I was currently in to do that. Since I wrote FromStrings, I already know that PT exists, so that in itself is not a problem. All of it seems rather explicit, but certainly not confusing.

rogpeppe commented 3 years ago

I'm troubled by cases like FromStrings2 in the section https://go.googlesource.com/proposal/+/refs/heads/master/design/43651-type-parameters.md#pointer-method-example of the proposal. In that function, if I know T, I also know PT. But if I do a type switch on T as suggested here, in a specific case, T will take on a specific value, but PT will not. I'm concerned that that could be confusing.

In that particular case I think it works out OK. Let's work it through:

type Setter2[B any] interface {
    Set(string)
    *B
}

type M struct {
    f string
}

func (m *M) Set(s string) {
    m.f = s
}

func FromStrings2[T any, PT Setter2[T]](s []string) []T {
    switch type T {
    case M:
         // XXX
    }
}

At the point XXX, the operations available are as if FromStrings2 had been declared as:

func FromStrings2[T interface {any; M}, PT Setter2[T])(s []string) []T

The constraint interface{any; M} is equivalent to interface{M} which is equivalent to M itself. So Setter2[T] is now equivalent to Setter2[M], which is equivalent to interface {Set(string); *M} which is equivalent to *M, so we do know the exact type of PT.

That seems good to me.

It doesn't work out the other way though. If we switch type of PT, then theoretically we could know the type of T, but because T isn't defined in terms of PT, the type of T remains any.

I think that all seems reasonably intuitive, but I might be too close to the subject here.

thepudds commented 3 years ago

It doesn't work out the other way though. If we switch type of PT, then theoretically we could know the type of T, but because T isn't defined in terms of PT, the type of T remains any.

If the last scenario that @rogpeppe just sketched ends up happening, some related questions would be:

If for example the type of T remains any after the type switch, but the author was expecting it to have some type M and writes some code that relies on that expectation, it seems the author of the generic code would be the one to trigger a compiler error, and not some later innocent consumer?

And it seems that compiler error has some decent chance of being clear, e.g., improper operation being used against a variable, where the type parameter is reported within the error? (e.g., some error like foo.WrongMethod undefined (type bound for T has no method WrongMethod), which is a current error from the current prototype, or maybe something in the future that is more specific and mentions any)?

Whether or not the a solution is obvious or reasonably guessable based on that error might then depend on the complexity of the code at the point in time the error is introduced... but it seems the generics author would at least have a decent chance of being drawn back to the type switch in question if the author's intent or assumption was that the type switch would have changed the type reported in error message? But it less obvious if a solution would be obvious to an author at that point (though FWIW, adding a second type switch is the solution that popped to mind based on Ian's initial comment above, but that of course was in the context of a small example and having been told what the problem was).

In any event, I think those are useful questions even if I don't know the answers 😅 (and sorry if I have not followed the problem being highlighted).

rogpeppe commented 3 years ago

I would even argue that the distinction becomes harder to understand, relying on finite type sets - I don't think it's super obvious what that means to many people.

"Finite type set" in that context is more-or-less "no approximation elements" (†). I think that might be fairly easy to explain.

Anyway this is still off topic and we should probably stop discussing it here :)

† Obviously it's a little bit more involved than that, because interface{~int; int} does contain an approximation element, but would still be allowed because the approximation element is removed by set intersection).

rogpeppe commented 3 years ago

It doesn't work out the other way though. If we switch type of PT, then theoretically we could know the type of T, but because T isn't defined in terms of PT, the type of T remains any.

If the last scenario that @rogpeppe just sketched ends up happening, some related questions would be:

  • who would see any related error?
  • would any related errors be cryptic or clear?
  • would one or more solutions be obvious or reasonably guessable based on the errors and the code?

The error would be a compile-time error on the generic code, not the code that's using the generic code. It might be something like: type T with constraint any cannot be assigned to variable of type M.

It's not entirely dissimilar to the kind of error that you get if you inappropriately try to use a value that's been type-asserted by a switch v.(type) outside of the switch.

It seems to me like things should be OK in this respect.

urandom commented 3 years ago

@rogpeppe

Inside case M, if the compiler automatically knows the type of PT as *M and allows any value of PT to be used directly as *M, this might just make it the first real implicit behavior in the language (though I could be wrong). There might be nothing wrong with that, but it's something to consider.

I guess one could always go with a more verbose approach, like:


switch type (T, PT) {
case (M, *M):
}

but it may make things more cumbersome as well.

rogpeppe commented 3 years ago

@rogpeppe

Inside case M, if the compiler automatically knows the type of PT as *M and allows any value of PT to be used directly as *M, this might just make it the first real implicit behavior in the language (though I could be wrong). There might be nothing wrong with that, but it's something to consider.

The main "implicit" things here are the implications of the rules started in the proposal - the type switch is explicit and the rule that T (and hence every type expression involving T and every variable defined in terms of T) is further constrained by the type in the switch case, is also explicit.

So I'd say this is more a consequence of the explicit rules rather than implicit behaviour.

Note that we are already using these rules in order to be able to use arguments, variables and return parameters as the newly constrained T - the PT case is just another example.

BTW technically it's not that the compiler knows that PT is *M: more accurately PT is now a constraint with a type set that contains exactly one type, and since that type is *M, by the rules outlined in #45346 all operations allowed on *M are allowed on PT.

Which amounts to exactly the same thing in the end, but is maybe worth pointing out.

bcmills commented 3 years ago

There is a bit of awkwardness here w.r.t. “is” vs. “implements”. Normally when a value appears in the case of a switch x, statement the switched value must be exactly that value, but when an interface type appears in the case of a switch x.(type), the value can be of any type that implements that interface.

The distinction matters here because I can't intuitively tell whether case fmt.Stringer in switch type S should match any S that implements fmt.Stringer, or only S = fmt.Stringer exactly. That distinction matters when we're considering things like function signatures, or element-types of slices or maps.

rogpeppe commented 3 years ago

I can't intuitively tell whether case fmt.Stringer in switch type S should match any S that implements fmt.Stringer,

The rule here is that it's exactly as if you'd used fmt.Stringer as a type constraint so, as with constraints on type parameters, there would be (I believe) no way to require exactly an interface type as opposed to anything that implements that type. I like the uniformity of that approach (and it's easy to explain), but it could be argued that it is too restrictive.

One variant I considered but rejected as too hard to explain is to allow switching on any type defined in terms one or more type parameters. The type in the switch would be unified with the type in the case, allowing multiple types to be specialised at once if desired.

In that case, switch *T would enable finding the exact type of T because *io.Reader doesn't match any other type.

Merovius commented 3 years ago

I like the uniformity of that approach (and it's easy to explain)

FWIW, for consistency with a) normal type switches, b) using other type sets in cases in a type parameter switch and c) using type sets in a constraint interface (as you mentioned), I really think it has to be this interpretation. Anything else is too inconsistent and confusing.

it could be argued that it is too restrictive.

If people really need to check if a type parameter is a given interface, they can do so via

if interface{}(new(T)) == new(fmt.Stringer) {
    // T is known to be exactly fmt.Stringer
}

They would then have to refer to fmt.Stringer in the block directly, converting/type asserting/… usages of identifiers outside the block using T as appropriate. Which they can, because a) they know the type of the interface they are interested in (it's not possible to use an unknown type instead of fmt.Stringer, as it's not possible to restrict type-parameters to be interface types) and b) they know the actual identity, not just membership in a type set.

It's not very convenient code to write and it can't happen as a case of a switch (but would have to happen in a separate statement before/after/in a different case). But it should be very rare to need this, so it seems fine to me for it to be inconvenient.

bcmills commented 3 years ago

The rule here is that it's exactly as if you'd used fmt.Stringer as a type constraint …. I like the uniformity of that approach (and it's easy to explain),

I don't agree that that approach is uniform, though — it is only uniform with respect to type parameters, not with respect to type-switches.

bcmills commented 3 years ago

FWIW, I think you can write the Concat example almost as clearly using only ordinary type-switches and assertions:

type Stringish interface {
    string | fmt.Stringer
}

func Concat[S Stringish](x []S) string {
    switch x := x.(type) {
    case []string:
        return strings.Join(x, "")
    default:
        // S is not string, so it must be fmt.Stringer.
        var buf strings.Builder
        for _, s := range x {
            buf.WriteString(s.(fmt.Stringer).String())
        }
        return buf.String()
    }
}

The type-assertion to fmt.Stringer would presumably be straightforward for the compiler to optimize: it already knows the type set of S, and it's not difficult to extrapolate from a type switch on []S to type information about S itself.

bcmills commented 3 years ago

@Merovius

If people really need to check if a type parameter is a given interface,

To me the interesting question is not “do they need to check if the type parameter is a given interface?”, but rather “in what ways do they need to contort the code to express what they want to express?” Knowing that a type parameter is exactly a specific interface, for example, doesn't make the code particularly less verbose if the author still needs to write a bunch of type-assertions, or (worse!) boilerplate conversions.

The really interesting wrinkle with the Concat example, I think, is how it interacts with https://golang.org/doc/faq#convert_slice_of_interface and https://golang.org/doc/faq#convert_slice_with_same_underlying_type. If you only know that S implements fmt.Stringer, you can't use that information in the same way that you can use the information that S is string. The “is string” information allows you to convert and/or type-assert to a concrete type, whereas the “implements fmt.Stringer” information does not.

rogpeppe commented 3 years ago

I don't agree that that approach is uniform, though — it is only uniform with respect to type parameters, not with respect to type-switches.

Actually, it is uniform with respect to type switches, at least somewhat: if I have an interface type in a regular type switch case, I'm asking for anything that matches the interface type, not anything that's exactly that type, which is like the behaviour proposed here.

The difference here in this proposal is that it can very much matter whether a type is exactly something or not, because that influences compatibility with the outside world (the type switch might say that R matches io.Reader, but that doesn't mean I can use io.MultiReader as func(...R)R).

The different semantics are why I've suggested the type keyword in the construct (as opposed to just switch). Maybe some other syntax might help to make the intended semantics clearer?

FWIW, I think you can write the Concat example almost as clearly using only ordinary type-switches and assertions:

Yes, that's not a great example for showing the benefits of this construct (it just happened to be current in the original thread). Your example isn't quite allowed though, because you can't directly type assert on s currently, so you'd need:

        for _, s := range x {
            buf.WriteString(interface{}(s).(fmt.Stringer).String())
        }

The Max example is better than the Concat example, and better still might be something that showed a bunch of disparate parameters and local variables all unified by the type switch.

There is a technique that does allow using a single regular type switch and no dynamic type assertions even in the Max example, BTW, but it's clunky! (and it might not be allowed by the current spec; I can't remember)

func Max[T ordered](a, b T) (r T) {
    type args[S any] struct {
        a S
        b S
        r *S
    }
    aa := args[T]{
        a: a,
        b: b,
        r: &r,
    }
    switch aa := interface{}(aa).(type) {
    case args[float64]:
        *aa.r = math.Max(aa.a, aa.b)
        return
    }
    if a > b {
        return a
    }
    return b
}
Merovius commented 3 years ago

@bcmills

it is only uniform with respect to type parameters, not with respect to type-switches.

Can you explain that? You mentioned yourself, that in a type-switch we check for implementation of an interface, not identity to an interface type. Isn't that exactly what @rogpeppe is describing?

FWIW, I think you can write the Concat example almost as clearly using only ordinary type-switches and assertions:

It stops working once you use ~string in Stringish. There's also this Max example from a different context, which IMO demonstrates an edge-case. While a normal type assertion/switch can get you from a T to a float64 (or whatever), it can't take you back from a float64 to a T. For that, the compiler actually needs knowledge about T.

The “is string” information allows you to convert and/or type-assert to a concrete type, whereas the “implements fmt.Stringer” information does not.

It does, just only to types you can convert a fmt.Stringer to. That's the difference here - you are comparing "is a string" to "implements fmt.Stringer", but the interesting comparison is to "is fmt.Stringer". That information is far less useful than "is a string".

I agree that there has to be a way to know the actual concrete type for many things. But you can do that. You can use string in a case. I don't buy that there are many use case to know that a type parameter is a specific interface type though - if there was, we should definitely consider changing the generics proposal to be able to express that.