golang / go

The Go programming language
https://go.dev

proposal: spec: extended type inference for make and new #34515

Open DeedleFake opened 5 years ago

DeedleFake commented 5 years ago

Rationale

Currently in Go, type inference works in one direction and one direction only: From the values of an assignment to the variable declarations. In other words, given some expression with a statically-determinable type, which is basically every expression, the type of a declaration can be omitted. This allows for a lot of the perceived lightness of Go over many statically typed languages.

However, there are a number of situations where this is not possible, most of which have to do with make() and new(), both of which are unique, not including some stuff in the unsafe package, in that they take a type name as an argument. Normally this is a non-issue, as that type can be used to determine the return of the expression, thus allowing for inference in the normal manner:

m := make(map[string]interface{})
s := make([]string, 0, 10)
p := new(int)

Sometimes the variable must have a separately declared type, though, such as in the case of struct fields:

type Example struct {
  m map[string]interface{}
}

func NewExample() *Example {
  return &Example{
    m: make(map[string]interface{}),
  }
}

This leads to unwanted repetition of the type name, making later alteration more awkward. In particular, I thought of this while reading through one of the many proposals about ways to make anonymous structs more useful with channels, and I realized that the following pattern could get very annoying, exacerbating the existing issue:

type Example struct {
  c chan struct {
    v string
    r chan<- int
  }
}

func NewExample() *Example {
  return &Example{
    c: make(chan struct {
      v string
      r chan<- int
    }),
  }
}
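
For comparison, one way to limit the repetition in today's Go is to give the element type a name, so that only the short name is repeated in the make() call. This is a minimal sketch and not part of the proposal; the name request is hypothetical, and the channel is buffered here only so that the demonstration runs in a single goroutine.

```go
package main

import "fmt"

// request is a hypothetical name for the channel's element type.
type request struct {
	v string
	r chan<- int
}

type Example struct {
	c chan request
}

func NewExample() *Example {
	return &Example{
		// Only the short name is repeated, not the whole struct literal type.
		c: make(chan request, 1),
	}
}

func main() {
	ex := NewExample()
	reply := make(chan int, 1)

	ex.c <- request{v: "hello", r: reply}
	req := <-ex.c
	req.r <- len(req.v)

	fmt.Println(<-reply)
}
```

The trade-off is that the type must be declared separately, which is exactly what the anonymous-struct pattern tries to avoid.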

Proposal

I propose the introduction of a new keyword, say typeof, which takes a variable, function name, or a struct field identifier and essentially acts as a stand-in for a standard type name, possibly with the restriction of only being available as the argument to a make() or new() call. For example,

return &Example{
  m: make(typeof Example.m),
}

This would allow a type name to be pegged to an already declared variable type elsewhere.

Alternative

Alternatively, make() and new() could allow for the aforementioned types of identifiers directly, such as

return &Example{
  m: make(Example.m),
}

This has the advantage of being backwards compatible, but is potentially less flexible if one wants to extend the functionality elsewhere later, such as to generics.

ianlancetaylor commented 5 years ago

If we are going to introduce typeof, I can't think of any reason why we would restrict it to being used with make and new. That would not be very orthogonal.

dmkra commented 5 years ago

Type of m in Example is already known. So typeof or Example.m can be omitted: return &Example { m: make() }

DeedleFake commented 5 years ago

That's backwards compatible, too, that way, although it brings with it an implicit assumption that typename.fieldname has a general meaning.

Edit: The comment this was a response to seems to have been deleted. Not entirely sure what happened there. It had a suggestion to use Example.m.(type) instead of typeof Example.m.

bradfitz commented 5 years ago

> Type of m in Example is already known. So typeof or Example.m can be omitted: return &Example { m: make() }

That's what I want myself. @griesemer points out that you'd also want to do new() then.

griesemer commented 5 years ago

Regarding typeof: Instead of a new keyword, typeof could easily be a built-in. The possible "danger" of a typeof operator is that people might start using it all over the place, for all kinds of declarations, thereby reducing the readability of the code.

There's clearly a strong sentiment for being able to use make() or new() and have both of them infer the necessary type. Though we don't have anything else in the language that behaves like this at the moment. Is there a better syntax?

bcmills commented 5 years ago

Building on @dmkra's observation, we could use _ as a stand-in for “infer the obvious type”.

type Example struct {
  m map[string]interface{}
}

func NewExample() *Example {
  return &Example{ m: make(_) }
}

type Example struct {
  c chan struct {
    v string
    r chan<- int
  }
}

func NewExample() *Example {
  return &Example{ c: make(_, 1) }
}

bcmills commented 5 years ago

That said, I think #21496 is a better fit for the map example:

return &Example{ m: {} }

And I suspect that #28366 would address most of the realistic use-cases for the chan example:

type Example struct {
  c chan(1) struct {
    v string
    r chan<- int
  }
}

func NewExample() *Example {
  return &Example{}
}

bcmills commented 5 years ago

@DeedleFake, there is some precedent for .(type) in earlier proposals too.

(For example, I used (type) as the “type, not value” indicator in https://github.com/golang/proposal/blob/master/design/15292/2016-09-compile-time-functions.md.)

DeedleFake commented 5 years ago

I like the _ for inference, but it feels a bit like it overlaps with _ as a discard assignment. This would allow it to be used on the right-hand side, technically, though only in the unusual case of make() and new() which take actual type names as arguments.

Maybe _.(type)? Maybe not.

bradfitz commented 5 years ago

Using .(type) for both dynamic type checks (as today with type switches) and static types would be slightly weird.

bcmills commented 5 years ago

@bradfitz, at least there is precedent for that! The cases within a switch statement can already be either dynamic values or static types, depending on whether you're switching on x or x.(type).
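
The precedent mentioned here can be seen in existing Go. The sketch below uses a hypothetical describe function whose first switch has value cases and whose second has type cases; only the switch header differs.

```go
package main

import "fmt"

// describe is a hypothetical helper showing both kinds of switch.
func describe(x interface{}) string {
	// Expression switch: the cases are values compared against x.
	switch x {
	case 1:
		return "the value 1"
	case "one":
		return `the value "one"`
	}

	// Type switch: the cases are types checked against x's dynamic type.
	switch x.(type) {
	case int:
		return "some int"
	case string:
		return "some string"
	}
	return "something else"
}

func main() {
	for _, x := range []interface{}{1, 2, "one", "two", 3.5} {
		fmt.Println(describe(x))
	}
}
```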

ianlancetaylor commented 5 years ago

Using _ looks like we are discarding the value.

Another possibility would be . to refer to the current topic, sort of analogous to text/template.

var m map[int]string
m = make(.)

(Using dot might also work for #33359.)

But in general this would be a new kind of idea in Go that we don't currently have.

ianlancetaylor commented 5 years ago

Or we could take one of the ideas from #33359 and use ... here.

    var m map[int]string
    m = make(...)

jimmyfrasche commented 5 years ago

Does there need to be anything?

Could the absence of a type argument be enough to signal that it should be inferred?

var ch chan T = make() // same as make(chan T)
var buf []T = make(10) // same as make([]T, 10)
x := new() // compile time error, no type can be inferred

DeedleFake commented 5 years ago

Slightly off-topic, but in the case of new(), I'm in favor of getting rid of the function completely and adding make(*T). I don't really see why there needs to be a separate function. With the above suggestion, it'll only work with a pre-existing pointer type variable anyways, which means that it would essentially infer T from a *T variable, unlike make() which would just use the variable's type directly.

It's a small nitpick, though.

ianlancetaylor commented 5 years ago

Let's definitely keep the fraught discussion of new vs. make in a different issue.

ianlancetaylor commented 5 years ago

There doesn't need to be an argument to make, but personally I think some sort of marker would be better than having it just make something.

jimmyfrasche commented 5 years ago

It would make it somewhat more similar to the recent contracts draft, modulo the number of parens and commas.

bcmills commented 5 years ago

The ? symbol is also currently unused, right? That seems like a reasonable token to indicate “inferred”.

beoran commented 4 years ago

I think typeof() as a built-in function has the best potential here. It would be the most backwards-compatible solution, and it would be useful in several other cases, such as low-level programming in conjunction with unsafe.Sizeof, unsafe.Alignof and unsafe.Offsetof.

If there is any risk that typeof() will end up being overused, then this can be solved through go lint checking. Personally I think the risk for abuse is low, since if you do typeof(x), then x will already be in scope, and the type of x should be relatively clear to the reader.
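
There is no compile-time typeof in Go today, but the reflect package offers a rough runtime analogue: it can read the declared type of a variable that is already in scope and allocate a matching value. The sketch below is only an illustration of that idea, not the proposed built-in; initInPlace is a hypothetical helper, and it pays a runtime cost that a compile-time feature would not.

```go
package main

import (
	"fmt"
	"reflect"
)

type Example struct {
	m map[string]interface{}
	c chan int
}

// initInPlace is a hypothetical helper. Given a pointer to a map, channel,
// or slice variable, it allocates a value of the variable's declared type.
func initInPlace(ptr interface{}, size int) {
	v := reflect.ValueOf(ptr).Elem()
	switch v.Kind() {
	case reflect.Map:
		v.Set(reflect.MakeMapWithSize(v.Type(), size))
	case reflect.Chan:
		v.Set(reflect.MakeChan(v.Type(), size))
	case reflect.Slice:
		v.Set(reflect.MakeSlice(v.Type(), size, size))
	default:
		panic("initInPlace: unsupported kind " + v.Kind().String())
	}
}

func main() {
	ex := &Example{}
	initInPlace(&ex.m, 8) // no type name repeated at the call site
	initInPlace(&ex.c, 1)

	ex.m["key"] = "value"
	ex.c <- 42
	fmt.Println(len(ex.m), <-ex.c)
}
```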

ianlancetaylor commented 4 years ago

Another idea would be to permit referring to the basic kind of type without filling out the elements. For example,

    var s []byte = make([], 10)
    var m map[int]string = make(map)
    var m2 map[int]string = make(map, 100)
    var c chan bool = make(chan)

This might make the code clearer to the reader, while still permitting the omission of the redundant type information.

dotaheor commented 4 years ago

The new() and make(length, capacity) form looks consistent with this generic proposal.

griesemer commented 4 years ago

Just a few observations:

  1. new and make are special because they do expect a type rather than a value as their first argument. If we permit leaving those types away in calls, we lose the visual cue that these calls are indeed calling the built-in functions new and make, and not some user-defined regular functions. (We don't have that cue either if the type argument is a type name rather than a type literal, but that seems rare.) It may be beneficial for readability to retain some sort of visual signal. On the other hand, we already permit the omission of types in certain nested composite literals, and thus leaving away the type in make and new calls has some (remote) precedent, especially because often these calls do appear when constructing composite literals. Also, overwritten new and make identifiers are rare.
  2. A typeof built-in function can't be used in situations where the variable is both declared and initialized as follows: var v []T = make(typeof(v), n) because the newly declared variable won't be in scope until after the variable declaration (which includes the initialization expression). Of course, in these cases one could just write var v = make([]T, n). But it's not clear that adding a new built-in is making this problem simpler - e.g., it doesn't help much if writing the type is shorter than writing the built-in call.
  3. We could just write [], map and chan, which is simple and provides information about the kind of type that's being initialized. The problem with this approach though is that none of [], map and chan are currently syntactically valid type expressions. We'd have to allow them in general, or at the very least as first arguments to calls of functions called make (the parser cannot know that it is parsing a call to the built-in make function, we need type information for that). Probably doable, but requiring much more work than one might anticipate at first (parser and AST will have to be adjusted).
  4. Using ... to denote the inferred type would be comparatively easy. We already accept ... as expression in array declarations, so no parser or AST changes are needed. Doing type-inference for ... in the type-checker would be relatively straight-forward.
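
The scoping rule behind point 2 can be observed in current Go. In this minimal sketch (the names v and shadow are illustrative), the v inside the initializer still refers to the package-level variable, because the newly declared v only comes into scope after its declaration ends.

```go
package main

import "fmt"

// Package-level v with three elements.
var v = []string{"outer", "outer", "outer"}

// shadow declares a local v whose initializer mentions v. That mention
// resolves to the package-level v, not to the variable being declared.
func shadow() []string {
	var v []string = make([]string, len(v)-1)
	return v
}

func main() {
	fmt.Println(len(v), len(shadow()))
}
```

A hypothetical typeof(v) in the same position would therefore see whatever outer v exists, or fail to compile if there is none.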

Of these four approaches, it seems that 1) (just leave the type away) and 4) (write ... instead of the type) are the most promising ideas so far.

As nice as ... might seem, it does open a bit of a Pandora's box (where else should we allow ...?). It is also not strictly necessary (in contrast to the use of ... in array and parameter declarations or when unpacking a slice).

Thus, notwithstanding some more general concept for ... use, it seems to me that leaving away the type altogether might be the most pragmatic and Go-like approach, should we decide to proceed with this. It does solve the problem, it would be relatively straight-forward to implement, and it would make the code shorter. If readability is problematic, one can always write the type as we do now. And new and make are already very special because they accept a type argument, so this simple form of type inference wouldn't make them much more special.
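
For reference, these are the existing forms of elision in Go that the observations above point to as precedent. The sketch uses illustrative names (Point, samples); it shows element types omitted in nested composite literals and the [...] array length, both of which compile today.

```go
package main

import "fmt"

type Point struct{ X, Y int }

// samples returns values built with the elisions Go already permits.
func samples() ([]Point, map[string][]int, [3]string) {
	// Element type Point is omitted inside the slice literal.
	pts := []Point{{1, 2}, {3, 4}}

	// Value type []int is omitted inside the map literal.
	groups := map[string][]int{"odd": {1, 3}, "even": {2, 4}}

	// The array length is inferred from the number of elements.
	names := [...]string{"a", "b", "c"}

	return pts, groups, names
}

func main() {
	pts, groups, names := samples()
	fmt.Println(pts, groups, names)
}
```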

jimmyfrasche commented 4 years ago

Even if all other proposals using ... are accepted, I think that omitting the type is a better fit. This situation is more analogous to the type inference in the generics draft than it is to any of the other ... proposals. Even if that particular draft doesn't survive, the type inference for function calls likely would. I wouldn't want to have to type f(...)(x, y, z) for every generic function call.

jimmyfrasche commented 4 years ago

There's also a tangential connection to #12854 in that the contexts in which the type could be deduced are the same for both proposals.

DeedleFake commented 4 years ago

@griesemer

> A typeof built-in function can't be used in situations where the variable is both declared and initialized as follows: var v []T = make(typeof(v), n) because the newly declared variable won't be in scope until after the variable declaration (which includes the initialization expression). Of course, in these cases one could just write var v = make([]T, n). But it's not clear that adding a new built-in is making this problem simpler - e.g., it doesn't help much if writing the type is shorter than writing the built-in call.

This proposal was specifically not intended for those cases, as those cases already don't require writing the type twice. This proposal was intended to help only in the cases where the declaration and assignment have to be separate, such as in struct fields. I thought that a way to tell the compiler 'use whatever type this variable was declared as here' would allow for a general solution to it, as the problem stems from type inference currently only being possible from value to variable and not the other way around.

DeedleFake commented 3 years ago

It was pointed out in episode #166 of Go Time that this type of inference would not help in situations where the user was initializing using a composite literal, such as

type Example struct {
  Values map[string]string
}
// ...
e := Example{
  // Still requires a repeat of the type.
  Values: map[string]string{
    "some": "value",
    "or": "another",
  },
}

That's true. As I said in the original proposal, my primary motivation was channel initialization, though I did think of empty map initialization, too. Slices would be less useful, as appending to a nil slice works just fine. I do still think this would be worth it just for the channel case and the empty map case, though, and, if implemented, it definitely should work for slices too just to be consistent.
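
The difference between the three kinds of zero value can be checked directly. This is a small sketch with illustrative function names: appending to a nil slice succeeds, writing to a nil map panics, and a send on a nil channel never becomes ready.

```go
package main

import "fmt"

// appendToNil shows that append allocates as needed for a nil slice.
func appendToNil() []string {
	var s []string
	return append(s, "ok")
}

// writeToNilMap reports whether assigning into a nil map panicked.
func writeToNilMap() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	var m map[string]string
	m["some"] = "value" // runtime panic: the map was never made
	return false
}

// sendOnNilChan reports whether a send on a nil channel was not ready.
// The select with a default case avoids blocking forever.
func sendOnNilChan() (blocked bool) {
	var c chan int
	select {
	case c <- 1:
		return false
	default:
		return true
	}
}

func main() {
	fmt.Println(appendToNil(), writeToNilMap(), sendOnNilChan())
}
```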

DeedleFake commented 2 years ago

I've found a workaround for some situations when using generics. Although the current generics implementation doesn't have inference for return values, it can do it for pointers:

func makeMap[M ~map[K]V, K comparable, V any](m *M, c int) {
  *m = make(M, c)
}

type Example struct {
  Vals map[string]int
}

var ex Example
makeMap(&ex.Vals, 0)

I don't know if this has any problematic effects on, for example, optimization. I recently had a very large number of initializations to do, each with a type repeated from a struct definition, so I wrote a function like this and used it instead, and it massively cleaned up the code.
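
A complete, compilable version of this workaround might look like the sketch below. It assumes Go 1.18 or later. makeChan is a hypothetical companion to makeMap that is not in the comment above, added to cover the channel case from the original proposal; in both helpers the type arguments are inferred from the pointer argument, so no type name is repeated at the call site.

```go
package main

import "fmt"

// makeMap allocates a map of the pointed-to variable's type.
func makeMap[M ~map[K]V, K comparable, V any](m *M, c int) {
	*m = make(M, c)
}

// makeChan is a hypothetical counterpart for channels.
func makeChan[C ~chan T, T any](c *C, n int) {
	*c = make(C, n)
}

type Example struct {
	Vals map[string]int
	c    chan struct {
		v string
		r chan<- int
	}
}

func NewExample() *Example {
	var ex Example
	makeMap(&ex.Vals, 0) // M, K, and V are inferred from *map[string]int
	makeChan(&ex.c, 1)   // C and T are inferred from the field's type
	return &ex
}

func main() {
	ex := NewExample()
	ex.Vals["a"] = 1
	fmt.Println(len(ex.Vals), cap(ex.c))
}
```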