golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
proposal: Go 2: universal zero value with type inference #35966

Closed geraldss closed 7 months ago

geraldss commented 4 years ago

I propose a universal zero value with type inference. Currently nil is a zero value with type inference for pointers and built-in reference types. I propose extending this to structs and atomic types, as follows:

{} would represent a zero value when the type can be inferred, e.g. in assignments and function call sites. If I have a function:

func Foo(param SomeLongStructName)

and I wish to invoke Foo with a zero value, I currently have to write:

Foo(SomeLongStructName{})

With this proposal, I could alternatively write:

Foo({})

For assignments currently (not initializations; post-initialization updates):

myvar = SomeLongStructName{}

With this proposal:

myvar = {}

This proposal is analogous to how nil is used for pointers and reference types.

The syntax allows type names and variable types to be modified without inducing extraneous code changes. The syntax also conveys the intent "zero-value" or "default" or "reset", as opposed to the actual contents of the zero value. Thus the intent is more readable.

geraldss commented 4 years ago

Or perhaps underscore as the zero designator. Either would be readable.

jimmyfrasche commented 4 years ago

Related issues: #19642 (now closed), which proposed a universal zero value, and #12854, which would allow type names to be elided in composite literals, allowing all the examples in the first post.

beoran commented 4 years ago

Interesting idea. Sorry to bike shed, but perhaps reusing the default keyword would be more readable?

Foo(default)
myvar = default

geraldss commented 4 years ago

Foo(default, default, default)
Foo({}, {}, {})
Foo(_, _, _)

I find the latter two more readable, but default or other keyword is also fine with a syntax highlighter. As jimmyfrasche pointed out, the {} and _ syntax have been proposed previously.

Here's another argument for the proposal. These calls highlight the values that are being passed, which is good:

Foo(10, "xyz", nil)
Foo({})

This call highlights the type that is being passed, including its fully qualified name. This shifts the cognitive effort.

Foo(SomeLongStructName{})

quenbyako commented 4 years ago

I think Foo({}, {}, {}) is more readable than default, because 1) default has a few more letters (and, you know, fewer letters -> better code 🙃), and 2) creating a new reserved word is not a good idea: I know of many projects with variables named default, so it could break a lot of code. Also, the _ symbol is already used for other things, like /dev/null in the Go universe, so using _ as the empty struct definition is not a good idea, I think.

But as a concept for a language proposal, I like your idea.

ianlancetaylor commented 4 years ago

This seems to be a restatement of #19642 with a different spelling of the zero value. Given that the earlier proposal was not accepted, what has changed since then?

geraldss commented 4 years ago

It's not stated why the previous proposal was closed, and I had not seen it when I searched and filed this proposal.

I raised this proposal from direct and repeated experience. In addition to comments in this issue and the previous issues, I'll add another:

There are up to three items of information in a Go expression or assignment: name, type, and value.

  1. Names are inferred using uniform rules across all Go datatypes. That is, names are inferred in assignments, function calls, and return statements, and the inference behavior is uniform across all Go datatypes.

  2. Values are not inferred. This is also uniform.

  3. Type inference is not universal and uniform, and it's not clear to me why that is.

The function calls FooInt32(0), FooInt64(0), FooPtr(nil), FooChan(nil), FooMap(nil) will all infer the argument type correctly. Presumably Golang believes that type inference is beneficial or ergonomic. These could all require explicit typing, e.g. int32(0).

ianlancetaylor commented 4 years ago

0 is an untyped constant, as are "", true, and false, and, for that matter "abc", 100, and 1+2i. Untyped constants may be used with any compatible type. If there is no compatible type, as in a := 0, they have a default type.

nil is the zero value of pointer, slice, channel, map, and function types. It is not an untyped constant: a := nil is an error. nil is in effect an overloaded term for the zero value of certain types. This overloading is problematic; see #22729. Note that for the types with which nil can be used, there is no other way to write the zero value.
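The distinction between untyped constants and nil described above can be checked in current Go. This is a minimal sketch; the helper typeName is a hypothetical name introduced only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// typeName (hypothetical helper) reports the dynamic type of v as a string.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	// An untyped constant takes whatever compatible type the context requires.
	var i32 int32 = 0
	var f64 float64 = 0
	var d time.Duration = 0
	fmt.Println(typeName(i32), typeName(f64), typeName(d))

	// With no type context, an untyped constant falls back to its default type.
	a := 0
	b := 1 + 2i
	fmt.Println(typeName(a), typeName(b))

	// nil has no default type, so the next line would not compile:
	//   c := nil
	// In a typed context it stands for the zero value of that type.
	var p *int = nil
	var m map[string]int = nil
	fmt.Println(p == nil, m == nil)
}
```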

This proposal, and #19642, is something else again. It proposes a way of writing a value that can be converted to the zero value in a type context. Writing a := {} would be an error. But we could write F({}), a = {} (for an already defined a), a == {}, return {}. For ordered types we could write a > {}. And while {} could be used with any type, it would always be an alias for the actual zero value of that type (0, false, nil, S{}, etc.).

You could presumably write 0 == {}, which would always be true: the 0 would have no type context so it would default to int, at which point {} would default to the value 0 in type int. Maybe you could write {} == 0. I'm not sure. I'm also not sure about 1 + {} and {} + 1. Or "a" + {} and {} + "a".

So I don't agree with your suggestion that there is some missing aspect to type inference. Untyped constants, nil, and {} are three different kinds of things.

geraldss commented 4 years ago

Per your comment, untyped constants do support type inference, and overloaded nil does support type inference (issue with nil interfaces noted).

The net effect of this is that type inference is neither uniform nor universal across data types. This is the impetus for my proposal and the earlier proposals. I also like #12854, and would consider any of these a positive step.

ianlancetaylor commented 4 years ago

I think we must mean different things by "type inference". I tried to describe exactly how untyped constants and nil behave, to show that they are different from each other. I agree that if you describe both untyped constants and nil as "type inference", then "type inference" is neither uniform nor universal across data types. But I don't see how this proposal changes that fact.

geraldss commented 4 years ago

By type inference, I mean the omission of the type name in the text of the value.

Your example of a == {} is interesting. I write these all the time:

if ptr == nil
if v == 0

Would be useful to write

if v == {}

where type of v is SomeLongStruct.

This proposal says that {} is treated uniformly as the zero value in all contexts where type can be inferred / determined. That seems uniform and universal. The concept of "zero value" is already universal, i.e. defined for all types.

ianlancetaylor commented 4 years ago

OK, omitting the type in the text of the value is what I would call an implicit conversion. Untyped constants support an implicit conversion to a set of related types, and also support an implicit conversion to a default type. The value nil supports an implicit conversion to any pointer, slice, etc., type. This proposal is suggesting that the value {} support an implicit conversion to any type.

Another case where implicit conversion occurs in Go is that any type that implements an interface type may be implicitly converted to that interface type.
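As a rough illustration of the implicit conversions listed above, the sketch below shows a concrete type converted to an interface, an untyped constant converted to float64, and nil converted to a pointer type. Shape, Square, and totalArea are hypothetical names, not taken from the thread.

```go
package main

import "fmt"

// Shape is a hypothetical interface used only for this illustration.
type Shape interface {
	Area() float64
}

// Square is a hypothetical concrete type that implements Shape.
type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

// totalArea accepts interface values; callers pass concrete types directly.
func totalArea(shapes ...Shape) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

func main() {
	// Each Square is implicitly converted to Shape at the call site.
	fmt.Println(totalArea(Square{Side: 2}, Square{Side: 3}))

	// The untyped constant 2 is implicitly converted to float64.
	var side float64 = 2
	// nil is implicitly converted to the pointer type *Square.
	var p *Square = nil
	fmt.Println(side, p == nil)
}
```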

deanveloper commented 4 years ago

A better way to do this (in my opinion) would be to allow for constant struct expressions, which would hopefully include "untyped struct literals". #21130 gets close to this but isn't very specific, I might try to type up something a little more formal.

geraldss commented 4 years ago

Const-ness is orthogonal to type inference.

deanveloper commented 4 years ago

Untyped constants are not, however. What I am proposing is that we should be able to do var x MyStruct = {} just as we can do var y time.Duration = 0

jimmyfrasche commented 4 years ago

I think #12854 and #21182 would fill most of the gaps where this hurts in most code. Comparing a struct to its zero would still be a little awkward with this proposal or #12854 since you'd need to write if v == ({}) {.

Generating code or, in the future, writing generic code that uses zero values is still going to be awkward, as you don't know which form the zero value takes, though #21182 would knock out the most painful case.

You can always do var zero T but that gets a little awkward if you need zeros for more than one type in the same scope. You can avoid naming the zeros and use the expression *new(T) but that's a bit weird looking, especially since new isn't used that much.
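Both spellings mentioned above can be sketched as follows. This assumes the type-parameter syntax that later shipped in Go 1.18, which differs from the draft syntax used elsewhere in this thread; zeroVar, zeroNew, and point are hypothetical names.

```go
package main

import "fmt"

// zeroVar returns the zero value of T using a named variable.
func zeroVar[T any]() T {
	var zero T
	return zero
}

// zeroNew returns the zero value of T without naming a variable.
func zeroNew[T any]() T {
	return *new(T)
}

// point is a hypothetical struct type used for the demonstration.
type point struct{ X, Y int }

func main() {
	fmt.Println(zeroVar[int](), zeroNew[string]() == "", zeroNew[point]())
}
```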

In most cases, you could probably get away with generalizing and having the user pass in a value, zero or not: for example, writing Filter(type T)(s []T, v T) []T instead of RemoveZeros(type T)(s []T) []T.
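One possible reading of that generalization is sketched below, again assuming Go 1.18 type-parameter syntax rather than the draft syntax in the comment. The function bodies are illustrative guesses; only the two signatures come from the discussion.

```go
package main

import "fmt"

// Filter returns the elements of s that are not equal to v.
// The caller supplies the value to drop, zero or not.
func Filter[T comparable](s []T, v T) []T {
	var out []T
	for _, x := range s {
		if x != v {
			out = append(out, x)
		}
	}
	return out
}

// RemoveZeros is then just Filter applied to the zero value of T.
func RemoveZeros[T comparable](s []T) []T {
	var zero T
	return Filter(s, zero)
}

func main() {
	fmt.Println(Filter([]int{1, 0, 2, 0}, 0))
	fmt.Println(RemoveZeros([]string{"a", "", "b"}))
}
```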

In generic code, comparing to zero also has a little wrinkle in that some incomparable types have a special case for comparing against zero that can't be matched in type constraints, where you can only specify comparable or not. If there were some universal zero value, then #26842 could be accepted since there would always be a way to write a value statically guaranteed to be all-bytes zero. But, if that's the only major case left and it would still be awkward to see if comparable structs are zero, maybe it would suffice to have a predeclared func zero(type T)(T) bool that worked on comparable and incomparable types alike?

geraldss commented 4 years ago

Yes, it's possible to do less. But I haven't seen any argument for why less is more in this case, or any downside to the universal zero.

jimmyfrasche commented 4 years ago

Let's consider what we can do with a specific, typed zero value, var zero T:

  1. Reset a variable to zero: v = zero
  2. Define a new variable: u := zero
  3. Send it to a channel: c <- zero
  4. If T is comparable, compare another variable to it: v == zero
  5. If T has operators, use it as an operand: v < zero or v + zero
  6. Use it in a composite literal: []T{u, zero, v}
  7. Call a method on it: zero.M()
  8. Return it from a function: return zero, err

If we had a universal zero value, then defining a new variable and calling a method are out, as a specific type is required for each. Using it as an operand isn't really a problem since any type with operators already has a concise zero value.

That leaves:

  1. Reset a variable to zero: v = zero
  2. Send it to a channel: c <- zero
  3. If T is comparable, compare another variable to it: v == zero
  4. Use it in a composite literal: []T{u, zero, v}
  5. Return it from a function: return zero, err

For the majority of these, there's only really a problem if T is composite, as they have verbose zero values. Use in a composite literal is only sometimes an issue as the types of composite literals in composite literals can be elided in a number of cases. #12854 could expand elision to all the remaining cases and allow you to write return {}, f({}) for example. This would also allow quite a bit more since you could also write c <- {k: v} or f({}, {X: 1}, {2, 3}).

For comparable T, comparison against zero would still have the issue that we could write

p := v == {}
if p { // ...

but we couldn't write

if v == {} { // ...

due to the ambiguity and we would instead have to write

if v == ({}) { // ...

All of this assumed that we knew upfront what T is. That goes away when generating code or (hopefully soon) writing generic code. Even if every type has a concise zero value, we will not necessarily know which one to use, unless the contract of the type parameter is sufficiently strict.

The most common case would be returning some zero values and an error. #21182 would allow that and also improve the readability and editability of non-generated/generic code as a bonus.

That leaves us with a different set of possible problems:

  1. Reset a variable to zero: v = zero
  2. Send it to a channel: c <- zero
  3. If T is comparable, compare another variable to it: v == zero
  4. If T has operators, use it as an operand: v < zero or v + zero
  5. Use it in a composite literal: []T{u, zero, v}

A universal zero value would be useful here, but I think the majority of these will be relatively uncommon, though I could be wrong. A good way to make a case for this proposal would be to write reasonable generic code using the latest generics draft that is very awkward without a universal zero. Finding code generators that have a lot of special cases or past/known bugs because of this would be another.

The one that seems like it would be most likely to cause problems is the split between incomparable types that are totally incomparable versus those that can be compared against nil (#26842). If there were a universal zero value, all types, comparable or not, could be compared against it regardless of the specificity of the type constraints. It would also help to avoid the ambiguity when comparing struct values to their zero. But if it's just this one case that's left over, that predeclared zero predicate would suffice.

geraldss commented 4 years ago

As a detail, I don't see any ambiguity with

if v == {}

Every binary operator requires expressions on both sides, not statement blocks.

jimmyfrasche commented 4 years ago

That's true. I was thinking about how you have to write if v == (T{}) { but you have to do that because of the T not the {}.
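For reference, this is how the parenthesized comparison looks in Go today. It is a minimal sketch; the fields of SomeLongStruct and the isZero helper are assumptions made for the example.

```go
package main

import "fmt"

// SomeLongStruct stands in for the struct type discussed in the thread;
// its fields are invented for this example.
type SomeLongStruct struct {
	ID   int
	Name string
}

// isZero reports whether v equals the zero value of SomeLongStruct.
func isZero(v SomeLongStruct) bool {
	// The parentheses are required: without them the parser would read the
	// opening brace after the type name as the start of the if block.
	if v == (SomeLongStruct{}) {
		return true
	}
	return false
}

func main() {
	fmt.Println(isZero(SomeLongStruct{}), isZero(SomeLongStruct{ID: 1}))
}
```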

earthboundkid commented 4 years ago

type T = func()

func Default(a, b T) T {
    var zero T
    if a != zero {
        return a
    }
    return b
}

This code doesn't compile because zero is a variable, not a constant, so it gives error "invalid operation: a != zero (func can only be compared to nil)". const zero T doesn't work because "const declaration cannot have type without expression". If default meant "zero value for type", you could write a != default and the code would work.

This doesn't matter much now, but in a world with generics, not being able to write (type T) IsZero(t T) bool would be a pain.

jimmyfrasche commented 4 years ago

@carlmjohnson there's also #26842. Consider type T = struct { f func() }. A universal zero value wouldn't help with that unless it was also allowed to be compared against universally. Another way to solve that problem would be to make a function like IsZero a builtin.
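Short of a builtin, a library-level IsZero can be approximated with reflection. The sketch below is not the builtin being proposed; it assumes Go 1.18 type parameters and uses reflect.Value.IsZero, and the withFunc type is a hypothetical stand-in for the struct mentioned above.

```go
package main

import (
	"fmt"
	"reflect"
)

// IsZero reports whether v is the zero value of its type T.
// Going through a pointer keeps the static type T, so a nil interface
// value is handled instead of producing an invalid reflect.Value.
func IsZero[T any](v T) bool {
	return reflect.ValueOf(&v).Elem().IsZero()
}

// withFunc is an incomparable struct: it cannot be used with ==.
type withFunc struct{ F func() }

func main() {
	var fn func()
	fmt.Println(IsZero(fn))         // a func value checked against its zero
	fmt.Println(IsZero(withFunc{})) // a struct containing a func field
	fmt.Println(IsZero([]int{}))    // an empty but non-nil slice
	fmt.Println(IsZero(0), IsZero("x"))
}
```

The cost is a runtime check through reflection rather than a compile-time comparison, which is part of why a builtin was suggested.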

ianlancetaylor commented 4 years ago

Currently the language permits writing a simple expression, without specifying a type, for the zero value of most types: 0 for numeric types, "" for string types, nil for function, pointer, interface, channel, slice, and map types, false for boolean types. The exception is structs and arrays.

This raises the possibility of, rather than inventing a generic zero value, extending nil to be usable with struct and array types. Then nil would be the zero value for any composite type, which could arguably be a simplification of the spec.

The idea here is that we could assign nil to a variable of struct or array type, which would mean to zero all the elements. And we could compare a value of struct or array type to nil, which would report whether the value were the zero value.

geraldss commented 4 years ago

Extending nil to all composites would address the convenience issues.

However, a universal zero would work for all types, and should have additional benefits for tooling, generics, etc. Also, a universal zero always compiles as an assigned value or an argument, hence callers / assigners are protected from type changes to variables they don't care about.

Either option would impose some cognitive change on golang developers, so the question is how desirable is universality / generality. As a reference point, I think the universal underscore serves golang 1.x very well.

jimmyfrasche commented 4 years ago

You could also extend nil to any type and lint against using it with a known type with a "better" zero value like numbers and strings. That would let generic/generated code use nil for the zeroes of unknown types.

geraldss commented 4 years ago

Yes, the spelling of the universal zero is mostly aesthetic, unless we want the zero for structs to coincide with partially valued structs.

griesemer commented 4 years ago

For basic types Go already has special syntax for their values: We can write numbers (incl. 0) for numeric types, strings (incl. "") for string types, true and false for boolean types. With the exception of false, the respective zero values for these types tend to be short (shorter than nil or zero) and evocative.

Thus, at least with existing Go I don't see a good reason for introducing an alternative way of writing those zero values differently. That may change when we have generics where we may want to introduce a zero value that can be written in a type-independent way.

But I do like @ianlancetaylor's idea of generalizing nil to all composite types. It will take a bit of getting used to, but as @bradfitz has pointed out (verbally, during the proposal review mtg), we already use nil as the zero value for a slice, and a slice is basically a struct with three fields (pointer to underlying array, length, and capacity). It's really a small step to generalize this and it would be nice to be able to write in the spec that nil is simply the zero value for all composite types.

This should be a backward-compatible change. In generic code we might go the extra step and permit nil as the zero value for all variables of generic type.

jimmyfrasche commented 4 years ago

Generated code exists today, will exist after generics, and can get annoying when zeroes are involved. You either need to write it in an unnatural manner or figure out which zero value to use based on type analysis, but at least the latter would be reduced to selecting from {0, "", false, nil}. Allowing nil universally would fix that and the corresponding future issues with generic code. I would trust authors to avoid using nil when there was a better candidate just because it's simpler to write and reads better. And it is easily machine-checkable so the occasional slip up would be trivial to lint without issue as generated code is not linted.

Still, just allowing nil for composites would be a very nice improvement in non-generated/non-generic code and would allow addressing #26842 so :+1: even if I think it should go further.

mdempsky commented 4 years ago

Extending nil to be assignable to more types (or even all types) sounds reasonable to me.

fogleman commented 4 years ago

0 for numeric types, "" for string types, nil for function, pointer, interface, channel, slice, and map types, false for boolean types. The exception is structs and arrays.

The fact that these all vary depending on the underlying type suggests to me that structs and arrays deserve their own different zero expression instead of overloading the meaning of nil. Or you could double down and allow nil to be assigned to any type, setting numerics to zero, strings to the empty string, etc.

fogleman commented 4 years ago

as @bradfitz has pointed out (verbally, during the proposal review mtg), we already use nil as the zero value for a slice

Is this a good analogy? In those cases, functions like len and append operate on the nil argument as a special case, it's not really overloading the meaning of nil itself.

geraldss commented 4 years ago

One thing to consider is the ability to take the address of a zero value:

type SomeStruct struct{}

func Called0(s *SomeStruct) {}
func Caller0() { Called0(&SomeStruct{}) }

func Called1(ss **SomeStruct) {}
func Caller1() { Called1(&nil) } // compile error in golang 1.x

func CalledX(s *SomeStruct) {}
func CallerX() { CalledX(&{}) } // can this be defined to work?

leighmcculloch commented 4 years ago

The proposal lists the two problems below as the problems it solves in the issue description, but neither appears to me to be a compelling problem worth solving, and I can't find conversation here where the problems are considered in isolation from a proposal.

Are these problems worth solving?

Are there other problems that this solves that I'm missing?

To focus on each problem:

1 –

The syntax allows type names and variable types to be modified without inducing extraneous code changes.

Modifying code is something code is great at, which is why we have IDEs and tools like gopls. In the examples in this issue where the type name has been removed I find the code ambiguous. I find it very helpful that I can look at a line of code setting a variable with a struct value and see exactly what struct value is being set. I'm struggling to imagine a situation where using {} or zero would increase code clarity.

2 –

The syntax also conveys the intent "zero-value" or "default" or "reset", as opposed to the actual contents of the zero value. Thus the intent is more readable.

I don't think {} or zero suggests any better intent than MyType{}. Both clearly signal an intent to set a zero value. MyType{} cannot mean anything else other than a zero value for that type since no fields are set.

mdempsky commented 4 years ago

@fogleman

In those cases, functions like len and append operate on the nil argument as a special case, it's not really overloading the meaning of nil itself.

len and append don't special case nil. A nil slice has zero length and capacity, so len returns 0, and append has to allocate a new slice when appending a non-zero number of elements to it due to lack of capacity.
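A short check of that behavior, as a minimal sketch; appendToNil is a hypothetical helper name.

```go
package main

import "fmt"

// appendToNil appends vals to a nil slice and returns the result.
func appendToNil(vals ...int) []int {
	var s []int // nil slice: no backing array, length 0, capacity 0
	return append(s, vals...)
}

func main() {
	var s []int
	fmt.Println(s == nil, len(s), cap(s)) // true 0 0

	// append allocates because the nil slice has no spare capacity.
	s = appendToNil(1, 2, 3)
	fmt.Println(s == nil, len(s)) // false 3
}
```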

@geraldss

One thing to consider is the ability to take the address of a zero value:

We already have new for taking the address of a new zero value. I think it would be a bonus if a universal zero value can be used for composite literal expressions somehow, but it doesn't seem essential to me.
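For comparison, here is how new covers the address-of-zero cases in current Go. This is a sketch based on the earlier example; the field N and the return value of Called0 are added only so there is something to observe.

```go
package main

import "fmt"

// SomeStruct mirrors the earlier example, with a field added for illustration.
type SomeStruct struct{ N int }

// Called0 takes a pointer, as in the earlier example.
func Called0(s *SomeStruct) int { return s.N }

func main() {
	// Both arguments are pointers to a fresh zero-valued SomeStruct.
	fmt.Println(Called0(&SomeStruct{}), Called0(new(SomeStruct)))

	// new also covers the case where &nil is not allowed:
	// pp is a **SomeStruct pointing at a nil *SomeStruct.
	pp := new(*SomeStruct)
	fmt.Println(*pp == nil)
}
```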

fogleman commented 4 years ago

len and append don't special case nil. A nil slice has zero length and capacity, so len returns 0, and append has to allocate a new slice when appending a non-zero number of elements to it due to lack of capacity.

You're right, my mistake!

Can we learn something from my mistake? (Hopefully I haven't lost all credibility!)

Slices are certainly more opaque than plain structs. As a regular Go user not working on internals, I didn't fully appreciate the fact that x = nil when x is a slice is actually writing zeros to the len and cap fields of what is essentially a struct. (I may have been more likely to see it this way with the simple declaration var x []foo)

adg commented 4 years ago

@ianlancetaylor wrote:

The idea here is that we could assign nil to a variable of struct or array type, which would mean to zero all the elements. And we could compare a value of struct or array type to nil, which would report whether the value were the zero value.

The other place where this would be useful, and probably most useful in my opinion, is in return values. It is common to have a function that returns someStruct and an error, and in all the error cases you must write:

  return someStruct{}, err

We often work around this using named return values, but that is fraught with concerns about variable shadowing.

The proposal to use nil as the zero value for structs would yield this instead:

  return nil, err

which is undoubtedly less verbose and annoying than typing the struct literal.
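The two forms available in Go today, the explicit struct literal and the named-result workaround, can be sketched like this. All identifiers here (someStruct's field, errMissing, findLiteral, findNamed) are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// someStruct stands in for the return type discussed above.
type someStruct struct {
	Name string
}

var errMissing = errors.New("missing")

// findLiteral spells out the zero value on the error path.
func findLiteral(name string) (someStruct, error) {
	if name == "" {
		return someStruct{}, errMissing
	}
	return someStruct{Name: name}, nil
}

// findNamed uses named results, which start out as zero values.
func findNamed(name string) (s someStruct, err error) {
	if name == "" {
		err = errMissing
		return // s is still the zero value here
	}
	s.Name = name
	return s, nil
}

func main() {
	fmt.Println(findLiteral(""))
	fmt.Println(findNamed("gopher"))
}
```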

--

As convenient as this seems, I don't much like it. My rationale:

We use nil as the zero value for types that are references: pointers, channels, maps, slices. We also use nil as the zero value for interface types, which have a reference-like flavor to them (they are still boxes that contain values, and nil means an empty box).

Using nil as a zero value for structs and arrays moves its meaning further away from the "null reference" meaning that it has today. Reading return nil makes me think I'm returning something that shouldn't be used, but often returning a zero valued struct is a perfectly reasonable and safe thing to do.

I would prefer that rather than overloading the meaning of nil we chose some other word, like zero. I acknowledge that as English words they are synonyms, but in the context of Go we give them additional meaning.

alanfo commented 4 years ago

Firstly, I see no real need for having a universal zero value for non-generic code so I am not in favor of this proposal as it stands.

However, I do think it would be a good idea if there were a 'short' default value for structs and arrays which could be used instead of the present verbose syntax, though I agree with @adg that (whilst convenient in some ways) nil would not be a good choice.

I think it is too embedded in people's thinking that nil is the default value solely for reference types and it has been argued elsewhere (see for example #22729) that it already has too many uses and should be split up into several different built-ins, though that's not an argument I agree with on practical grounds.

Also (to me at least) it would seem anomalous if nil were the default value for both a struct and a pointer to a struct.

So I think we need something different and, to derive some benefit from the present proposal, why not just use {} as the default for structs and arrays which is short, natural enough, doesn't need a new built-in and would be backwards compatible?

As far as generics are concerned, I think we really do need a universal zero value for cases where a unique zero value cannot be deduced from the contract (if any). Although it would work, *new(T) is rather ugly and the other alternatives mentioned in the draft design document (including as already discussed nil) all have drawbacks.

Again, I think it would be best if something were used which was unique to generics and my personal favorite would be to overload the default keyword either on its own or perhaps in the form default(T) as is done in C#. I don't think it matters much that this is quite long as most people won't be writing much generic code anyway.

earthboundkid commented 4 years ago

Using nil as the name of the zero value for structs will lead to bugs. If I write func() (t Thing, err error) (oops, should have used a pointer type), the compiler catches my mistake when I write return nil, err. You can multiply examples, but the general idea is that I personally (and I assume others) will often mistakenly write something to take concrete structs when it should really take a pointer to a struct, and I'm glad that the compiler helps me catch my mistake.

I don't see the point of adding a zero that's not universal. Generics will need a universal zero, so you can write:

func Default(type T) (a, b T) T {
    if a != zero { return a }
    return b
}

You can't write that without a universal zero, and adding nil as struct zero doesn't really help, since it won't work for numbers and strings.

earthboundkid commented 4 years ago

I don't know the issue number (edit: #22729 in the discussion above), but this is going in the opposite direction of the proposal to make nil interface less confusing by renaming it to something else (emptyinterface or whatever). I think the vast number of pixels wasted on the difference between nil interface and nil pointer should show that further overloading the name nil is probably a bad idea.

I guess the only way it would be a good idea would be to go the whole way, use nil as the universal name for zero, and then add new names for particular zeros, like null for pointers, empty for interface, etc.

eandre commented 4 years ago

I have noticed that marshaling as JSON a nil slice results in null whereas an empty slice results in []. How does that distinction get recognized if nil is just zeroing the slice struct? How does that relate to Ian's proposal?

earthboundkid commented 4 years ago

@eandre, that's not really related. null for nil slice is a choice the JSON encoder makes. See https://github.com/golang/go/issues/37711 for a proposed change.
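The encoder behavior in question is easy to reproduce with encoding/json. This is a minimal sketch; encode is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode marshals v and returns the JSON text, panicking on error
// to keep the sketch short.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	var nilSlice []int
	emptySlice := []int{}
	fmt.Println(encode(nilSlice))   // null
	fmt.Println(encode(emptySlice)) // []
}
```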

eandre commented 4 years ago

@carlmjohnson yes I understand that, but I've had a clear mental model of the distinction between a nil slice and an empty one. If nil were to refer to the zero value for composite types, how is a nil slice different from []Typ{}? The proposal seems to make nil and T{} equivalent for composite types, but if we let T = []Typ what does that mean?

deanveloper commented 4 years ago

I'm generally in support of this proposal, I think. However, I speculate that this may increase confusion with interface{} == nil related issues.

ianlancetaylor commented 4 years ago

@eandre I don't think any proposal changes anything with regard to the distinction between []Typ{} and nil. nil would remain the zero value for a slice. []Typ{} is not the zero value for a slice. It's true that if Typ is a struct, that Typ{} is the zero value for a struct, but I think that just means that structs are different from slices. Maybe I missed it, but I don't think anybody is saying that Typ{} should always be the zero value for Typ. I agree that if someone is saying that, that it won't work.
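The difference between the two cases can be seen in current Go. This is a sketch; the field on Typ and the isNilSlice helper are assumptions made for the example.

```go
package main

import "fmt"

// Typ stands in for the element type from the question; its field is invented.
type Typ struct{ N int }

// isNilSlice reports whether s is the zero value of its slice type.
func isNilSlice(s []Typ) bool { return s == nil }

func main() {
	// For slices, the zero value is nil; []Typ{} is a different, non-nil value.
	var zeroSlice []Typ
	fmt.Println(isNilSlice(zeroSlice), isNilSlice([]Typ{}))

	// For structs, Typ{} with no fields set is the zero value.
	var zeroStruct Typ
	fmt.Println(zeroStruct == Typ{})
}
```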

nemith commented 4 years ago

I don't see the point of adding a zero that's not universal. Generics will need a universal zero

You can just define a new variable which will default to the zero value of that type and use it for comparison as done here https://go-review.googlesource.com/c/go/+/187317/13/src/cmd/go2go/testdata/go2path/src/slices/slices.go2#81

func Default(type T) (a, b T) T {
    var zero T
    if a != zero { return a }
    return b
}

However having a universal untyped zero value may be clearer.

earthboundkid commented 4 years ago

@nemith, that doesn’t work for func(), which can only be compared to a constant nil and can’t be declared as const.

mdempsky commented 4 years ago

@carlmjohnson Also slice and map types.
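With the type-parameter syntax that later shipped in Go 1.18 (an assumption relative to the draft syntax used in this thread), the var zero T version needs a comparable constraint, which func, slice, and map types do not satisfy. A sketch of the resulting split, with DefaultFunc as a hypothetical hand-written variant:

```go
package main

import "fmt"

// Default returns a unless it is the zero value of T, in which case it returns b.
// The comparable constraint excludes func, slice, and map types.
func Default[T comparable](a, b T) T {
	var zero T
	if a != zero {
		return a
	}
	return b
}

// DefaultFunc is the separate variant needed for func(): funcs can only be
// compared to nil, so the generic version above cannot be instantiated for them.
func DefaultFunc(a, b func()) func() {
	if a != nil {
		return a
	}
	return b
}

func main() {
	fmt.Println(Default(0, 5), Default("x", "y"))
	DefaultFunc(nil, func() { fmt.Println("fallback") })()
}
```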

earthboundkid commented 4 years ago

Yes, #26842 is very relevant. Some otherwise incomparable types are zero-comparable:

  • incomparable, there is no == operator defined for a type (structs and arrays with incomparable fields).
  • 0-comparable, there is an == operator but it can only test against the zero value (funcs, maps, and slices can be compared against nil). The spec treats these as incomparable types and notes the special case.

earthboundkid commented 4 years ago

https://github.com/golang/go/issues/22729: add kind-specific nil predeclared identifier constants
https://github.com/golang/go/issues/26842: always permit comparisons against zero value of type

Here is my attempt at a summary of the problems and possible solution paths.

Problems:

  1. Users are often confused by nil pointer vs. nil interface.
  2. There is no "universal zero value" suitable for a text macro or simple lexical substitution in return statements.
  3. There are two sub-kinds of incomparable types: the truly incomparable and the types comparable to nil (func, map, slice). This is not per se a problem, but…
  4. If Go gets generics, it would be impossible to write a Default(type T)(a, b T) T function that works with func/map/slice and strings/numbers/structs.

Possible solutions:

beoran commented 4 years ago

A similar idea to adding new keywords, but one that stays backwards compatible, would be to add a new built-in function zero(), which returns the "universal zero" for the variable it is assigned to. Built-in functions are the most Go-ish approach to this problem.