golang / go


spec: add untyped builtin zero #61372

Closed rsc closed 11 months ago

rsc commented 1 year ago

I propose to add a new predeclared identifier zero that is an untyped zero value. While nil is an untyped zero value restricted to chan/func/interface/map/slice/pointer types, zero would be an untyped zero value with no such restrictions.

The specific rules for zero mimic nil with fewer restrictions:

That's it. That's all the rules.

Note that assignability includes function arguments and return values: f(zero) and return zero, err are valid.

See CL 509995 for exact spec changes.


This proposal addresses at least three important needs:

  1. Referring to a zero value in generic code. Today people suggest *new(T), which I find embarrassingly clunky to explain to new users. This comes up fairly often, and we need something cleaner.

  2. Comparing to a zero value in generic code, even for non-comparable type parameters. This comes up less often, but it did just come up in cmp.Or (#60204).

  3. Shortening error returns: return zero, err is nicer than return time.Time{}, err.

More generally, the zero value is an important concept in Go that some types currently have no name for. Now they would: zero.

Because zero is not valid anywhere 0, "", or nil are valid, there will be no confusion about which to use.
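For context, here is a minimal sketch of how the cases listed above are typically spelled in Go today, without the proposed zero. The names Zero, First, and ParseStamp are hypothetical helpers invented for this illustration; they are not part of the proposal or of any library.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Zero returns the zero value of T using the *new(T) spelling.
func Zero[T any]() T {
	return *new(T)
}

// First returns the first element of s, or the zero value of T and an
// error when s is empty. The local "var zero T" is a common workaround.
func First[T any](s []T) (T, error) {
	if len(s) == 0 {
		var zero T
		return zero, errors.New("empty slice")
	}
	return s[0], nil
}

// ParseStamp shows the non-generic error return that the proposal would
// let you write as "return zero, err".
func ParseStamp(s string) (time.Time, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return time.Time{}, err
	}
	return t, nil
}

func main() {
	fmt.Println(Zero[int](), Zero[string]() == "")
	_, err := First([]float64{})
	fmt.Println(err)
	_, err = ParseStamp("not a time")
	fmt.Println(err != nil)
}
```

All three spellings work, but each one needs either the type name or an extra declaration, which is the friction the proposal is aimed at.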


I'm not claiming any originality in this proposal. Others have certainly suggested variants in the past, in quite long discussions. I'm not aware of any precise statement of the exact rules above, but I won't be surprised if one exists.

A brief comparison with earlier proposals:

thepudds commented 1 year ago

Hi all, this conversation has moved fast and at some points it has started to loop a bit.

I would encourage people to read prior comments before posting. That isn't always easy, partly because of the 'hidden items' GitHub wormhole created by the many comments on this issue.

Two good starting points are this comment above from Russ from July 15 and this one from July 16. In both cases, Russ gave a set of replies to various things raised in the initial round of feedback.

Another good starting point is that you can defeat the GitHub 'hidden items' wormhole by doing a Ctrl-F on this page for hidden items, then clicking Load more... (which you currently need to repeat ~2 times to see all the comments), and when that is done, doing another Ctrl-F search for rsc commented to see all the replies from Russ. (Because most of those are replies to concerns or feedback, the comments by Russ will help orient you to many of the concerns already raised by others, and of course let you see the responses from the proposal author.)

DeedleFake commented 1 year ago

As an alternative to the backwards incompatible #60786, what if zero got a new feature: Comparison to zero of a pointer-ish type is true if the pointer is nil or if the pointer points to a zero value, maybe recursively. In other words, new(T) == zero would be true. This would also, therefore, allow defining comparison of an interface to be true if it has a non-nil type but a zero value, so things like the following would work:

func F() *int {
  return nil
}

func main() {
  fmt.Println(any(F()) == nil) // Still false.
  fmt.Println(any(F()) == zero) // Would be true.
}

Edit: Upon further thought, this would break one of the intended use cases of being able to compare generics to zero. Darn. Never mind.

Edit 2: As long as I'm at it, an extension of the above rules to slice and maps would mean that len(s) == 0 and s == zero would be equivalent.

gregwebs commented 1 year ago

It is impossible to actually have a discussion on GitHub issues. (We can't possibly read all the duplicated comments, which leads to more duplicated comments.) Can the discussion be moved to GitHub Discussions?

gnojus commented 1 year ago

the zero value is an important concept in Go that programs currently have no name for

I like that the zero value in Go today is more of a concept, and that Go programs have no name for it. We all know that the zero value is 0, "", false, etc. If we add zero, now we can refer to the same things in two ways. Not only would this probably increase bikeshedding across different coding styles (0 vs zero, etc.) and require extra thought when writing Go code, it may also confuse users (especially those new to Go): they may think that 0 and zero are different things.

DmitriyMV commented 1 year ago

If we add zero, now we can refer to the same things in two ways.

We don't create slices like slc := []string(nil), even though it is possible. I don't think much will change here either.

gnojus commented 1 year ago

We don't create slices like slc := []string(nil), while it is possible.

Sometimes we do. However, I think most people would agree that var s []string is usually the right choice, especially if we are declaring only one variable.
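As a small illustration of the two spellings being compared, the sketch below declares a nil slice both ways and checks that the results are indistinguishable. NilSlices is a hypothetical name used only for this example.

```go
package main

import "fmt"

// NilSlices returns a nil []string built with each of the two spellings.
func NilSlices() (a, b []string) {
	var s []string     // the spelling most code uses
	t := []string(nil) // legal, but rarely written
	return s, t
}

func main() {
	a, b := NilSlices()
	fmt.Println(a == nil, b == nil, len(a), len(b))
}
```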

However, at least for me, with zero it's much harder to decide when to use what. For example:

  f(false) vs f(zero)
  f(x, false, "") vs f(x, zero, zero)
  if s == "" { vs if s == zero {
  MyStruct{f: false} vs MyStruct{f: zero}
  f(MyStruct{}) vs f(zero)
  return 0, nil vs return zero, zero

thepudds commented 1 year ago

Hi @gnojus

However, at least for me, with zero it's much harder to decide when to use what.

As I understand it, if this proposal is accepted then there will be clear statements about what is considered idiomatic. Cherry picking one of your examples:

return 0, nil vs return zero, zero

Russ wrote in the opening comment:

As far as idiomatic discussion and naming, I expect that zero will only be used for these general uses and will not displace nil as a more specific kind of zero value. In particular, we will keep using terms like nil pointer and nil interface; we will not switch to saying zero pointer, zero interface, and so on.

If that ends up being what is considered idiomatic, then you would not replace return nil with return zero. In addition, very likely some of the common linters like staticcheck or golangci-lint would help in pointing out non-idiomatic code in most common cases. And if either staticcheck or maybe less likely vet start doing that, then it means it will be pointed out usually while you type if you are using one of the various editors using gopls (which I think is the majority of gophers at this point).

DmitriyMV commented 1 year ago

@gnojus

You can already write if len(s) == 0 instead of if s == "" and MyStruct{} instead of MyStruct{f: false}. It's all about the intent - your code shows what you wanted to convey, not just a syntactically correct set of instructions for the compiler.
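To make the point concrete, here is a short sketch of the equivalent spellings that already exist today. MyStruct mirrors the single-field struct from the examples above; emptyByCompare and emptyByLen are hypothetical names chosen for this illustration.

```go
package main

import "fmt"

// MyStruct mirrors the single-field struct used in the examples above.
type MyStruct struct {
	f bool
}

// emptyByCompare and emptyByLen express the same check in two ways.
func emptyByCompare(s string) bool { return s == "" }
func emptyByLen(s string) bool     { return len(s) == 0 }

func main() {
	for _, s := range []string{"", "go"} {
		fmt.Println(emptyByCompare(s) == emptyByLen(s))
	}
	// Both literals denote the same value; only the stated intent differs.
	fmt.Println(MyStruct{} == MyStruct{f: false})
}
```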

bokunodev commented 1 year ago

The new zero will hurt readability, in my opinion. The examples above are all pretty short. Think about large functions: you have the function signature at the top and x == zero, var x T = zero, or return zero at the bottom. So what is zero at that line? Do you scroll back to the top? What about shadowing?

If most people are in favor of it because zero is shorter, should we discourage the use of MyType{}? How much typing does it save you? What if MyType contains a non-comparable type (func)? return zero, T{}, zero, X{}? It introduces inconsistency.

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

hherman1 commented 1 year ago

I’ve read portions of this discussion, but it is quite long, so there’s a good chance my comment has already been made. Sorry, if so.

I’m very fond of the name zero, and it gives me a bit of regret that nil exists at all, despite the mistake-catching benefits.

My main concern is that there is a lot of overlap between where zero can be used and where nil can be used, and I think people will use them inconsistently.

One way we might avoid this is with a new constraint on zero: it can only be used on non-pointer types and type parameters. So it is invalid to say var x map[a]b; x = zero

Then zero and nil are totally distinguished in their intended usages.

Put differently, zero may be used only in places where nil may not.

If this is not acceptable, I hope we come up with some way of discouraging inconsistent use of zero and nil.

Merovius commented 1 year ago

Two good entry points into the discussion are this comment from July 15 and that comment from July 16. In both cases, Russ gave a set of replies to various things raised in the initial round of feedback. In fact, @rsc might want to edit the top-post to include a section at the end with some of these points - or even just links to those comments - for easier reference for new people.

@hherman1 your suggestion indeed was made and Russ addresses it in the last bullet point of the latter comment.

rsc commented 1 year ago

I tried writing rules where zero is only allowed for things that don't have a literal zero already - arrays, structs, and type parameters that can be arrays or structs (including type parameter 'any').

The diffs for that version are at https://go-review.googlesource.com/c/go/+/509995/6/doc/go_spec.html.

earthboundkid commented 1 year ago

An advantage of doing it that way is you could start with allowing zero only in limited circumstances and loosen it later if it seems like it’s not a problem in practice or if it turns out not being able to “degenericize” by copy-pasting is a pain.

Lercher commented 1 year ago

As a native German speaker, I find it a bit strange to read return zero, err as an English sentence (generously overlooking the comma), roughly meaning „the count of errors returned is zero“ when the statement idiomatically states the opposite. nil or _ IMO represents more the concept of „I have nothing to return“ and is less 0-ish.

earthboundkid commented 1 year ago

“Return nil err” has the same natural language problem as “return zero err”.

rsc commented 1 year ago

I can't see return zero, err as any stranger than return 0, err.

geraldss commented 1 year ago

I tried writing rules where zero is only allowed for things that don't have a literal zero already - arrays, structs, and type parameters that can be arrays or structs (including type parameter 'any').

The diffs for that version are at https://go-review.googlesource.com/c/go/+/509995/6/doc/go_spec.html.

Universality is simplicity, so I prefer universal zero and would reiterate your initial spec in the spirit of my proposal: https://github.com/golang/go/issues/35966

adnang commented 1 year ago

C# also implemented this a few major versions ago via the default keyword - default could be more indicative of an empty struct/reference than zero since zero is semantically a noun for a value of numeric types

DeedleFake commented 1 year ago

default is already a keyword in Go, not a predeclared identifier. Changing it to work like this would be backwards compatible, but it would make default work differently from every other thing in the language as no other keywords can be used as values like that.

willfaught commented 1 year ago

@willfaught replied about my "mistakes" being valid programs if we make nil be a universal zero, which is exactly my point. Being valid programs does not preclude them from being "mistakes" (i.e., not what the user intended).

@rsc Let's look at those mistakes in detail:

Using zero instead of nil will not allow us to diagnose these "problems" because zero could simply be used in its stead, with all the same "problems" you pointed out.

Google once had a significant data loss (covered by our redundant systems) because someone wrote something like (abstracting a bit) 'if(deleteEverything)' instead of 'if(*deleteEverything)' in a C++ program, where deleteEverything was a pointer-to-bool and the pointer was non-NULL but false, while the code interpreted it as true. "If you don't want the zero value for int then don't write nil" amounts to "don't make mistakes". Mistakes happen (at least in my experience), and type systems are useful for finding them. Expanding nil to scalar types would stop catching real mistakes, because they would no longer be type errors. (Thanks to @zephyrtronium for replying to that effect as well.)

I think I already addressed this argument by pointing out that Go is not weakly typed. This class of problem was likely to happen in C because integers and pointers are truthy in C. C has a bad type system. Go does not. Let's not let our Go design decisions be guided by flashbacks to language features that aren't in Go.

(Edited)

rsc commented 1 year ago

@willfaught, I don't see how it is a flashback. I was pointing out a problem that happened with confusing whether a value was a scalar or a pointer. This absolutely happens. When it happens in Go programs today, the type system catches it. I might have a value f.foo that I think is a pointer but is actually an integer, and if I write 'f.foo == nil' today that's a compile error, which is helpful because it points out my confusion. If we make nil a universal zero value, then 'f.foo == nil' effectively silently rewrites to 'f.foo == 0'. Maybe that's what I meant, but probably not. I would rather the compiler tell me. Confusing scalars and pointers is a common mistake.
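A minimal sketch of the situation described above, assuming a hypothetical struct named config whose field foo is an int and whose field bar is a pointer. The commented-out line is the comparison that the compiler rejects today.

```go
package main

import "fmt"

// config is a hypothetical type: foo might be mistaken for a pointer,
// but it is a plain int. bar really is a pointer.
type config struct {
	foo int
	bar *int
}

func describe(c config) string {
	// if c.foo == nil { ... } // rejected today: foo is an int, not a pointer
	if c.foo == 0 {
		return "foo is zero"
	}
	if c.bar == nil {
		return "bar is nil"
	}
	return "both set"
}

func main() {
	n := 1
	fmt.Println(describe(config{}))
	fmt.Println(describe(config{foo: 1}))
	fmt.Println(describe(config{foo: 1, bar: &n}))
}
```

With a universal nil, the commented-out comparison would compile and behave like c.foo == 0, so the confusion would go unreported.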

Another reason not to go down this road of universal nil is that many programmers come from languages without pointer types at all, in which every value can be nil/null/None/etc. in addition to its actual values. If you are used to writing in that language you could well be additionally confused.

Ultimately we can disagree on this, but I believe all the Go language designers feel strongly about not making it easier to confuse pointers and scalars. A universal zero named nil is off the table.

rsc commented 1 year ago

To recap, 'zero' is no longer a universal zero for all types. Instead, 'zero' is a zero for types without some other short way to spell zero:

In practice this means zero can be used with arrays, structs, and type parameters that include arrays or structs or combinations of other types that have different zeros (for example T interface{string|int} can use zero with type T).
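The mixed-constraint case can be sketched in today's Go as follows. StringOrInt names the constraint from the example above, and IsUnset is a hypothetical helper; the local variable stands in for what the proposal would spell as zero.

```go
package main

import "fmt"

// StringOrInt is the constraint from the example above, given a name
// here for illustration.
type StringOrInt interface{ string | int }

// IsUnset reports whether v is the zero value of T.
func IsUnset[T StringOrInt](v T) bool {
	// Neither v == 0 nor v == "" compiles here: each constant is valid
	// for only one of the two types in the constraint.
	var zero T
	return v == zero
}

func main() {
	fmt.Println(IsUnset(0), IsUnset(""), IsUnset(3), IsUnset("go"))
}
```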

There is no longer any question about which zero form to prefer between 0, "", nil, and zero, because only one of those is allowed in any given context.

I believe this confusion about when not to use zero was the main objection. Now the answer is: use it when you can, don't use it when you can't.

Are there any remaining objections that I've missed? Thanks.

willfaught commented 1 year ago

@willfaught, I don't see how it is a flashback. I was pointing out a problem that happened with confusing whether a value was a scalar or a pointer. This absolutely happens. When it happens in Go programs today, the type system catches it. I might have a value f.foo that I think is a pointer but is actually an integer, and if I write 'f.foo == nil' today that's a compile error, which is helpful because it points out my confusion. If we make nil a universal zero value, then 'f.foo == nil' effectively silently rewrites to 'f.foo == 0'. Maybe that's what I meant, but probably not. I would rather the compiler tell me. Confusing scalars and pointers is a common mistake.

@rsc I think I already addressed this argument with this point:

This same argument could be made against how "x == nil" and "x() == nil" work currently, where x is func() error.
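For readers unfamiliar with the pair of comparisons mentioned here, a small sketch: both forms compile when x has type func() error, and they test different things. The helper name check is hypothetical.

```go
package main

import "fmt"

// check reports whether the func value itself is nil and, if it is not,
// whether the error it returns is nil.
func check(x func() error) (funcIsNil, resultIsNil bool) {
	// Compares the func value itself against nil.
	funcIsNil = x == nil
	if !funcIsNil {
		// Calls x and compares the returned error against nil.
		resultIsNil = x() == nil
	}
	return funcIsNil, resultIsNil
}

func main() {
	fmt.Println(check(nil))
	fmt.Println(check(func() error { return nil }))
}
```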

Another reason not to go down this road of universal nil is that many programmers come from languages without pointer types at all, in which every value can be nil/null/None/etc. in addition to its actual values. If you are used to writing in that language you could well be additionally confused.

I would argue that people should just learn the language. There is no null in the JavaScript sense in Go. Education is the answer here. If we bend to trends in other languages, we will just wind up with those languages.

To recap, 'zero' is no longer a universal zero for all types. Instead, 'zero' is a zero for types without some other short way to spell zero:

At first I thought you meant that the proposal had been changed to this a while ago, but I think you meant that you're going to make that change. That does resolve my concern about generality. 👍

ydnar commented 1 year ago

Perhaps it’s worth giving #12854 another look?

They also simplify code that returns a zero-valued struct and an error:

return time.Time{}, err
return {}, err // untyped composite literal

Edit: by this I mean using {} as the zero-value for structs and arrays, which could potentially be relaxed into something like the proposal in #12854.

jimmyfrasche commented 1 year ago

Always allowing zero makes generated code using zero values simpler. The first time I ran into this was with generated code long before generics. It would be a shame if code generators still needed to use var zero1 X to avoid having to figure out which zero to write for X. If generated code wants to know if something is nonzero it would be much simpler to write == zero than to have to generate different code for comparable/incomparable types.

If {} is allowed for arrays/structs in the future zero would still need to be allowed for backwards compat even though there's now a "better" zero value.

I certainly agree that you should not use zero when there's a better option, but it's very simple for a linter to enforce this.
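As a point of comparison, one way to test for the zero value of a possibly incomparable type today is the reflect package. This is a sketch of an existing workaround, not something the proposal specifies; IsZero and withFunc are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
)

// IsZero reports whether v is the zero value of T, even when T is not
// comparable (for example a struct with a func field).
func IsZero[T any](v T) bool {
	// Going through &v keeps the static type T, so an interface-typed
	// T holding nil is handled without a panic.
	return reflect.ValueOf(&v).Elem().IsZero()
}

// withFunc cannot be compared with == because of its func field.
type withFunc struct {
	name string
	fn   func()
}

func main() {
	fmt.Println(IsZero(0), IsZero("x"))
	fmt.Println(IsZero(withFunc{}), IsZero(withFunc{fn: func() {}}))
}
```

The reflection call works for any type, but it is a runtime check with its own cost, whereas == zero as proposed would be ordinary generated code.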

earthboundkid commented 1 year ago

@willfaught I understand that you want an expanded nil, but this issue is about adding zero. Can you think of reasons to avoid adding zero besides that doing something else might be better than doing that? My main objections to zero are (as stated above behind the LOAD MORE black hole) that it’s a new predeclared identifier and that it can be used in confusing ways where nil is more appropriate (but the changed rules by Russ would prevent that). Any more objections we should know?

willfaught commented 1 year ago

@carlmjohnson Perhaps it wasn't clear, but

That does resolve my concern about generality. 👍

meant that I have no further objections.

rsc commented 1 year ago

@jimmyfrasche What is an example of a generator that is writing out code that it doesn't know the type of?

jimmyfrasche commented 1 year ago

It was a generator similar to stringer in that it added boilerplate methods to a type specified on the command line, but the methods could fail so the generated code had

var zero T
// ...
if err != nil {
  return zero, err
}

If zero were allowed anywhere that wouldn't need the declaration.

I do not recall personally writing code generators that could have made use of the other properties of zero but it's simple to extrapolate from the generic use cases to the generated code use case for all of them.

geraldss commented 1 year ago

If zero will not be universal, I agree with @ydnar that {} is a more intuitive spelling of the zero value for structs and arrays. It's already used in empty construction.

Merovius commented 1 year ago

I think {} is not an intuitive spelling for the zero value of a type parameter though. It also requires parentheses when used in a conditional, to disambiguate parsing, i.e. if x == ({}) { /* do thing */ }. That seems very unfortunate.

geraldss commented 1 year ago

I think {} is not an intuitive spelling for the zero value of a type parameter though. It also requires parentheses when used in a conditional, to disambiguate parsing, i.e. if x == ({}) { /* do thing */ }. That seems very unfortunate.

Fair points. I would vote for universal zero and allow idioms to evolve. Some overlap is not disqualifying. There's some overlap between any and interface{}, between if-else and switch, etc.

gazerro commented 1 year ago

@Merovius I think there's no ambiguity in parsing if x == {, if {} were a valid value, as the next character could only be } to make the code valid. The ambiguity arises currently with if x == T{} {} that must be written as if x == (T{}) {}.
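The existing ambiguity with typed composite literals can be seen in a short runnable sketch. T here is a hypothetical struct type and isZeroT a hypothetical helper.

```go
package main

import "fmt"

// T is a hypothetical struct type used only for this illustration.
type T struct{ n int }

func isZeroT(x T) bool {
	// "if x == T{} {" does not parse: the brace after T is read as the
	// start of the if block. Parentheses around the literal fix that.
	if x == (T{}) {
		return true
	}
	return false
}

func main() {
	fmt.Println(isZeroT(T{}), isZeroT(T{n: 1}))
}
```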

Merovius commented 1 year ago

@gazerro I think you are correct, yes. It does appear unambiguous, but it's definitely a strange piece of grammar. And I'm not sure the same can be said everywhere. In every place of the grammar that currently has a PrimaryExpr, we'd need to accept a {} as well, or it would have to be parenthesized. But I concede that I don't have a clear example off the top of my head.

gazerro commented 1 year ago

@Merovius I agree with you. Taken by itself, {} seems the most obvious solution to represent the zero value of a struct, but in the context of if x == {} { the code appears rather strange to me.

jarrodhroberson commented 1 year ago

zero already means one very important thing semantically: a quantity of nothing. "" as a "zero value" for string has always pssed the #ActuallyAutistic me off.

But I get it, a shorter name for "uninitialized default value" is desperately needed, but calling it zero is not the right one. UDV or any other original Gopher source manufactured term would be better than making zero a keyword and having it not actually represent a numerical count of nothing.

Go so wants to NOT have the concept of null even though it absolutely does, and just calls it nil, and only when referring to pointer types. It should never have allowed uninitialized values, period.

Personally I NEVER use uninitialized primitives with their default values, and I only use nil because it is idiomatic and doing something else in the idiomatic cases would violate the Principle of Least Astonishment.

geraldss commented 1 year ago

@rsc: Do you envision this working:

type Node[T any] struct {
        val T
}

func (n *Node[T]) Reset() {
        n.val = zero
}

gophun commented 1 year ago

@jarrodhroberson

But I get it, a shorter name for "uninitialized default value" is desperately needed, but calling it zero is not the right one.

"Zero value" has been an established term in Go land for 14 years now. That's what it's called in the spec and in all the documentation written during those years. Every Go programmer knows the term, so calling it zero would be absolutely the right thing to do.

earthboundkid commented 1 year ago

should never have allowed uninitialized values

Go doesn't have uninitialized values in the sense that C does.

Anyway, we can't change how that works without doing Go 2, which is out of scope here.
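To illustrate the point about initialization, the sketch below declares values of several types without assigning anything and shows that each one holds a well-defined zero value. The names point and declared are hypothetical.

```go
package main

import "fmt"

// point is a hypothetical struct type used only for this illustration.
type point struct{ x, y int }

// declared never assigns its named results, so each one is returned
// holding the zero value of its type.
func declared() (i int, s string, b bool, p *int, m map[string]int, pt point) {
	return
}

func main() {
	i, s, b, p, m, pt := declared()
	fmt.Println(i == 0, s == "", !b, p == nil, m == nil, pt == point{})
}
```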

Merovius commented 1 year ago

@geraldss Yes. In that code, n.val has type parameter type, so assigning zero to that is pretty much one of the main points.
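For comparison, this is roughly how the Reset method from the question has to be written today, sketched with a local zero-valued variable. Under the proposal, the body would shrink to the single assignment n.val = zero.

```go
package main

import "fmt"

// Node mirrors the generic type from the question above.
type Node[T any] struct {
	val T
}

// Reset clears val using today's workaround: a local variable that is
// declared but never assigned, and so holds the zero value of T.
func (n *Node[T]) Reset() {
	var zero T
	n.val = zero
}

func main() {
	n := &Node[[]int]{val: []int{1, 2}}
	n.Reset()
	fmt.Println(n.val == nil)
}
```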

geraldss commented 1 year ago

@Merovius That example has no restriction on the type. It can be any Go type. That's the focus of my question.

Merovius commented 1 year ago

@geraldss I don't understand what that changes. On the contrary - that is even more what this proposal is about. If it was constrained on, say constraints.Integer, it might conceivably be invalid as you could use 0. But if it can be any type, you need a word for the zero value that doesn't exist yet.

geraldss commented 1 year ago

@Merovius Yes, that's my point. I want to understand if @rsc envisions zero as universal or not, and if not, I want to understand if he envisions that example working or not.

Merovius commented 1 year ago

@geraldss I answered your question. It's also answered directly in this comment by @rsc:

zero is assignable to any variable of any type T that does not already have a short zero (0, "", nil), including when T is a type parameter with constraint any.

So, again, yes, your example is intended to work. Making it work is the point.

geraldss commented 1 year ago

Alright. If zero is universal in this context, I find the other restrictions on zero to be somewhat superfluous.

fzipp commented 1 year ago

If zero is universal in this context, I find the other restrictions on zero to be somewhat superfluous.

They avoid additional linters, style guide items and discussions.

jarrodhroberson commented 1 year ago

@jarrodhroberson so calling it zero would be absolutely the right thing to do.

even if it does not represent "" or 0 or 0.0 or whatever the "zero value" of the type is?

zero is not the same thing as the term "zero value" ...

one bad decision in naming does not justify more, it could and should be called something more semantically rich and importantly CORRECT. because "" is not 0.

rsc commented 1 year ago

@jarrodhroberson

Go so wants to NOT have the concept of null even though it absolutely does, and just calls it nil

Null and nil are just two names for the same concept. Some languages use one spelling, some use the other. Go happens to use nil. We have never said we don't have that concept.

one bad decision in naming does not justify more, it could and should be called something more semantically rich and importantly CORRECT. because "" is not 0.

Thanks, you've made your point. Many of us respectfully disagree.

atdiar commented 1 year ago

@jarrodhroberson If it helps, the way to see it is that in Go, unassigned variables of some given types are auto-initialized. Some aren't e.g. maps, pointers and slices.

The default value of any variable is called the zero value.

You can see it as the "zero-assignment" value for a variable.

zero supersedes nil as it doesn't care about the actual type, initialization state or anything.