golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

spec: add untyped builtin zero #61372

Closed rsc closed 9 months ago

rsc commented 1 year ago

I propose to add a new predeclared identifier zero that is an untyped zero value. While nil is an untyped zero value restricted to chan/func/interface/map/slice/pointer types, zero would be an untyped zero value with no such restrictions.

The specific rules for zero mimic nil with fewer restrictions:

That's it. That's all the rules.

Note that assignability includes function arguments and return values: f(zero) and return zero, err are valid.

See CL 509995 for exact spec changes.


This proposal addresses at least three important needs:

  1. Referring to a zero value in generic code. Today people suggest *new(T), which I find embarrassingly clunky to explain to new users. This comes up fairly often, and we need something cleaner.

  2. Comparing to a zero value in generic code, even for non-comparable type parameters. This comes up less often, but it did just come up in cmp.Or (#60204).

  3. Shortening error returns: return zero, err is nicer than return time.Time{}, err.
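For reference, the workarounds mentioned in this list look roughly like this in Go as it exists today. This is only a sketch: the helper names ZeroOf, ZeroVar, and ParseStamp are made up for illustration and do not come from the proposal or from any library.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ZeroOf returns the zero value of T using the *new(T) idiom.
func ZeroOf[T any]() T {
	return *new(T)
}

// ZeroVar returns the zero value of T using a declared variable.
func ZeroVar[T any]() T {
	var z T
	return z
}

// ParseStamp (hypothetical) shows the error-return case: the zero
// time.Time has to be spelled out as a composite literal.
func ParseStamp(s string) (time.Time, error) {
	if s == "" {
		return time.Time{}, errors.New("empty timestamp")
	}
	return time.Parse(time.RFC3339, s)
}

func main() {
	fmt.Println(ZeroOf[int](), ZeroVar[string]() == "")
	_, err := ParseStamp("")
	fmt.Println(err)
}
```

Under the proposal, the first two helpers would be unnecessary and the error return would read return zero, err.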

More generally, the zero value is an important concept in Go that some types currently have no name for. Now they would: zero.

Because zero is not valid anywhere 0, "", or nil are valid, there will be no confusion about which to use.


I'm not claiming any originality in this proposal. Others have certainly suggested variants in the past, in quite long discussions. I'm not aware of any precise statement of the exact rules above, but I won't be surprised if one exists.

A brief comparison with earlier proposals:

gopherbot commented 1 year ago

Change https://go.dev/cl/509995 mentions this issue: builtin, spec: add builtin untyped zero

mvdan commented 1 year ago

Assuming that you prefer zero over _ in terms of the syntax for a zero value or predeclared identifier, could you expand on your reasoning?

I personally find zero to be clearer and consistent with nil, although I admit that return _, _, err is shorter and feels nicer than return zero, zero, err. It also mirrors _, _, err := foo(), for example.

mvdan commented 1 year ago

As far as idiomatic discussion and naming, I expect that zero will only be used for these general uses and will not displace nil as a more specific kind of zero value. In particular, we will keep using terms like nil pointer and nil interface; we will not switch to saying zero pointer, zero interface, and so on.

I would really like for the spec and builtin change to include guidance on this. That is, I assume we want "idiomatic Go" to not replace most uses of nil with zero, except perhaps where it helps with consistency, like rewriting return nil, time.Time{}, err into return zero, zero, err.

We really want to discourage rewriting if err == nil into if err == zero, for example. That sort of change would be noisy and make Go code less consistent across codebases, unless everyone does the big rewrite - which seems unlikely.

josharian commented 1 year ago

An alternative to adding zero is removing the restrictions on nil. Can you share some of the thinking behind preferring to add zero?

josharian commented 1 year ago

except perhaps where it helps with consistency, like rewriting return nil, time.Time{}, err into return zero, zero, err

FWIW, I would still prefer https://github.com/golang/go/issues/21182 here (return ..., err). I think that it is my single favorite open proposal. (Although I'm also partial to 128 bit ints. :P)

mrwormhole commented 1 year ago

I am not a super smart guy, but I do think explaining *new(T) = *&struct{} is useful and not awkward at all. I come from a strong C background, and I think everyone who is new should learn what "*" and "&" stand for, to avoid incorrect usage. Some codebases actually pass *type or type without realizing the difference because of this lack of knowledge.

Secondly, I think standard cmp package shouldn't be reasoning behind any change, because we as developers are now going to bump into identical package names with 1.21 (we use google/go-cmp primarily for our testing, and IDEs will now show two results: did you mean this or that, etc.). I am personally not happy with the direction of 1.21's cmp package.

Thirdly, I think explaining zero as a concept over struct{} will be harder for everyone. In most cases we never return an allocated struct and an error at the same time; it should be the dev who assigns a default when an error occurs, not the other way around. I have never liked returning struct{}, err or nil, nil. Zero will hide the allocation detail for concrete types, and I think allocation must be obvious to the reader's eye (new is the exception here).

apologies for the noise from me

earthboundkid commented 1 year ago

I prefer zero to _ because it would be weird to say if f == _.

Merovius commented 1 year ago

FWIW I don't have strong opinions about how to spell a universal zero value - I'm fine with any color for that bikeshed. I do think this proposal (or this proposal with s/zero/_/g) has the advantage of addressing several semi-related issues with a single, easily understood mechanism.

@mrwormhole

Secondly, I think standard cmp package shouldn't be reasoning behind any change […]

The justification isn't the cmp package, it's that we need a mechanism to compare values against their zero value, even if their constraint is not comparable. All types allow doing that, but there currently is no way to write a generic function that does it.

The cmp package is one consumer of such a mechanism, but not the only one. Personally, I ran into this with a container type wrapping map, where I would have preferred not to store zero values, as they take up memory without carrying any semantic benefit for my use case, where "not stored" and "the zero value" were semantically equivalent.

I think it's fair to criticize arguments like the explainability or readability of *new(T) for their subjectivity. But this particular problem has no solution without a language change. And "I don't like the package name cmp" is obviously not a very compelling reason not to add such a mechanism.
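To make the container use case concrete, here is a sketch of a map wrapper that skips zero values. The type Sparse and its methods are hypothetical. Because V is not constrained to comparable, the check falls back to the reflect package, which is the run-time workaround available today rather than the language-level comparison the proposal asks for.

```go
package main

import (
	"fmt"
	"reflect"
)

// Sparse is a hypothetical map wrapper that never stores zero values.
type Sparse[K comparable, V any] struct {
	m map[K]V
}

// Set stores v under k, or removes k when v is the zero value of V.
func (s *Sparse[K, V]) Set(k K, v V) {
	// Taking the address first keeps this valid even when V is an
	// interface type holding nil.
	if reflect.ValueOf(&v).Elem().IsZero() {
		delete(s.m, k)
		return
	}
	if s.m == nil {
		s.m = make(map[K]V)
	}
	s.m[k] = v
}

// Get returns the stored value, or the zero value of V when k is absent.
func (s *Sparse[K, V]) Get(k K) V { return s.m[k] }

// Len reports how many non-zero values are stored.
func (s *Sparse[K, V]) Len() int { return len(s.m) }

func main() {
	var s Sparse[string, []int]
	s.Set("a", []int{1, 2})
	s.Set("b", nil) // zero value of []int, so it is not stored
	fmt.Println(s.Len())
}
```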

jimmyfrasche commented 1 year ago

This proposal would also satisfy #26842.

@rsc would accepting this proposal also involve changing cmp.Or to accept any type and use == zero or would that need to be a separate proposal?

@josharian #21182 could be additionally accepted. It would be less needed than it is now but there's still an argument to its utility and if it were accepted instead of this proposal there'd still be a need for the additional functionality contained in this proposal.

seebs commented 1 year ago

I like this a lot. I also like _ for the universal-zero.

What I almost want is for zero to be a valid value for every non-pointer value, but not for pointers, which need nil. I say "almost" because in some contexts, especially generics, I don't know whether a value happens to be pointer-ish.

What I really want is "zero is a universal zero that you can use except when you know you are thinking about a thing in pointer terms, in which case you want nil". Although thinking about it more, at least two cases where I currently use nil (slices, maps), I think "zero" would be comparably/similarly expressive.

I am now very conflicted on whether I think it's more consistent to call the zero value for a slice zero or nil in such a case. I definitely prefer nil for pointers, though.

seebs commented 1 year ago

Okay but thinking about it more, I have concluded:

I would also be fine with just extending nil to all types, including non-pointer types. I'm not actually going to be particularly confused by seeing return nil, err in a function returning a non-pointer type for any length of time, we already have the word, and it's good at expressing "i don't actually want/need a value here".

Observation: That you can use nil for slices, maps, and interface values, and in each case it means a thing that is more complex than a simple "nil pointer" is sort of an argument that we already effectively do this. We have at least three things which are, internally, actually structs of some kind, for which nil is a valid value.

AlexanderYastrebov commented 1 year ago

Could it be just 0 instead of an identifier? (return time.Time{}, err -> return 0, err)

earthboundkid commented 1 year ago

@seebs I think it would be interesting to allow null for pointers and no other types. But I don’t think that addresses a pressing problem in the same way. It’s more of a “if I wrote Go 2” idea.

zephyrtronium commented 1 year ago

A linter to complain about zero where nil could be used instead seems like it would help preserve existing idioms.

earthboundkid commented 1 year ago

would accepting this proposal also involve changing cmp.Or to accept any type and use == zero or would that need to be a separate proposal?

cmp.Or hasn’t been merged yet (it’s on hold until 1.21 is released), so I think it could just change to T any without a discussion.

AndrewHarrisSPU commented 1 year ago

Could it be just 0 instead of an identifier? (return time.Time{}, err -> return 0, err)

It might be funny business if 1-1 and 0 behaved differently :)

willfaught commented 1 year ago

Why won't case zero work?

What guidance do you propose to give for when 0 vs. zero and nil vs. zero should be used? In other words, what would be idiomatic? For example, for func F() (int, string, time.Time, error), should it be return 0, "", zero, nil or return zero, zero, zero, zero?

I'm concerned by the direction the language is evolving. I see non-orthogonal features like this being added instead of existing features being generalized. There is already a zero value in Go: nil. If you lump all the built-in number types together, most built-in types have a nil value. Numbers, strings, arrays, and structs are the exception. Instead of adding something new to solve this one problem, let's generalize what we already have: make nil work for all built-in types. This has been proposed many times by many people. I'm disappointed that this proposal didn't address why this obvious solution won't work. My vote is no until it's changed to do so.

FWIW I don't have strong opinions about how to spell a universal zero value - I'm fine with any color for that bikeshed.

As an aside, I don't agree that coming up with a good name is bikeshedding. Naming is often characterized as one of the two hard things in computer science. Whether or not you agree with that quotation, naming is indeed important, and entirely relevant to the quality of a software design, as opposed to a nuclear power plant design committee getting derailed by the paint color for a bike shed.

jimmyfrasche commented 1 year ago

I would write return 0, "", zero, nil unless it became idiomatic to do something else

Merovius commented 1 year ago

@willfaught

Why won't case zero work?

Currently, it works because case nil is special-cased in a type switch. If we wanted case zero to work, we would have to special-case it similarly (note that in a type switch, the regular "check for the zero value" logic doesn't work, because the cases are types, not values). There seems to be no compelling reason to do so, especially if we generally advise using nil where it works.
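A small sketch of the special case being described: in a type switch, case nil matches a nil interface value, while every other case names a type. The function name describe is made up for this example.

```go
package main

import "fmt"

// describe reports which case of a type switch matches v.
func describe(v any) string {
	switch v.(type) {
	case nil:
		// Special case: matches only when v is a nil interface value.
		return "nil interface"
	case int:
		return "int"
	case *int:
		// A nil *int stored in v lands here, not in case nil.
		return "*int"
	default:
		return "other"
	}
}

func main() {
	var p *int
	fmt.Println(describe(nil)) // nil interface
	fmt.Println(describe(0))   // int
	fmt.Println(describe(p))   // *int
}
```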

AndrewHarrisSPU commented 1 year ago

I feel like zero is a great name because it results in a bit pattern of just zeros following assignment, and zero'd bit patterns are already so fundamentally baked into the language. Occasionally it is just easier to think about the bits.

soypat commented 1 year ago

Some have favored removing the restrictions on nil instead of bringing in the zero to Go. I'd be curious on hearing arguments against it. So far comments in here and on the slack performance channel on it have all evolved from uncomfortable feeling to quiet acceptance.

So often have I refactored return myType{}, err to return nil, err when really there was no semantic difference in my program.

Merovius commented 1 year ago

@willfaught

As an aside, I don't agree that coming up with a good name is bikeshedding. […]

FWIW there is more to the bikeshed analogy than just the importance of the question. But, in any case, I was merely trying to express that it doesn't matter to me. I want something to happen and I find iszero(T), zero, _, nil… all satisfying, personally. I would support any of them. I just strongly oppose doing none of them, because we can't agree on how to spell zero.

seebs commented 1 year ago

So, I think the messiest aspect of this is roughly the behavior in ambiguous contexts. If you're returning an interface value, then return concrete{} and return zero are not the same.

I could see a hypothetical benefit to a zero being distinct from nil if zero were restricted to needing a concrete type. So, it can be type-inferred, but if the inferred type is an interface, that's an error. Thus, you'd have to do something like return concrete(zero) to return an (interface wrapping) a zero-valued concrete type, and nil to return a nil interface.

Right now, if you are actually returning an interface type, and you have a lot of return &concrete{...} in the function, it's easy to miss that return nil is not the same thing as return (*concrete)(nil) in this context.

So I think zero could be worth it if it added that. Otherwise, just allow nil to be used as a generic zero even for not-at-all-pointery things, and everything's solved. No existing code breaks, no worries about namespace.

josharian commented 1 year ago

it's easy to miss that return nil is not the same thing as return (*concrete)(nil) in this context.

Returning (*concrete)(nil) is, in my experience, almost always a bug. It even has an FAQ entry: https://go.dev/doc/faq#nil_error. So making it easier to do that doesn't seem like a priority to me.
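The pitfall covered by that FAQ entry can be reproduced in a few lines. The type MyErr and the two functions are hypothetical names used only for this sketch.

```go
package main

import "fmt"

// MyErr is a hypothetical concrete error type.
type MyErr struct{}

func (*MyErr) Error() string { return "my error" }

// typedNil returns a nil *MyErr wrapped in the error interface.
// The interface value carries a type, so it is not equal to nil.
func typedNil() error {
	var p *MyErr
	return p
}

// plainNil returns a nil interface value.
func plainNil() error {
	return nil
}

func main() {
	fmt.Println(typedNil() == nil) // false
	fmt.Println(plainNil() == nil) // true
}
```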

fzipp commented 1 year ago

The only thing that is not possible today is the zero value comparison, at least not without reflect.

To create a generic zero value var zero T; return zero and *new(T) are established patterns that work. Also, return time.Time{}, err has worked for 14 years.

That's why I'm in favor of an iszero() builtin function and no zero otherwise. It fixes the one thing that is not possible today and avoids the other questions.

mitar commented 1 year ago

There is confusion today between nil pointers and nil interfaces containing nil pointers.

I think if zero does not address this, then I see little point in having zero over simply relaxing nil. In fact, I would prefer that if err == zero also matched an interface holding a nil pointer, not just a nil interface. But the proposal above does not propose that, so why not just relax nil?

mitar commented 1 year ago

relaxing nil causes confusion, for instance 0==nil . But 0==zero is acceptable.

I think I would find it easier to explain to novice developers why 0 == nil than why typed interface of nil pointer is not nil. And literally, nil means zero:

Nil means the same as zero.

So why 0 == nil is pretty clear. It is the same.

earthboundkid commented 1 year ago

Some have favored removing the restrictions on nil instead of bringing in the zero to Go. I'd be curious on hearing arguments against it. So far comments in here and on the slack performance channel on it have all evolved from uncomfortable feeling to quiet acceptance.

I think it would lead to more confusion in cases like np != nil && *np == nil. It's less confusing that np != zero && *np == zero might be true.

earthboundkid commented 1 year ago

But in case of generic functions, we can still do this

func X[T any]() T {
  var zero T // use this for assignment
  return zero
}

That works fine for return values, but it doesn't work for t == zero unless you have [T comparable].

I would almost be fine with saying you can't use zero outside of generics, except it's sort of an arbitrary restriction. I'd be happy though if something like go vet or a linter were to complain about it.

Merovius commented 1 year ago

@carlmjohnson

I think it would lead to more confusion in cases like np != nil && *np == nil. It's less confusing that np != zero && *np == zero might be true.

Did you mean to write np != nil && *np == zero, which AIUI would be the suggested idiomatic code under this proposal? Because as written, both seem exactly the same level of confusing.

AndrewHarrisSPU commented 1 year ago

Maybe gofmt should rewrite zero to nil when it can. I think it doesn't generally break anything (all nil are zero, not all zero are nil)? If it doesn't break anything, I'd rather not think about it much.

rsc commented 1 year ago

Lots of feedback, thanks. A few responses:

atdiar commented 1 year ago

Lgtm. Just wondering, for an unexported type T for which the zero value is not really usable, Can f(zero) be used? Doesn't it force the zero value to always be a valid function argument?

willfaught commented 1 year ago

I feel like zero is a great name because it results in a bit pattern of just zeros following assignment, and zero'd bit patterns are already so fundamentally baked into the language. Occasionally it is just easier to think about the bits.

Nil literally means zero, so in my opinion, nil is just as good a name in that respect; but it's even better in that it already exists for the majority of built-in types.

The only thing that is not possible today is the zero value comparison, at least not without reflect.

Returning a zero value with abbreviated notation hasn't been possible, which was a pain point, which was addressed in the proposal.

relaxing nil causes confusion, for instance 0==nil . But 0==zero is acceptable.

There's no confusion if you understand the language. Nil would be the zero value for the type, and through assignability, the type of nil in 0==nil would be int, and thus its value would be 0. The ship for avoiding the complexity of nil, assignability, and conversion sailed a decade+ ago. We might as well embrace it if we can get additional expressiveness out of it, in my opinion. We've embraced other parts of the language that are suboptimal just because they're already in the language, and the Go team didn't want to have feature overlap (e.g. interfaces and assertions vs. sum types).

I would disagree. For these many years, the default value of int was 0. If you say now that it's nil, I find it hard to accept.

Nil would equal 0 for int types. In other words, it's a synonym, similar to how rune is a synonym for int32. When nil is converted to int, its meaning is 0, as in all-zero bits. It's simple, in my opinion.

I think it would lead to more confusion in cases like np != nil && *np == nil. It's less confusing that np != zero && *np == zero might be true.

nil would be synonymous with all-zero bits, so it seems the same to me. The meaning of zero would be just as dependent on context as nil in your example. In all contexts, the meaning of nil/zero depends on its type. It's unsurprising that p and *p would both be comparable to nil, if nil were the zero value for all types. Same for zero.

Did you mean to write np != nil && *np == zero, which AIUI would be the suggested idiomatic code under this proposal? Because as written, both seem exactly the same level of confusing.

If I understand correctly, he was demonstrating what it would look like if nil was relaxed, where np is a pointer type, which would be comparable to nil, and the pointer's element type would also be comparable to nil (since nil would be the zero value for all types). So his code is correct, if that's the case. If this proposal were adopted instead, his code would be np != zero && *np == zero, which is basically the same thing.

I think we can agree to the point that introducing zero is primarily for zero checks . So in that case why don't we handle it heads on with iszero() .

See above.

Re expanding nil, I think accidents like var x int = nil, or writing f(nil) when f takes an integer, or writing x == nil when x is an integer, or writing x == nil when x is a *int but *x is an int, are mistakes that are worth continuing to diagnose. So I am not in favor of allowing nil to mean any zero anywhere. In C, NULL is defined literally as 0, and it is easy to make mistakes like that. In Plan 9 C our standard headers said #define nil ((void*)0), and it caught plenty of mistakes where you wrote nil and should have written an integer; NULL would not catch those, and nor would an expanded Go nil. Types are good, and we should continue to take advantage of them.

var x int = nil wouldn't be an accident, that's the point. Nil would be the zero value for all types. It's valid to use nil everywhere where the zero value is required. If you want to use special syntax for a particular type to refer to its zero value, like 0 or "", then great, but it's not required. Wherever nil appears, look at its type to understand which value it represents through assignability or conversion.

The rest of your Go examples aren't mistakes either. They would type-check and compile. There would be no problem to diagnose. x==nil for integers isn't currently valid, so new programs that use it within the new semantics would be valid, without a need to correct or diagnose.

Go isn't weakly typed, so I don't see how your C example applies. As shown above, x==nil for integers wouldn't be an error requiring a failure or diagnostic. If you don't want the zero value for int, then don't write nil. If you write nil with an int type, then you'll get 0.

I agree with the rest of your points.

fzipp commented 1 year ago
  • To people who like *new(T), all I can say is I'd personally be embarrassed to get up and tell people that's the best we can do. We can disagree of course.

It's not the best we can do, because there is var zero T. It's readable and easy to understand. Yes, it's two lines, but not everything in programming has to be a one-liner. The argument for *new(T) was that it would be easier to search and replace if someday Go would get a zero builtin. However, if we decide that will never happen, then there is no reason not to just write var zero T.

mitar commented 1 year ago

accidents like var x int = nil, or writing f(nil) when f takes an integer, or writing x == nil when x is an integer, or writing x == nil when x is a *int but *x is an int

Then I think nil and zero should be disjoint. So zero should be valid only for non-chan/func/interface/map/slice/pointer types. Otherwise you can have accidents like var x *int = zero, or writing f(zero) where f takes a pointer, or writing x == zero when x is a pointer, or writing x == zero when x is a *int but *x is an int.

Or are the accidents you listed problematic but the accidents I have not?

mitar commented 1 year ago

Go isn't weakly typed, so I don't see how your C example applies. As shown above, x==nil for integers wouldn't be an error requiring a failure or diagnostic. If you don't want the zero value for int, then don't write nil. If you write nil with an int type, then you'll get 0.

I think he meant that you might write x = nil when you meant to write *x = nil (with claim being that *x = zero is clearer in the current proposal). You just made a null pointer and then the next time you do *x you get an error. That is an accident.

What I do not buy in that argument is that you could (with current proposal) easily do x = zero when you meant to do *x = zero and have the same accident. I think if we want to prevent such accidents, zero and nil should be disjoint. So that x = zero is invalid if x is a pointer.

zephyrtronium commented 1 year ago

@willfaught

The rest of your Go examples aren't mistakes either. They would type-check and compile. There would be no problem to diagnose.

The fact that those examples would type-check and compile is the problem, if they aren't what the programmer meant to write. That's the kind of mistake that strong type systems are usually good at catching. Extending nil to all types makes Go's type system less good at catching that – although exchanging zero for nil would have the same problems, and could at best be caught by convention rather than model checking.

AyushG3112 commented 1 year ago

Why not add a simple builtin function isZero(any) bool rather than a new identifier? Isn't that a cleaner way to satisfy the requirement and still keeping codebases consistent and readable?

DeedleFake commented 1 year ago

If the main issue with making zero and nil disjoint is that func F[T any](v T) bool { return v == zero } wouldn't be possible, then maybe it's best to introduce a new built-in function after all? Go 1.21 is introducing three new built-in functions primarily for dealing with NaNs correctly, so a built-in to check the zero status of a value regardless of whether it's zeroable or nilable doesn't seem so out of place to me anymore.

zigo101 commented 1 year ago

Go 1.21 has supported return type inference, so why not add a builtin func zero[T any]() T function instead? It needs two more characters at each use, but it does not have the overlap problem that a builtin zero identifier brings.

BTW, without considering the zero() way, I prefer the extended nil way over the new zero identifier way.

... or writing x == nil when x is an integer, or writing x == nil when x is a int but *x is an int, are mistakes that are worth continuing to diagnose ...

This is not a new problem brought by the extended nil. The problem already exists with the current nil for containers, such as *ptrToNilSlice == nil and *ptrToNilMap == nil.

AyushG3112 commented 1 year ago

I do wonder if this needs to live in the stdlib? Implementing

func isZero[T any](val T) bool

and

func zero[T any]() T

feel trivial in userland.

AyushG3112 commented 1 year ago

Additionally, I am failing to understand why we are considering making zero and nil disjoint. Isn't nil the zero value of pointer types (and a few built-in types) from a user's perspective?

AndrewHarrisSPU commented 1 year ago

func isZero[T any](val T) bool and func zero[T any]() T feel trivial in userland

isZero[T any] can't use ==, and isZero[T comparable] can't accept non-comparable types. Similarly if val == zero[T]() depends on whether T is comparable.
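A sketch of the two userland helpers shows where the limitation bites. The names follow the signatures quoted above, except that isZero has to take the comparable constraint for == to compile.

```go
package main

import "fmt"

// zero is the userland helper: it returns the zero value of T.
func zero[T any]() T {
	var z T
	return z
}

// isZero needs the comparable constraint, because == is not
// defined for every type (slices, maps, and funcs are excluded).
func isZero[T comparable](val T) bool {
	return val == zero[T]()
}

func main() {
	fmt.Println(isZero(0))                       // true
	fmt.Println(isZero("go"))                    // false
	fmt.Println(isZero(struct{ A, B int }{}))    // true

	// The next line does not compile: []int does not satisfy comparable.
	// fmt.Println(isZero([]int(nil)))
}
```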

gophun commented 1 year ago

Maybe we should take a step back and look at the bigger picture. Why do some Go programmers want to compare against the zero value of a data type so badly? It means they want to assign a special meaning to it and treat it differently from other values. They are essentially using an in-band value of a data type as a sentinel value. Is this a good practice? Wouldn't it be better to have a way to express the non-existence of a value instead? I fear that this proposal or other suggested variants would encourage the practice of assigning special meaning to zero values.

Merovius commented 1 year ago

@gophun I strongly disagree with that chain of argument. The example I brought up here didn't treat the zero value as a sentinel or a stand-in for "non-existent" at all. It was a bog-standard value.

I also don't understand how you imagine "a way to express the non-existence of a value", on an implementation level. ISTM there are three ways that can be implemented:

  1. By using a pointer, with nil meaning "non-existence". That is very cache-inefficient.
  2. By pairing it with a boolean to store validity (like sql.NullString etc). That is bad for alignment, adds significant storage overhead and is thus again bad for caches.
  3. Designate one of the legal values of a type to mean "non-existence". That's pretty much this proposal. It essentially designates the zero value that way.

Also, no matter which of these you choose - in all three of them, you end up with the semantic of "the zero value means there is no value" and ISTM that any such design would have to have that semantic: You definitely want var x T to be the "non-existent" value, if there is one.

Obviously, there are types for which the zero value does not naturally imply "there is no such value". But those can always choose to go with one of the first two strategies, even under this proposal - and they will work fine with code that compares against the zero value.
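The three strategies listed above can be sketched as follows. The type names ViaPointer, ViaFlag, and ViaZero are invented for this illustration. In each design the zero value of the struct ends up meaning "no value", which is the point being made.

```go
package main

import "fmt"

// ViaPointer uses a nil pointer to mean "no value" (strategy 1).
type ViaPointer struct{ N *int64 }

func (v ViaPointer) Present() bool { return v.N != nil }

// ViaFlag pairs the value with a validity flag, in the style of
// sql.NullString and its siblings (strategy 2).
type ViaFlag struct {
	N     int64
	Valid bool
}

func (v ViaFlag) Present() bool { return v.Valid }

// ViaZero treats the zero value itself as "no value" (strategy 3).
type ViaZero struct{ N int64 }

func (v ViaZero) Present() bool { return v.N != 0 }

func main() {
	// The zero value of each struct reports "not present".
	fmt.Println(ViaPointer{}.Present(), ViaFlag{}.Present(), ViaZero{}.Present())

	n := int64(7)
	fmt.Println(ViaPointer{N: &n}.Present(), ViaFlag{N: n, Valid: true}.Present(), ViaZero{N: n}.Present())
}
```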

So, really: Even under that chain of argument, ISTM that this proposal (or, FWIW, the iszero(v) builtin) is the natural point to arrive at.

fzipp commented 1 year ago

The only thing that is not possible today is the zero value comparison, at least not without reflect.

Returning a zero value with abbreviated notation hasn't been possible, which was a pain point, which was addressed in the proposal.

@willfaught What I meant is that abbreviated notation is just sugar, it does not add functionality that was not possible to express before. The only aspect that adds functionality is the zero comparison. In my opinion the syntactic sugar part does not justify opening a can of worms. I would prefer if we can add the zero comparison while avoiding the rest of the contentious questions.

rsc commented 1 year ago

Thanks for the continued feedback. I am sensing maybe a little bit of urgency to get their points across from some posters, so let me try to defuse that a bit. This discussion is not going to resolve in the next couple days - it hasn't even appeared in the proposal minutes yet.

Here are some responses to recent comments.

Merovius commented 1 year ago

FWIW it might be helpful (for the "adding zero" vs. "expanding nil" discussion) to emphasize that the intended outcome of this proposal is that people continue to use nil if they know something is a pointer/interface/map/channel, at least for the most part.

That is what I said here. It would be legal to write p != zero && *p == zero, but it would be idiomatic to write p != nil && *p == zero. So idiomatic code would be protected by the type-checker from at least some mistakes. If we expanded nil comparisons/assignments, it would be both legal and idiomatic to write p != nil && *p == nil, which is not protected by the type-checker in the same way.

I think that's the primary difference between "adding zero" and "expanding nil" and arguments that rely on both of these making it legal to write traps miss that argument of convention.

We might be able to nudge/enforce these conventions with vet or other static checks, at least for some common cases.

fzipp commented 1 year ago

We might be able to nudge/enforce these conventions with vet or other static checks, at least for some common cases.

We should just be sure that if a new feature prompts the addition of vet checks or style guide items, it's worth it.