golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: Go 2: universal zero value with type inference #35966

Closed geraldss closed 8 months ago

geraldss commented 4 years ago

I propose a universal zero value with type inference. Currently nil is a zero value with type inference for pointers and built-in reference types. I propose extending this to structs and atomic types, as follows:

{} would represent a zero value when the type can be inferred, e.g. in assignments and function call sites. If I have a function:

func Foo(param SomeLongStructName)

and I wish to invoke Foo with a zero value, I currently have to write:

Foo(SomeLongStructName{})

With this proposal, I could alternatively write:

Foo({})

For assignments currently (not initializations; post-initialization updates):

myvar = SomeLongStructName{}

With this proposal:

myvar = {}

This proposal is analogous to how nil is used for pointers and reference types.

The syntax allows type names and variable types to be modified without inducing extraneous code changes. The syntax also conveys the intent "zero-value" or "default" or "reset", as opposed to the actual contents of the zero value. Thus the intent is more readable.

mdempsky commented 4 years ago

@beoran If it doesn't take any parameters and doesn't have any side effects, no point in making it a function. It can just be a value, like nil.

earthboundkid commented 4 years ago

I mention above that a magic function would help with generic programming but not typing out return values or nil interface/pointer confusion.

cosmos72 commented 4 years ago

As @ianlancetaylor pointed out, T{} cannot be extended to mean 'zero value of T' because if T is a map type or a slice type, T{} already has a meaning: create a non-nil, zero-elements map (or slice). Instead the zero value of such types is nil.
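
The distinction drawn above can be checked directly. This is a minimal sketch; the helper name zeroAndLiteral is made up for illustration and is not part of any proposal in this thread.

```go
package main

import "fmt"

// zeroAndLiteral returns the zero value of []int and the value of the
// composite literal []int{}. (Hypothetical helper, for illustration only.)
func zeroAndLiteral() (zero, literal []int) {
	var z []int
	return z, []int{}
}

func main() {
	zero, literal := zeroAndLiteral()
	fmt.Println(zero == nil, literal == nil) // true false
	fmt.Println(len(zero), len(literal))     // 0 0

	var zeroMap map[string]int
	literalMap := map[string]int{}
	fmt.Println(zeroMap == nil, literalMap == nil) // true false
}
```

Both forms have length 0, so the difference only shows up in comparisons with nil (and, for maps, on writes).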

Using _ could be an elegant solution in places where type inference can deduce the correct type, but if I understand correctly it has already been rejected.

Two more ideas that I did not see yet, which leave out the 'type inference' part, are:

  1. use T() to mean zero value of T - currently does not compile, thus Go compatibility promise is preserved. It's also not unheard of: in C++ it means either 'default constructor' if T is a class, and 'value initialization' otherwise - both are some approximation of 'zero value'

  2. define a new compiler builtin 'zero(T)'

Allen-B1 commented 4 years ago

*new(T) works as a universal zero value (albeit requiring the type to be spelled out explicitly), which covers idea 2 above.
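
A short sketch of how *new(T) reads in generic code, assuming Go 1.18 or later. The names Zero, IsZero, and point are illustrative only.

```go
package main

import "fmt"

// Zero returns the zero value of T by dereferencing a freshly allocated *T.
// (Illustrative name, not a proposed builtin.)
func Zero[T any]() T {
	return *new(T)
}

// IsZero compares v with the zero value of T. The == operator requires the
// comparable constraint, so this cannot be instantiated with func, map, or
// slice types.
func IsZero[T comparable](v T) bool {
	return v == *new(T)
}

type point struct{ X, Y int }

func main() {
	fmt.Println(Zero[int]())                          // 0
	fmt.Printf("%q\n", Zero[string]())                // ""
	fmt.Println(Zero[[]int]() == nil)                 // true
	fmt.Println(IsZero(point{}), IsZero(point{X: 1})) // true false
}
```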

earthboundkid commented 4 years ago

You can’t compare func/map/slices to it though, which is a limitation compared to a predeclared constant.

rogpeppe commented 2 years ago

For the record, I'm not that keen on nil as a universal zero for the reason that @carlmjohnson states earlier in this thread: nil is currently quite specific and widening its scope means that the compiler can do less to catch errors early.

In general, whenever people in this thread have used an identifier for the zero value, they've used zero. I think that's significant: although one character longer than nil, I think that zero is clear and unambiguous compared to (say) {} or _, so that would be my choice for an addition to the language.

For semantics, after type inference, I'd allow any operation that would be allowed on an explicit zero value of the inferred type with the addition of comparison for equality. It would be an error if the type cannot be inferred from its surrounding context.

So this would be valid:

func isZero[T any](x T) bool {
    return x == zero
}

it would be equivalent to this currently valid code:

func isZero[T any](x T) bool {
    return reflect.ValueOf(&x).Elem().IsZero()
}

earthboundkid commented 2 years ago

It's interesting to come back to this thread now that generics are (almost) here. I wrote a package to test if values are equal to the zero value, and had to use reflection to make it work because of the lack of generic type switching and a universal zero value. Ideally, there would be some way to do

switch T.(type) {
case lenable:
   return len(v) == 0
default:
   return v == zero
}

gopherbot commented 2 years ago

Change https://go.dev/cl/360015 mentions this issue: crypto/elliptic: use generics for nistec-based curves

earthboundkid commented 2 years ago

Can this issue be renamed to be specifically about @rogpeppe's zero built in constant?

ianlancetaylor commented 2 years ago

It might be better to make that a new proposal, since almost all the discussion in this proposal is about something different than that.

earthboundkid commented 2 years ago

#52307 was just closed as a duplicate of this one. It could be reopened.

ianlancetaylor commented 2 years ago

Well, I suppose maybe it is fairly similar. I dunno.

rogpeppe commented 2 years ago

FWIW I've been thinking for a while that one would get used to _ as a zero value pretty quickly, so I'd be just as happy if that were adopted instead of zero.

Also, to me #52307 does indeed seem like a slightly more specific duplicate of this one.

bradfitz commented 2 years ago

Underscore for zero value was a good idea 10 years ago: https://groups.google.com/forum/#!msg/golang-dev/iAysKGpniLw/qSbtBUx4-sMJ (from @nigeltao)

At the time everybody said "not now". Maybe it's time!

earthboundkid commented 2 years ago

I have a slight preference for zero, but either is fine. _ has the advantage that it can't be shadowed. I mostly just want to cut out the reflect code from my libraries. :-)
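
For context, the kind of reflection code being referred to can be sketched as follows. This is a minimal version built on reflect.Value.IsZero, not the actual library code mentioned in this thread.

```go
package main

import (
	"fmt"
	"reflect"
)

// IsZero reports whether v is the zero value of its type. Reflection lets it
// accept non-comparable types such as slices, maps, and funcs.
func IsZero[T any](v T) bool {
	// Going through &v and Elem keeps the static type T, which matters when
	// T is an interface type holding a nil value.
	return reflect.ValueOf(&v).Elem().IsZero()
}

func main() {
	fmt.Println(IsZero(0), IsZero(""), IsZero([]int(nil))) // true true true
	fmt.Println(IsZero(1), IsZero([]int{}))                // false false

	var err error
	fmt.Println(IsZero(err)) // true
}
```

A predeclared zero (or _) with equality comparison would let the body be written as a plain comparison, without importing reflect.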

earthboundkid commented 2 years ago

Adding a link to #19642 where _ was previously rejected. I think many of the considerations against the spelling _ still apply, but the need for a generic zero is new, and so zero should be evaluated now.

_ can possibly be reevaluated, but the points about _ being confusing still apply. Take this snippet:

x := 1
_ = x
x == _ // false

That's confusing. With the spelling zero, zero = x either won't compile (because zero is not defined) or it will be written zero := x in which case local var zero will shadow the constant zero, so x == zero would be true. It's still a little confusing that zero is universal, but that's a common property to untyped constants.
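
The shadowing behaviour described here already exists for today's predeclared identifiers, so it can be shown with nil itself. A small sketch, with shadowNil as a made-up function name:

```go
package main

import "fmt"

// shadowNil declares a local variable named nil, which hides the predeclared
// identifier for the rest of the function body.
func shadowNil() int {
	nil := 42 // legal: nil is a predeclared identifier, not a keyword
	return nil
}

func main() {
	fmt.Println(shadowNil()) // 42

	x := 1
	_ = x // the blank identifier only discards; it cannot be read or shadowed
	fmt.Println(x) // 1
}
```

A predeclared zero would presumably follow the same scoping rules as nil, true, and false.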

willfaught commented 2 years ago

@seankhliao marked #53666 as a duplicate of this issue, and said its discussion should continue here. Since people are unlikely to follow the link, here is the relevant discussion from that issue with some minor details from the proposal elided:


@willfaught:

What is the proposed change?

Currently, if I understand correctly, there's no expression for the zero value for a type variable:

type Map[K comparable, V any] struct {
    ks []K
    vs []V
}

func (m Map[K, V]) Get(k K) V {
    for i, k2 := range m.ks {
        if k2 == k {
            return m.vs[i]
        }
    }
    return zeroValue // cannot currently express this
}

This is a trivial example, but I've seen real questions about what to do in these situations.

Currently, if I understand correctly, the only way to do this is to declare a variable, and return that:

var zeroValue V
return zeroValue
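
Written out in full, the workaround looks roughly like this in current Go. The Set method is a hypothetical helper added only so that the sketch can be run.

```go
package main

import "fmt"

// Map is the slice-backed map from the example above.
type Map[K comparable, V any] struct {
	ks []K
	vs []V
}

// Set inserts or overwrites a key. (Hypothetical helper for this sketch.)
func (m *Map[K, V]) Set(k K, v V) {
	for i, k2 := range m.ks {
		if k2 == k {
			m.vs[i] = v
			return
		}
	}
	m.ks = append(m.ks, k)
	m.vs = append(m.vs, v)
}

// Get returns the value stored for k, or the zero value of V if k is absent.
func (m Map[K, V]) Get(k K) V {
	for i, k2 := range m.ks {
		if k2 == k {
			return m.vs[i]
		}
	}
	var zeroValue V // the extra declaration the proposal wants to avoid
	return zeroValue
}

func main() {
	var m Map[string, int]
	m.Set("a", 1)
	fmt.Println(m.Get("a"), m.Get("missing")) // 1 0
}
```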

Why not allow nil to be used as the zero value for type variables, to fill this gap?

return nil // == V(nil)

At runtime, nil would be the zero value for the specific type argument.

Nil could actually be interpreted as the zero value for every type, even outside of generics. The word "nil" means zero, anyway. This would be handy in situations where you make a type a pointer just to avoid lots of typing. For example:

if condition1 {
    return ReallyLongStructName{}, fmt.Errorf(...)
}
if condition2 {
    return ReallyLongStructName{}, fmt.Errorf(...)
}

Instead, you could keep the non-pointer type, and then do:

if condition1 {
    return nil, fmt.Errorf(...)
}
if condition2 {
    return nil, fmt.Errorf(...)
}

It would also solidify the type abstraction concept of generic functions, where functions varying only in concrete types can be abstracted into one generic function with type variables. For example:

// Same implementations, different types

func (m MapIntInt) Get(k int) int {
    for i, k2 := range m.ks {
        if k2 == k {
            return m.vs[i]
        }
    }
    return nil // nil, not 0, but means the same thing for int
}

func (m MapStringString) Get(k string) string {
    for i, k2 := range m.ks {
        if k2 == k {
            return m.vs[i]
        }
    }
    return nil // nil, not "", but means the same thing for string
}

// Same implementation, abstracted types

func (m Map[K, V]) Get(k K) V {
    for i, k2 := range m.ks {
        if k2 == k {
            return m.vs[i]
        }
    }
    return nil // nil is 0 for int and "" for string
}

Similar to how generics required an inversion of the meaning of interfaces to be type sets, perhaps generics also requires an inversion of the meaning of zero and nil values, where every type has a nil (default) value, and for number-like types, that just so happens to be the number zero, but for other types, it could be whatever makes sense for them.

Before

var x T
ret = x
return LongStructName{}, err

After

ret = nil
return nil, err

Would this change make Go easier or harder to learn, and why?

Easier, because nil is already meaningful for about half the built-in types. This would add consistency to the language by making nil meaningful for all types, using a meaning ("zero value", "unassigned variable") already applicable to all types.

What is the cost of this proposal? (Every language change has a cost).

Enabling nil to be assignable to all types, and making it equivalent to the zero value for each type.

Minor changes to the language spec to reflect that.

This would decrease the cost of understanding the language in terms of consistency for semantics.

It's a net win, in terms of cost. If you disagree, please answer these questions in your response:

Can you describe a possible implementation?

Substitute nil with the zero value for the type, e.g. 0, "", T{}, T(nil), etc.

All I ask

I didn't encounter counterarguments or alternatives in golang-nuts or Reddit that suggested a better or even viable alternative to the problem presented here. The problem identified is the lack of an identifier that means the zero value for a type. Please don't suggest, as an alternative, a non-identifier expression that gc will currently evaluate efficiently to the same value, as that doesn't solve the problem identified. This proposal is about amending the Go language, not a Go implementation like gc.

Before you respond, please consider whether the point you're responding to is the heart of the idea being proposed here. In my opinion, counterarguments should first seek to steelman their target.

A proposal is an argument. As such, this is a debate. Please participate in good faith, with rebuttable/refutable/falsifiable counterarguments, and by acknowledging (something like "I agree" or "I disagree because...") all the points made by the people you're responding to. Please don't "swoop" in with a counterargument, then "swoop" out immediately thereafter, and not stick around to defend it.


@robpike:

Nil is already confusing enough for many Go programmers. This proposal adds another meaning, introducing more options for confusion.

The issue you want to solve may indeed be worthwhile, but please not by yet another meaning for "nil". Maybe it's time for a new predeclared identifier, say "zero", which represents the zero value for any type. That would avoid the bizarre property of being able to assign nil to an int in a polymorphic function but not elsewhere. The identifier zero would work everywhere.


@seankhliao:

Duplicate of https://github.com/golang/go/issues/35966


@willfaught:

@seankhliao I don't agree that this issue is a duplicate of that issue. That issue attempts to introduce an expression for the zero value for all types, but it introduces new {} syntax to do it. This proposal expands the meaning of the nil identifier to solve that problem in a way that's consistent with current nil semantics.

While there is some talk in that discussion about the ideas in this issue, that issue hasn't been changed to reflect that direction.

Until that proposal is approved, I don't think related-but-different issues like this one should be closed.

@robpike Nil is confusing because it's not explained clearly in the Go spec, in my opinion.

From #Variables in the spec:

unless the value is the predeclared identifier nil, which has no type

Nil is not a value; it's an identifier that represents the zero value for some type. nil is not a slice value; []T(nil) is a slice value. The nil concept, syntax, and semantics often get muddled in the wording. As such,

var x interface{} // x is nil and has static type interface{}

is technically wrong. x is interface{}(nil), not nil.

var v *T // v has value nil, static type *T

is technically wrong. v has value (*T)(nil), not nil.

There's a difference between referring to nil (the conceptual default value for a type), and referring to nil (the identifier). nil values don't exist, but values like T(nil) do exist.

The spec has several instances of this confusing wording:

The value of an uninitialized slice is nil. The value of an uninitialized pointer is nil. The value of an uninitialized variable of function type is nil. The value of an uninitialized variable of interface type is nil. The value of an uninitialized map is nil. The value of an uninitialized channel is nil.

To pick on slices again, it should be "The value of an uninitialized slice type T is T(nil)," or "The value of an uninitialized slice is nil" (assuming the general idea of nil, as distinct from the identifier nil, is explained somewhere).

This wouldn't be an issue, except that assignability transparently interprets nil as the zero value for the type, which is when you run into confusion about interface{}(nil) and interface{}((*int)(nil)). This could be easily clarified by adding a small section to the spec that directly addresses this subtlety. It could also be clarified by rewording the spec in these places to remove the subtlety.

It would be equally bizarre to assign zero to a struct. What is "zero" about a Person struct? Or a Name string? Nothing. That was my point about the dictionary definition of nil above: it's associated with zero, but it has historical associations with default (null) values, so it's the perfect term for the universal concept of the default value. "Zero" is a numeric concept. "Nil" is not tied to anything in terms of the English word, so it works more broadly.

The only reason why the term "zero value" was chosen was because that's how it's implemented under the hood, it seems to me. But that detail doesn't have to limit how we conceptualize it. "Zero value" just clouds the concept, in my opinion. "Default value" should be what we talk about. Numbers have 0, strings have "", and so on.

Forgot to add this:

In addition, I think zero (the identifier) would have the same confusion as nil (the identifier) in terms of assignability:

var p *int = nil // p is (*int)(nil)
var p *int = zero // p is (*int)(zero)

var x interface{} = nil // x is interface{}(nil)
var x interface{} = zero // x is interface{}(zero)

x = p // x is interface{}((*int)(nil))
x == nil // false

x = p // x is interface{}((*int)(zero))
x == zero // false

@seankhliao:

There are many different syntaxes proposed in the comments in the issue (or closed as dups of), {} is only prominent for the first half. The core idea is the same, a single identifier for what we now call the zero value. Please continue the discussion there, where you'll have the attention of everyone who's already thought more about it.


@willfaught:

@seankhliao That would seem to conflict with @ianlancetaylor's own reasoning in that discussion:

@carlmjohnson: Can this issue be renamed to be specifically about @rogpeppe's zero built in constant?

@ianlancetaylor: It might be better to make that a new proposal, since almost all the discussion in this proposal is about something different than that.

That issue has a broad title that doesn't reflect the specifics being proposed. Closing this issue as a duplicate of that issue would be like closing all generics proposals that came after the first one ever filed as duplicates because the first one happened to have a broad title.

I sympathize with wanting to condense the discussion into fewer places, but the right way to do that, in my opinion, would be to clear that issue's original description, and replace it with a summary of all the proposed ideas so far. As it currently stands, pasting this proposal into that old discussion as a comment would bury it, and not give it its fair shot at vetting and discussion.

cosmos72 commented 2 years ago

To my surprise, I find myself agreeing with the proposal above to use nil as the universal zero value.

Generics have a strong tendency to highlight any irregularity or special case in the type system of a language, exactly because generic code "abstracts over types". And they clearly highlighted that nil is both irregular and a special case: it's a valid value for certain types (chan, func, map, pointer, slice) but not for others (arrays, integers, floats, complexes, string, struct).

Due to Go compatibility promise, nil cannot be removed. The lesser evil is then to make it regular i.e. a valid value for all types. And the only way to make it regular (i.e. not full of special cases) is extending its current meaning to all types, making it assignable to every type as the zero value of such type.

Until now I thought that inventing a new symbol zero as the universal zero value for every type would be better, but then beginners would need to learn about both zero and nil, and the differences between them - definitely an unnecessary complication for them, and for the Go language. Reusing nil as the universal zero value avoids both problems: there is a single symbol nil that means the zero value, and it's regular i.e. valid for all types, with no special cases.

About @robpike's comment that nil is already confusing enough as it is: @willfaught correctly pointed out above that nil is confusing both because it's not explained clearly once and for all, but rather case-by-case, and because nil is irregular (it's assignable to certain types but not to others).

Seeing nil used as numeric zero, as for example var foo int32 = nil instead of var foo int32 = 0 may be strange in the beginning for long-time Go programmers, but as @ianlancetaylor pointed out for another topic (I don't remember exactly what, sorry), programmers' habits can change quite quickly and they can get used to the new style, finding it natural after a while. Especially if the new style is about not having special cases anymore.

A last thought: of course I back the proposal to allow nil as the zero value of every type both inside and outside generics - doing anything else would introduce an irregularity in the language.

atdiar commented 2 years ago

I think that would be a little bit of a stretch from a semantic point of view. Zero is essentially a generic function equivalent to:

func Zero[T any](v ...T) T {
    var z T
    return z
}

A string is not nilable. Why could it be nil?

It was brought to my attention in another issue that the distinction matters perhaps even more in generic functions:

It's often the case for example that methods cannot be called on nil zero values without panics. So there might easily be an actual difference between a string and a map in terms of semantics of their respective zero value. https://github.com/golang/go/issues/53656#issuecomment-1179422413

But as of now, type parameters are oblivious to that.

cosmos72 commented 2 years ago

Well, currently nil means "null pointer" and is usable for pointer-like types, or for types that are internally implemented as pointers (although that's not visible to Go code): pointers, slices, channels, maps, functions.

It's true that strings, structs and numbers cannot be nil i.e. they cannot be the null pointer. That's why the proposal implies changing the meaning of nil from "null pointer" to "zero value" or more abstractly "default value".

As you correctly point out, of course the functions/methods allowed on nil converted to a concrete type, and their behavior, depend on the concrete type:

This does not preclude the possibility of giving a meaning to nil when converted to other types - it's just that numbers and structs are not pointers, thus the intuitive meaning "nil is the null pointer" would not be appropriate for them - a more general meaning "zero value" or "default value" would be needed.

P.S. a slightly pedantic addition: "nil" is Latin for "nothing". The current Go choice is to use it only for pointer and pointer-like values - semantically it has a much wider meaning.

atdiar commented 2 years ago

Yes, the concept of zero subsumes nil. That's why some think it would be confusing to equate the two.

(another difference might be in terms of allocation behavior although that's possibly an implementation detail)

earthboundkid commented 2 years ago

I am sympathetic to the “nil everywhere” view. I think that if it is adopted, we also need the nilinterface etc proposal, and go fix can rewrite nil to be the most specific nil possible. So, s == nil would be rewritten to s == "" for example, and m == nil would be rewritten to m == nilmap. The only things that couldn’t be rewritten would be generics, and everyone would be happy.

The two problems with this are that I think there’s a strong association between nil and pointers, which would be confusing to lose, and the reflect package has IsNil which would need a breaking change.

willfaught commented 2 years ago

@atdiar:

Zero is essentially a generic function equivalent to:

func Zero[T any](v ...T) T {
    var z T
    return z
}

A string is not nilable. Why could it be nil?

I don't understand the point. The argument is to generalize the meaning of nil to be the zero/default value for every type, in which case strings would be "nilable" like every other type.

It's often the case for example that methods cannot be called on nil zero values without panics.

It depends on the zero value and the method implementation.

So there might easily be an actual difference between a string and a map in terms of semantics of their respective zero value.

Right. Default values can mean different things and behave different ways, depending on the type. What unites them is that they are what you get in an uninitialized variable:

The value of an uninitialized variable of function type is nil. The value of an uninitialized variable of interface type is nil.


@cosmos72:

Well, currently nil means "null pointer" and is usable for pointer-like types, or for types that are internally implemented as pointers (although that's not visible to Go code): pointers, slices, channels, maps, functions.

How gc implements those types doesn't seem relevant. According to the Go spec, nil only means:

The value of an uninitialized slice is nil. The value of an uninitialized pointer is nil. The value of an uninitialized variable of function type is nil. The value of an uninitialized variable of interface type is nil. The value of an uninitialized map is nil. The value of an uninitialized channel is nil.

So, basically, it's the uninitialized value for a type, which is what you get in an uninitialized variable. Every type has an uninitialized value, but not all types have an initialized value. Nil can be uninitialized values for all types.

It's true that strings, structs and numbers cannot be nil i.e. they cannot be the null pointer. That's why the proposal implies changing the meaning of nil from "null pointer" to "zero value" or more abstractly "default value".

Strings do contain a pointer, just like maps, channels, etc.


@carlmjohnson:

I am sympathetic to the “nil everywhere” view. I think that if it is adopted, we also need the nilinterface etc proposal, and go fix can rewrite nil to be the most specific nil possible. So, s == nil would be rewritten to s == "" for example, and m == nil would be rewritten to m == nilmap. The only things that couldn’t be rewritten would be generics, and everyone would be happy.

I don't follow why various nil* values would be needed in that case. If your concern is about the general confusion there is about e.g. nil pointers vs. nil interfaces, wouldn't clarifying the Go spec address that issue, as I sketched out?

earthboundkid commented 2 years ago

Users today are routinely confused why var e *myError = nil; var err error = e means that e is nil but err is not nil. Saying e is nilptr but err is not nilinterface clears that confusion up. People would be even more confused why you can dereference an int pointer without panicking and compare to nil and get true (because 0 would equal nil). Using the most specific nil in every case clarifies things.

willfaught commented 2 years ago

Right, so you're referring to the general confusion about nil and assignability. I argued that can be fixed by rewording/restructuring the spec to clarify the issue. I demonstrated several ways in which the spec wording is confusing. Do you think that won't be sufficient? If so, why?

ianlancetaylor commented 2 years ago

@willfaught The confusion is specifically that assigning a pointer that is equal to nil to a variable of interface type gives you a value that is not equal to nil. I don't see how rewording the spec, which is not how most people learn the language, will help with that confusion.
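
The behaviour in question can be reproduced in a few lines. In this sketch, myError and wrap are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// myError is a hypothetical error type with a pointer receiver.
type myError struct{}

func (e *myError) Error() string { return "myError" }

// wrap converts a *myError to the error interface without checking for nil.
func wrap(e *myError) error {
	return e
}

func main() {
	var e *myError // nil pointer
	err := wrap(e)

	fmt.Println(e == nil)   // true
	fmt.Println(err == nil) // false: err holds a (*myError)(nil), so it is not the nil interface
}
```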

AndrewHarrisSPU commented 2 years ago

What about:

Extend make(T) such that:

Additionally, extend make(T, nil) such that for any type, zero := make(T, nil) is equivalent to var zero T

Some implications, if I'm not overlooking anything:

1) The universal zero is make(T, nil). The universal empty (as in "empty set") is make(T). Both of these ideas have utility in generic routines. Particularly, comparisons that involve either will differ. The former will always fail for T such that using T can lead to a nil dereference. The latter will fail to compile for comparisons with some tricky cases one might wish to explicitly avoid.

2) This could be compatible with non-generic code but not preferred (IMO outside of generics, the pre-generics make() behaviors would be more precise, suggestive of intent, and readable) or just illegal outside of generic routines.

3) Any variable declaration var [IDENT] [TYPE] (e.g., var zero T) can be rewritten with make. As a thought experiment, one could imagine making var zero T illegal in generic routines. It would be impossible to then reintroduce var zero T and eliminate the universal empty make(T). It'd be a different Pareto maximum as far as what's in the language versus what might be desired; I do wonder if it would be a simpler place overall. On this point, I'd like to note a recent blog post by Daniel Lemire on summing numbers:

at least in this one instance, Go generics are more expressive than Java generics ... So far, I am giving an A to Go generics.

In contrast, @griesemer has recently said for anything just a little more involved:

with our "simple" form of generics, people seem to not really appreciate the immense complexity of code they permit ... A major reason for Go's success is its relative simplicity, which includes the simplicity of its type system. Generics, while asked for by many people, really has put a dent into this.

I think there's a fairly steep curve from the most straightforward uses of generics, and those that tangle with the subtleties of nil and comparable. Maybe ways to opt out of or be more direct about variable declarations can help.

atdiar commented 2 years ago

@willfaught I think the difference between a zero value and nil is that nil comes with additional meaning. Colloquially, something that is nil points at nothing, uninitialized memory, invalid state etc. That's why it can be used as a sentinel value, for instance when unmarshalling. This is a form of typestate.

The zero value for string types does not allow for similar usage because it is a perfectly valid string.

That can perhaps be more easily seen in the difference between a nil map and an empty map.
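
A sketch of that difference in current Go, using encoding/json to show the sentinel-like behaviour. The helper names encode and tryWrite are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode marshals m to JSON and returns the result as a string.
func encode(m map[string]int) string {
	b, err := json.Marshal(m)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

// tryWrite attempts an insertion and reports whether it panicked.
func tryWrite(m map[string]int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	m["k"] = 1
	return false
}

func main() {
	var nilMap map[string]int
	emptyMap := map[string]int{}

	fmt.Println(nilMap["k"], len(nilMap))             // 0 0
	fmt.Println(encode(nilMap), encode(emptyMap))     // null {}
	fmt.Println(tryWrite(nilMap), tryWrite(emptyMap)) // true false
}
```

Reads behave the same on both maps; only writes and serialization tell them apart.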

willfaught commented 2 years ago

@ianlancetaylor:

I don't see how rewording the spec, which is not how most people learn the language, will help with that confusion.

In the spec, change:

var x interface{} // x is nil and has static type interface{}

to:

var x interface{} // x is interface{}(nil), not nil, and has static type interface{}

Change:

var v *T // v has value nil, static type *T

to:

var v *T // v has value (*T)(nil), not nil, static type *T

Change:

The value of an uninitialized slice is nil. The value of an uninitialized pointer is nil. The value of an uninitialized variable of function type is nil. The value of an uninitialized variable of interface type is nil. The value of an uninitialized map is nil. The value of an uninitialized channel is nil.

to:

The value of an uninitialized slice type T is T(nil), not nil. The value of an uninitialized pointer type T is T(nil), not nil. The value of an uninitialized variable of function type T is T(nil), not nil. The value of an uninitialized variable of interface type T is T(nil), not nil. The value of an uninitialized map type T is T(nil), not nil. The value of an uninitialized channel type T is T(nil), not nil.

Insert a new paragraph in #Types that explains how every type has a default, uninitialized value, expressed as T(nil), which a raw nil is assignable to, and how there are infinite distinct nil values, e.g. []int(nil), []byte(nil), (*int)(nil), (func())(nil), interface{}(nil), etc.

Etc.
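
The distinct typed nil values listed above can already be observed in current Go by converting nil and printing the dynamic type. A minimal sketch, where typeOf is a hypothetical helper:

```go
package main

import "fmt"

// typeOf returns the dynamic type of v as formatted by the %T verb.
func typeOf(v any) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	fmt.Println(typeOf([]int(nil)))          // []int
	fmt.Println(typeOf((*int)(nil)))         // *int
	fmt.Println(typeOf((func())(nil)))       // func()
	fmt.Println(typeOf(map[string]int(nil))) // map[string]int
	fmt.Println(typeOf(nil))                 // <nil>

	// Two nil pointers of different types are different interface values.
	fmt.Println(any((*int)(nil)) == any((*string)(nil))) // false
}
```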

Put a full Go reference on go.dev that explains the subtlety. I don't see any page on go.dev that explains all the info you can get on https://learnxinyminutes.com/docs/go/. I had to learn about the nil subtlety years ago on some guy's blog because the official documentation just didn't explain it.

Highlight that subtlety on http://go.dev/index.html. Reach out to tutorial writers and ask them to include it. Explain it in conference talks. Write a blog post once a year. Hire billboards and sky writers. There are hundreds of ways to address and clarify this.

The objections to this seem to boil down to something like, "A lot of our users weren't taught Go correctly or completely, and we don't know how to fix an education or training issue." Just facilitate education and training for this issue, and for Go in general. How to improve education and training is a well-understood problem. Nil isn't a footgun.

At the end of the day, nobody cares if I write "package p; func F() {}" into a Java compiler and then complain that it doesn't work. There's an expectation that you've RTFM and know Java. It's no different for Go: Learn Go. Not all nils are the same. If there's no FM (manual) for them T R (to read), then that's our fault, but fortunately that's fixable.

"I'm putting nils all over the place, and it's not working. Help!"

"Did you read go.dev/doc/language-reference? It explains everything about Go, including all the subtleties and nuances. It addresses your situation. Don't bother reading any other tutorial. This one is official, and is updated with every official release. In fact, every certification for learning Go requires you to read it because it's so clearly written, concise, and complete. They actually turned it into a short book, it's that good."


@AndrewHarrisSPU:

Extend make(T)

So on one hand, we have adding a universal uninitialized value for every type (nil/{}/zero/something), and on the other hand, we have adding a universal initialized value for every type (make). int(nil/{}/zero/something) vs. make(int). I guess I would say that the idea of uninitialized values makes sense for every type, but perhaps the idea of initialized/constructed values does not. E.g. talking about initialized ints vs. uninitialized ints.


@atdiar:

I think the difference between a zero value and nil is that nil comes with additional meaning. Colloquially, something that is nil points at nothing, uninitialized memory, invalid state etc. That's why it can be used as a sentinel value, for instance when unmarshalling.

Pointer is just a sum type: it's nil, or it's an address. Some languages have Option/Optional/Maybe types that can be used in the same way. If Go users could declare their own sum types, they wouldn't have to use pointer types at all for this case.

Nil pointer values aren't special in general, but they are special in Go, since Go doesn't have generalized sum types, and most serialization formats don't have a notion of pointers, so Go pointers can generally be used to detect whether an optional value is present or absent.
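To make the present/absent point concrete, here is a minimal sketch using encoding/json. The Settings type, its Retries field, and the describe helper are hypothetical names chosen for illustration; they do not come from any real API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Settings is a hypothetical payload. Retries is a pointer so that an
// absent field (nil) can be told apart from an explicit 0.
type Settings struct {
	Retries *int `json:"retries"`
}

// describe reports whether the "retries" field was set in the input.
func describe(input string) (string, error) {
	var s Settings
	if err := json.Unmarshal([]byte(input), &s); err != nil {
		return "", err
	}
	if s.Retries == nil {
		return "absent", nil
	}
	return fmt.Sprintf("present: %d", *s.Retries), nil
}

func main() {
	for _, in := range []string{`{}`, `{"retries": 0}`} {
		out, err := describe(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(in, "->", out)
	}
}
```

With a plain int field, both inputs would decode to 0 and the distinction would be lost.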

However, this is framing nil as only a pointer value, and it's not. Half the built-in types have a nil value. I don't think it's a good idea to guide the direction of the language based on the fact that apparently some users don't know that a function value can be nil.

atdiar commented 2 years ago

I don't understand what you mean by a pointer being a sum type. Also, nil is perhaps better described as the zero value of reference types although this is not a notion that appears in the spec. Shouldn't change the main point expressed initially.

AndrewHarrisSPU commented 2 years ago

@willfaught

I guess I would say that the idea of uninitialized values makes sense for every type, but perhaps the idea of initialized/constructed values does not. E.g. talking about initialized ints vs. uninitialized ints.

Suppose a bitset package with generic routines over bitset.Tiny int-based sets, bitset.Vector slice-based sets, bitset.Sparse map-based sets, etc. bitset.NewTiny, bitset.NewVector, bitset.NewSparse are easy to write because we can use the particular syntax matching each type. bitset.New[T] isn't straightforward. var zero T works for something like bitset.Oops[T] returning a nil-ish value.

For ints, it's sort of a "don't care" on the logic tables (more product-type than sum-type, I guess):

var zero T is       empty   |   zero/nil
Tiny:               true    |   true
Vector:             false   |   true
Sparse:             false   |   true

make(T) is          empty   |   zero/nil
Tiny:               true    |   true
Vector:             true    |   false
Sparse:             true    |   false

make(T, nil) is     empty   |   zero/nil
Tiny:               true    |   true
Vector:             false   |   true
Sparse:             false   |   true

If the goal is generic routines that reliably procure empty sets, or reliably procure nil sets, for all T,

(* I think we should care that for some T, we actually can't infer a valid empty and it should be rejected at compile time - it'd be extraordinary and strict, but also not confusing or surprising)

(also make(T), make(T, nil) would break some syntax rules ... maybe makeEmpty[T](), makeNil[T]() /shrug)

willfaught commented 2 years ago

@atdiar Sum type, a.k.a. tagged union, variant, variant record, choice type, discriminated union, disjoint union, or coproduct. Some languages call them enumerations, like Rust and Swift. I listed the Option/Optional/Maybe types as examples. Pointers have two cases: an address, or no address. Go layers memory storage and retrieval semantics on top of its pointer sum type. You can choose to see it from that perspective, or not.

Anyway, the point was that there are cases to pointer values, and I was addressing your example of using pointers for inferring whether a value is present or absent in serialized data.

Also, nil is perhaps better described as the zero value of reference types although this is not a notion that appears in the spec.

It's not for strings, which are a "reference" type as you mean here.

Shouldn't change the main point expressed initially.

What shouldn't?


@AndrewHarrisSPU I don't follow what bitset.Oops[T] is or does.

How are you defining "empty" here? Why would make(T, nil) not be empty for a slice type, and what would it contain?

Extending make to all types sounds comparable to extending nil to all types, but every type already has a zero value property, so extending nil is really just a change in notation, not adding another concept/aspect to the language like emptiness. Extending make sounds like a larger change. Which isn't bad per se, but that should be weighed.

Your example has us trying to write a generic constructor for a package's types, but is that much different than trying to write a generic constructor for all types? Wouldn't func New[T]() T { return make(T) } work for all types? We wouldn't even need New, we could just call make directly.

Perhaps trying to fit all types into a design for generic built-in types that existed before generalized generics is complicating matters. Since the map type, for example, was built-in, it didn't belong to a package, so its constructor couldn't belong to a package either, so the designers had to add a make constructor for maps.

If we were to add a map type to Go today using generalized generics, we wouldn't have to build in a special type and a special constructor. We could add a normal type to a normal stdlib package, along with a normal constructor function. It would probably be a struct type, so there'd be no inherent emptiness about it, except as expressed by its Len() int method. So perhaps trying to make all built-in types fit into make is the opposite of what we want; perhaps the types that do fit into make now should be retrofitted to be normal types, with normal constructors.

Edit: Wording in last paragraph.

atdiar commented 2 years ago

I think there is some confusion here. In Go, a string has value semantics so to speak. So what doesn't change is that nil is the zero value for types with reference semantics. It basically "means" pointing to nothing. Not pointing to something empty (a legit value).

Are you sure changing what nil means would be helpful?

Re. Sum types, I still don't understand what you mean. A type is defined as a set of values sometimes. A pointer in Go has a type which applies to all the values it can take, including nil.

In theory you could perhaps define a subtype of a pointer type which specify that any value whose type is this subtype is nil. Then the union of this subtype and its "complement" would be the full pointer type. (assuming we define a type as being a set of values). But for a string type, what does a Maybe(string) mean? or {string | nil} ?

Here again, that's confusing.

AndrewHarrisSPU commented 2 years ago

@willfaught

Why would make(T, nil) not be empty for a slice type

A good answer on StackOverflow

Wouldn't func New[T]() T { return make(T) } work for all types?

New[T] would be equivalent to make(T). Like with Oops[T], to clarify my point - writing a package that involves returning empty and nil variants of T is not straightforward at the moment.

The current way things are set up isn't wrong, no make(T) can be done for all types. Sometimes a method or a callback or something else needs to be there. Speculatively, I think a generic make(T) that fails to compile for inappropriate T would be useful enough. Also subjectively, I think make(T) expresses intent well (the lack of being able to express this lurks around discussions of a universal zero, IMHO). And, someone writing a generic package, or writing to a generic package API, gets some free static checking that it makes sense to use this kind of allocation-is-initialization T.

willfaught commented 2 years ago

@atdiar:

I think there is some confusion here. In Go, a string has value semantics so to speak. So what doesn't change is that nil is the zero value for types with reference semantics. It basically "means" pointing to nothing. Not pointing to something empty (a legit value).

Apparently, I guessed wrong what you meant by reference type. Since Go has no types it calls reference types (in the spec), and there are no values passed by reference in Go semantics (in the spec), let's avoid that terminology here. In the gc implementation, types like maps are implemented as pointer types, which are value types. Every type is a value type, including pointers. So what did you mean by reference type, in terms of Go semantics (in the spec)? This seems like it might boil down to a difference in how we define fuzzy terms (for Go) like this.

Are you sure changing what nil means would be helpful?

I don't think it would change what nil means per se, as I addressed above; I'm proposing including other types in its meaning. From the spec:

The value of an uninitialized slice is nil. The value of an uninitialized pointer is nil. The value of an uninitialized variable of function type is nil. The value of an uninitialized variable of interface type is nil. The value of an uninitialized map is nil. The value of an uninitialized channel is nil.

Nil, broadly speaking, is the value of an uninitialized variable. Every type already has such a value, we just don't currently allow use of nil for every type.
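As a small check of the spec passage quoted above, the following sketch declares one uninitialized variable of each listed kind and compares it to nil. The helper name nilKinds is made up for this example.

```go
package main

import "fmt"

// nilKinds declares one uninitialized variable for each kind named in the
// quoted spec text and reports whether each compares equal to nil.
func nilKinds() []bool {
	var (
		s []int
		p *int
		f func()
		i interface{}
		m map[string]int
		c chan int
	)
	return []bool{s == nil, p == nil, f == nil, i == nil, m == nil, c == nil}
}

func main() {
	fmt.Println(nilKinds()) // [true true true true true true]

	// Other types have zero values as well, but nil cannot be used
	// for them in Go today.
	var n int
	var str string
	fmt.Println(n == 0, str == "") // true true
	// var bad int = nil // does not compile
}
```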

There are multiple perspectives about nil. Nil can have a different meaning in each perspective that is consistent within that perspective, but contradicts other perspectives. Arguing that one perspective is bad or less valuable because it contradicts another isn't useful, in my opinion. The argument is that a new perspective, that contradicts one or more other perspectives, has more value. I'm all ears if this new perspective would be inconsistent with itself or existing semantics in some way, which would be a great counterargument against it.

The recent addition of generalized generics required a similar shift in perspective for the meaning of interfaces, to one where they define a set of types. Type sets were compatible with existing semantics, while expanding their expressive power, much like what I'm arguing for with nil. Arguing that adding type sets would be bad because interfaces already had another meaning would have needlessly tied the hands of the designers from extending and enriching the language.

Re. Sum types, I still don't understand what you mean.

I can't add more without knowing what exactly doesn't make sense. You started with saying that nil has a special meaning of "pointing to something (or nothing)":

I think the difference between a zero value and nil is that nil comes with additional meaning. Colloquially, something that is nil points at nothing, uninitialized memory, invalid state etc.

I responded with two points:

What specifically didn't make sense?


@AndrewHarrisSPU:

A good answer on StackOverflow

Ah, I see what you mean: the non-nil, empty slice value.

I don't see how the non-nil, empty slice value for make(T) is useful. It could return the zero value and still be "empty."

no make(T) can be done for all types

I assume you're referring to the interface and function cases in your proposal?

Extending make, with those type exceptions, with the complications arising from nested pointers in structs, seems complicated. Those complications could be simplified by having make(pointer type) result in the zero value (which would just result in the zero value for structs as well), but that would still leave the type exceptions. I'm not sure what we're gaining for that extra complexity. The ability to initialize only some built-in types, and be a no-op for all other built-in types?

Perhaps a simpler approach would be a new constraint that makes the make "operation" available to matching type variables? This wouldn't change which types make works with, and it would enable make inside generic functions.
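For comparison, here is a sketch of what constraints can already express in Go 1.18 and later, as far as I can tell: when a type parameter is restricted to map types or to slice types, make is accepted inside the generic function. NewMap, NewSlice, Sparse, and Vector are hypothetical names for illustration only.

```go
package main

import "fmt"

// NewMap works for any type whose underlying type is a map.
func NewMap[M ~map[K]V, K comparable, V any]() M {
	return make(M)
}

// NewSlice works for any type whose underlying type is a slice.
// A length is required, just as with make on a concrete slice type.
func NewSlice[S ~[]E, E any](n int) S {
	return make(S, n)
}

// Hypothetical set types, loosely following the bitset example.
type Sparse map[int]bool
type Vector []bool

func main() {
	m := NewMap[Sparse]()
	m[3] = true // safe: the map is initialized, not nil
	v := NewSlice[Vector](4)
	fmt.Println(len(m), m != nil, len(v), v != nil) // 1 true 4 true
}
```

This does not give a single constraint covering maps, slices, and channels at once, since make takes different arguments for each, so it only partly addresses the idea.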

As I wrote above, this wouldn't match the general approach to initialization in Go, where types have an initialization method, or an associated constructor function. In those cases, either the generic code must make do with default values, or the user must pass in the value they want the generic code to work with. I don't see why built-in types shouldn't be consistent with that.

willfaught commented 2 years ago

@AndrewHarrisSPU I just realized that make(slice type) isn't currently permitted, so make would also have to be altered to permit it. I assume it's currently not permitted because it's useless.

beoran commented 2 years ago

@AndrewHarrisSPU

The nil keyword has a venerable tradition of being the zero value for pointers in several programming languages, especially in the Wirth languages: https://wiki.freepascal.org/Nil

The make() solution seems more Go like to me, but a zero() built in function might be even easier to read.

atdiar commented 2 years ago

@willfaught My issue is that it appears difficult to see a string type as a discriminated union where nil is the empty string because this is also a valid string. Hence Maybe(string) doesn't make much sense to me. Isn't the empty string a valid string?

I see nil as some kind of additional value for built-in types that have reference-like semantics, i.e. that have internal pointers that may be nil. (hence the point to nothing, uninitialized etc.).

This is not the case for a string, which always allocates. (semantics of an immutable byte array even if implemented otherwise)

Said otherwise, built-in types with value semantics are always initialized so cannot be nil.

willfaught commented 2 years ago

@atdiar:

My issue is that it appears difficult to see a string type as a discriminated union where nil is the empty string because this is also a valid string. Hence Maybe(string) doesn't make much sense to me. Isn't the empty string a valid string?

Pointers can be modeled as sum types, but strings don't necessarily have to be. For instance, gc implements strings as a struct of a pointer to memory and a length. It's basically a slice without a capacity, e.g. type string struct {p unsafe.Pointer; len int}. Both the nil string and the empty string (being the same) would just be the zero/default value for that implementation, e.g. struct{p unsafe.Pointer; len int}{}.

I see nil as some kind of additional value for built-in types that have reference-like semantics, i.e. that have internal pointers that may be nil. (hence the point to nothing, uninitialized etc.).

I think I addressed this point with my earlier paragraph about perspectives. Do you have a response to that?

Also, strings do contain pointers, including nil pointers, as I illustrated above in this comment.

Again, would you please explain precisely what you mean by "reference-like semantics" so we're not talking past each other?

This is not the case for a string, which always allocates. (semantics of an immutable byte array even if implemented otherwise)

What do you mean by strings always allocate? Slicing a string results in a new pointer and length, using constant space and time. Sure, when strings are being created or appended, strings are being allocated, but what's the connection to nil? Types like channels, maps, and slices are also allocated when created.

Said otherwise, built-in types with value semantics are always initialized so cannot be nil.

Again, I don't know what you mean by this, because all types have value semantics. What do you mean by non-value, or reference, semantics?

atdiar commented 2 years ago

What's the difference between a nil map and an empty map? What's the difference between a nil slice and an empty slice? Where do the internal pointers point in both cases?

For a string, where does the internal pointer point? Can this ever be nil?

I think the answer to these questions will help.

A string is always initialized in my understanding. A map, slice, pointer, interface etc. may not be.

atdiar commented 2 years ago

Reference semantics could simply mean that the type holds an aliasable reference to an internal datastructure.

For a slice, it's held in the pointer to the backing array for instance.

AndrewHarrisSPU commented 2 years ago

A string is always initialized in my understanding. A map, slice, pointer, interface etc. may not be.

I think this is correct. "A string may be empty, but not nil."
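A tiny sketch of how this looks in current Go, next to a slice for contrast. The helper name zeroString is made up for the example.

```go
package main

import "fmt"

// zeroString returns the value of an uninitialized string variable.
func zeroString() string {
	var s string
	return s
}

func main() {
	s := zeroString()
	fmt.Println(s == "", len(s)) // true 0
	// fmt.Println(s == nil) // does not compile: nil is not a string value

	// A slice, by contrast, can be compared to nil.
	var b []byte
	fmt.Println(b == nil, len(b)) // true 0
}
```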

willfaught commented 2 years ago

@atdiar:

What's the difference between a nil map and an empty map?

I'm assuming you're asking what the answers would be if nil were universal?

A nil map would be the value of an uninitialized map variable:

var m map[T]T

Nil maps would be empty. Not all empty maps would be nil, e.g.

var m = map[T]T{}

What's the difference between a nil slice and an empty slice?

Same difference. Nil slice:

var s []T

Non-nil, empty slice:

var s = []T{}
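The four cases can be checked with a short program, taking "empty" to mean len(x) == 0. The state type and the two helper functions are hypothetical names used only in this sketch.

```go
package main

import "fmt"

// state records whether a container is empty and whether it is nil.
type state struct{ empty, isNil bool }

// sliceStates describes a nil slice and a non-nil, empty slice.
func sliceStates() (nilSlice, emptySlice state) {
	var ns []int
	es := []int{}
	return state{len(ns) == 0, ns == nil}, state{len(es) == 0, es == nil}
}

// mapStates describes a nil map and a non-nil, empty map.
func mapStates() (nilMap, emptyMap state) {
	var nm map[string]int
	em := map[string]int{}
	return state{len(nm) == 0, nm == nil}, state{len(em) == 0, em == nil}
}

func main() {
	ns, es := sliceStates()
	nm, em := mapStates()
	fmt.Println(ns, es) // {true true} {true false}
	fmt.Println(nm, em) // {true true} {true false}

	// Reading from a nil map is fine; writing to one panics.
	var m map[string]int
	fmt.Println(m["x"]) // 0
	// m["x"] = 1 // would panic: assignment to entry in nil map
}
```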

"Empty" just means len(x) == 0.

Where do the internal pointers point in both cases?

An uninitialized map variable (*runtime.hmap) is currently (*runtime.hmap)(nil), if I understand correctly. An uninitialized slice variable (runtime.slice) is essentially runtime.slice{unsafe.Pointer(0), 0, 0} (pseudocode for the unsafe pointer, since a 0 literal won't compile for it), if I understand correctly (assuming no calling convention wackiness).

For a string, where does the internal pointer point? Can this ever be nil?

Either nowhere if it's a nil pointer, or a buffer. An uninitialized string variable already points to nowhere. A nil string variable would as well. Go look at runtime.stringStruct for yourself.

A string is always initialized in my understanding. A map, slice, pointer, interface etc. may not be.

True, but variables might not be initialized. The spec says uninitialized variables for map, slice, etc. types have nil values. See my quotations from the spec in other comments above.

Reference semantics could simply mean that the type holds an aliasable reference to an internal datastructure.

It could mean anything, which is why I asked. It seems by reference semantics, you mean types that contain a pointer, whether that pointer is exposed to the user or not.

For a slice, it's held in the pointer to the backing array for instance.

OK, so that's consistent with your definition, but so is string. In fact, from what I've heard, before shipping v1.0, map types used to not be implemented with pointers; they added it because people almost always used pointers with them. There's nothing "referenced" about Go types in the Go spec. For example, even strings aren't required to be backed by a pointer to a buffer, at least according to the string type section. A conforming Go implementation may copy and allocate for every operation on a string, including reading or copying, from what I understand. These categorizations you're bringing up don't seem to have any basis in the spec, and aren't consistent with how gc implements Go.

You seem to be bumping up against my argument in another comment above about perspectives, so I don't think it'll be productive for us two to further discuss this until you address that point.

atdiar commented 2 years ago

I have the impression that to you, Go is similar to Java. In Go, variables whose types have value semantics are never in an uninitialized state. That includes strings.

Note that a reference type is a computer science notion (that used to exist in Go's documentation). It can't mean just anything. string types have value type semantics (the emphasis on semantics was deliberate since the implementation can be confusing: e.g. you seem to have overlooked the rawstring internal function which allocates for every new string. Essentially, the internal pointer for a string is never nil).

Anyway, I could go on but the point is a bit moot if you had the impression that currently, every variable can be uninitialized.

Hope it helps.

Merovius commented 1 year ago

FWIW I'm running into the issue @jimmyfrasche mentioned above, namely that I have a generic type and want to compare it to its zero value:

type Sparse[T any] struct {
    m map[Pos]T
}

func (g *Sparse[T]) Set(p Pos, v T) T {
    // don't store zeros, that just wastes space
    if v == *new(T) { // compiler error: T is not comparable.
        delete(g.m, p)
    } else {
        g.m[p] = v
    }
}

Personally, I didn't think of this before (and didn't see it when it was brought up) and it causes me to re-evaluate my support for a proposal like this.

Though I also thought about the predeclared func iszero(T) bool solution, which would suit me fine.
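For reference, here is a sketch of two workarounds that are possible today. Neither is the proposed feature. Constraining T to comparable allows the comparison against a declared zero, and a reflection-based helper (named isZero here, a hypothetical stand-in for a predeclared iszero) works for any T at some runtime cost. Pos is assumed to be a simple struct.

```go
package main

import (
	"fmt"
	"reflect"
)

// Pos is an assumed coordinate type; the original does not define it.
type Pos struct{ X, Y int }

// Sparse stores only non-zero values. T must be comparable so that
// v == zero compiles.
type Sparse[T comparable] struct {
	m map[Pos]T
}

// Set stores v at p, or removes the entry when v is the zero value.
func (g *Sparse[T]) Set(p Pos, v T) {
	var zero T
	if v == zero {
		delete(g.m, p) // deleting from a nil map is a no-op
		return
	}
	if g.m == nil {
		g.m = make(map[Pos]T)
	}
	g.m[p] = v
}

// isZero reports whether v is the zero value of T, for any T,
// including types that are not comparable.
func isZero[T any](v T) bool {
	return reflect.ValueOf(&v).Elem().IsZero()
}

func main() {
	var g Sparse[int]
	g.Set(Pos{1, 2}, 5)
	fmt.Println(len(g.m)) // 1
	g.Set(Pos{1, 2}, 0)
	fmt.Println(len(g.m)) // 0

	fmt.Println(isZero([]int(nil)), isZero([]int{}), isZero(struct{ A []int }{})) // true false true
}
```

The comparable constraint rules out element types such as slices, which is exactly the limitation a universal zero or iszero would remove.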

earthboundkid commented 1 year ago

I strongly support adding zero. My only question is whether it makes sense to use this as an opportunity to also take on the nil interface vs. nil pointer confusion by also adding a new name for nil interface.

Merovius commented 1 year ago

I have no strong opinions about the spelling of the zero value or whether we add a predeclared identifier for the zero value or add an iszero function.

Though I do oppose spelling the zero value nil specifically. As was said elsewhere, I don't think adding more nils will help that confusion.

And I think any spelling of a universal zero value will likely make it slightly worse. Because I think v == zero (or whatever) should work for interfaces as well and if people think an interface with dynamic value nil should compare equal to nil, they'll think an interface with dynamic value of zero should compare equal to zero. So, if anything, that confusion will get more common. The same issue exists if the check is spelled if iszero(someInterface).

IMO, the ship on that has sailed and if anything, we should try and figure out how to clarify that you never want to compare the dynamic value of an interface without knowing its type.
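The confusion in question, shown as a minimal sketch. MyErr, typedNil, and untypedNil are hypothetical names for the example.

```go
package main

import "fmt"

// MyErr is a hypothetical error type with a pointer receiver.
type MyErr struct{}

func (*MyErr) Error() string { return "my error" }

// typedNil returns an error interface whose dynamic type is *MyErr and
// whose dynamic value is a nil pointer.
func typedNil() error {
	var e *MyErr
	return e
}

// untypedNil returns a nil interface: no dynamic type, no dynamic value.
func untypedNil() error {
	return nil
}

func main() {
	fmt.Println(typedNil() == nil)   // false: the interface holds a type
	fmt.Println(untypedNil() == nil) // true

	// Comparing the dynamic value only makes sense once its type is known.
	if e, ok := typedNil().(*MyErr); ok {
		fmt.Println(e == nil) // true
	}
}
```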

mpx commented 1 year ago

nil is the zero value for pointers and interfaces. Adding a separate universal zero would add duplication. Eg, the following would be equivalent:

Accepting this duplication would also be confusing (perhaps even more). It would only be a convention to use 1 over another (something for style guides and linters to opine about 😬 ). Historically, Go has avoided introducing multiple nearly identical concepts.

Properly understanding that nil is the zero value for an interface (and other types) would give people a better mental model.

I find extending nil a little disconcerting, but I suspect over time I would just get used to it and it would no longer be weird - just convenient and useful.

[I was originally in favour of something short like _, but now I'm more concerned with the permanent confusion caused by duplication]

earthboundkid commented 1 year ago

Properly understanding that nil is the zero value for an interface (and other types) would give people a better mental model.

To clarify, are you proposing making nil the zero value for other types or just leaving the problem as is?

It would only be a convention to use 1 over another (something for style guides and linters to opine about 😬 ).

I agree that this is my main reservation about adding zero. I think it won't be a problem in practice though because it's very clear you should use the more specific value in most cases. zero is really just for generics and return values. If you're worried about it, someone could write a linter to fix it and add that to gopls.