golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.48k stars 17.59k forks source link

proposal: spec: add Nullable constraint #53656

Open MrTravisB opened 2 years ago

MrTravisB commented 2 years ago
func foo[T any](param T) {
    if param == nil {
        // do a thing
    }
}

This is invalid

invalid operation: cannot compare param == nil (mismatched types T and untyped nil)

Would be nice to have a nullable constraint so that the following could be done

func foo[T Nullable](param T) {
    if param == nil {
        // do a thing
    }
}
seankhliao commented 2 years ago

where would this be useful?

randall77 commented 2 years ago

Would this work for you?

func foo[T comparable](param T) {
        var x T
    if param == x {
        // do a thing
    }
}
earthboundkid commented 2 years ago

Nilable is a bad constraint because nil is overly broad. Forgive me for not searching for issue numbers, but there's already a long issue about splitting nil into nilinterface vs nilptr vs nilchan, nilslice, and nilmap. Right now there's no constraint that can guarantee that T is an interface. That would be more helpful.

More generally, the issue to add a built in zero should be adopted. You can compare functions to nil but not each other.

For now you can work around this by writing a func something[T any](t T, iszero func(T) bool). Most constraints can be replace with a callback that does whatever the constraint is meant to enforce.

Jorropo commented 2 years ago

Would this work for you?

func foo[T comparable](param T) {
        var x T
  if param == x {
      // do a thing
  }
}

I have a case where no, in my case T is an interface, it is not comparable then, but it is nil-able. (if I try to use this I obviously get <symbol> does not implement comparable)

earthboundkid commented 2 years ago

There is also an issue to make interfaces comparable. https://github.com/golang/go/issues/52624 We got issues for everything. :-)

Here is adding more specific nils: #22729

Here is adding a universal zero: https://github.com/golang/go/issues/53666 Edit, that's the dup. Original is #35966.

Again, I think most constraints can be replaced with callbacks, but even without doing that a Nilable constraint is too broad and an Interface constraint would be more useful (although it really confuses the question of what a constraint is, since interfaces can't contain interfaces).

atdiar commented 2 years ago

Maybe the issue is that every type should be comparable to its zero value? (whether they satisfy comparable or not) In which case it does not need an additional specific constraint but it requires a notation for the zero value?

Jorropo commented 2 years ago

@carlmjohnson

Again, I think most constraints can be replaced with callbacks

But but performance ? (replacing a single compare instruction by a virtual call seems silly)

an Interface constraint would be more useful

:+1: Is there an issue for the virtual constraint yet ?

Jorropo commented 2 years ago

@atdiar

In which case it does not need an additional specific constraint but it requires a notation for the zero value?

I think you are asking for #53666.

atdiar commented 2 years ago

@atdiar

In which case it does not need an additional specific constraint but it requires a notation for the zero value?

I think you are asking for #53666.

Yes, kind of. The idea is there but the zero value of a string is not nil for example. So I wouldn't call it that way.

Since every type is comparable to its zero if I'm not mistaken, generic code would not need a constraint for if T == zero(T) {...}

Jorropo commented 2 years ago

if T == zero(T) {...}

I don't think this syntax is obvious enough, I think a:

func isZero[T any](v T) bool

builtin would be better

(It would not be a real function, the compiler would replace thoses calls with compares to 0 of the right size.)

PS: it would be different from reflect.ValueOf(v).IsZero() because it would target the passed type, not the underlying type. In other words this:

emptyString := ""

var interfaceContainingTheEmtpyString any = emptyString

fmt.Println(isZero(interfaceContainingTheEmtpyString)) // false, because it isn't a nil interface
fmt.Println(reflect.ValueOf(interfaceContainingTheEmtpyString).IsZero()) // true, because reflect use the assertion rules and look at the underlying type, and "" is the zero value of the empty string.
atdiar commented 2 years ago

Well, we need a zero(T) anyway since we need to be able to return the zero value in generic functions that require it.

Jorropo commented 2 years ago

Fair. However we can already do this using @randall77 suggestion:

var zero T
return zero

You can also use named return values and not assign to them, ...

earthboundkid commented 2 years ago

func isZero[T any](v T) bool has to be written this way (notice the pointer and Elem()), so that it works with interface types:

func isZero[T any](v T) bool {
   return reflect.ValueOf(&v).Elem().IsZero()
}

I've used this code in some libraries. isZero("") is significantly slower than s == "", from ~1ns/op to ~20ns/op, plus it can escape to the heap. I would prefer if https://github.com/golang/go/issues/35966 were accepted instead.

Jorropo commented 2 years ago

I've used this code in some libraries. isZero("") is significantly slower than s == "", from ~1ns/op to ~20ns/op, plus it can escape to the heap.

This is because the compiler doesn't know how to do constant folding / rewrites on reflection even if the types are known at compile time. I was assuming we would implement that if this was the choosen solution. (which would be a nice thing anyway because we could make compile time known reflection (like from inlined functions) faster)

atdiar commented 2 years ago

Fair. However we can already do this using @randall77 suggestion:

var zero T
return zero

You can also use named return values and not assign to them, ...

Indeed. The only issue is in the interaction of the two issues (zero value notation and zero value comparison).

In @randall77 suggestion, T satisfies comparable.

But it should be possible to compare to the zero value without this constraint. If we decide to do it with var zero T, the compiler needs to make sure that zero hasn't changed.

ianlancetaylor commented 2 years ago

@MrTravisB It would help a lot to see cases where this is actually useful. In order to find the most useful solution, it helps to know what the real problem is. For example, right now it's possible to write a type constraint that covers a set of types that can be compared to nil, but it's hard to actually do anything with that set of types. It's not clear why we should focus on comparisons to nil, when you still won't be able to do anything else with the type parameter. Or, if you can do something else, what is it that you can do? Thanks.

AndrewHarrisSPU commented 2 years ago

Fair. However we can already do this using @randall77 suggestion:

var zero T
return zero

You can also use named return values and not assign to them, ...

Indeed. The only issue is in the interaction of the two issues (zero value notation and zero value comparison).

In @randall77 suggestion, T satisfies comparable.

But it should be possible to compare to the zero value without this constraint. If we decide to do it with var zero T, the compiler needs to make sure that zero hasn't changed.

Is the instance I have in a variable well-initialized enough to reliably support the method set of its type without crashing? I feel like 'nil' coincides with 'no, don't use for any computation'. Does some built-in 'zero' mean 'yes, it's empty, but it's also well-initialized'? That's my (possible mis-) reading of discussions so far. I worry that with var zero T, it's neither a clear yes nor a clear no.

Jorropo commented 2 years ago

I was writing some lazy initializing generic code. It was a function similar to this:

type Lazy[T any] struct {
    v T
}

func (l *Lazy[T]) Get(create func() (T, error)) (r T, err error) {
    r = l.v
    if r == nil {
        r, err = create()
        l.v = r
    }
    return
}

The actual code is threadsafe using atomics and a mutex to not race the creations attempts so it actually made sense to package in a function. I actually resorted to using *T since we sadly can't atomically store or load interfaces.

ianlancetaylor commented 2 years ago

@Jorropo Thanks. Still, your code seems to assume that the zero value of the type is not a valid value; if you can capture create in a function, why not capture valid in a function? Also that particular code could be written more correctly and safely using sync.Once. So it's helpful but not entirely convincing.

earthboundkid commented 2 years ago

Moving off the original topic, but I previously wrote a generic lazy initializer using sync.Once. Code is here: https://github.com/carlmjohnson/syncx/blob/main/once.go

Jorropo commented 2 years ago

Still, your code seems to assume that the zero value of the type is not a valid value

Yes, in my case it's true, any of my consumers would panic nil deref if they tried to use a zero value.

why not capture valid in a function?

True I could. It feels silly to replace a single compare with a virtual call. (which isn't enough to justify such proposal but I guess this is one of many usecases)

Also that particular code could be written more correctly and safely using sync.Once. So it's helpful but not entirely convincing.

There is no way to reset a sync.Once if it failed. (actually a reset isn't even what I need here, this would be racy, I would need a sync.Once.Do(*sync.Once, func() bool) which doesn't complete and retry a new execution if false is returned) In my case the creator returns a nil value if the initialization failed, allowing you to do an other attempt later. A more correct version of this code would be:

type Lazy[T nilable] struct {
    v T
}

func (l *Lazy[T]) Get(create func() (T, error)) (r T, err error) {
    r = l.v
    if r == nil {
        r, err = create()
        if err != nil {
            return nil, err
        }
        l.v = r
    }
    return
}
MrTravisB commented 2 years ago

My use case was very similar to @Jorropo in that I was trying to lazy set a generic value that would either be a function or an interface value. Checking it for a zero value isn't the only issue though. Being able to reset to zero is also an issue.

type Lazy[T nilable] struct {
    v T
}

func (l *Lazy[T]) Reset() {
    l.v = nil
}
andig commented 2 years ago

I‘m still confused as to why the original code is invalid. If T can be any type it may very well be a nil interface? If thats possible why shouldn‘t it be comparable with nil?

ianlancetaylor commented 2 years ago

@andig In the approach we've taken for Go generics you can only perform operations that are permitted for all possible type arguments, not operations that are permitted for a subset of the possible type arguments.

atdiar commented 2 years ago

Fair. However we can already do this using @randall77 suggestion:

var zero T
return zero

You can also use named return values and not assign to them, ...

Indeed. The only issue is in the interaction of the two issues (zero value notation and zero value comparison). In @randall77 suggestion, T satisfies comparable. But it should be possible to compare to the zero value without this constraint. If we decide to do it with var zero T, the compiler needs to make sure that zero hasn't changed.

Is the instance I have in a variable well-initialized enough to reliably support the method set of its type without crashing? I feel like 'nil' coincides with 'no, don't use for any computation'. Does some built-in 'zero' mean 'yes, it's empty, but it's also well-initialized'? That's my (possible mis-) reading of discussions so far. I worry that with var zero T, it's neither a clear yes nor a clear no.

I didn't reply right away to let the main discussion continue but I think that it should be no. The zero value of a type shouldn't add semantics that are usually the pregorative of a constructor function, I believe.

AndrewHarrisSPU commented 2 years ago

I didn't reply right away to let the main discussion continue but I think that it should be no. The zero value of a type shouldn't add semantics that are usually the pregorative of a constructor function, I believe.

I agree a definition of a zero value is straight out of the spec. A constructor function is nowhere in the spec, and that's well-motivated. Instead there's a range of variant behaviors such that for some given T:

My feeling is that the ergonomics here were in a very sweet spot for pre-generics Go. The ergonomics when writing generic Go are less sweet. (Also, my sense is that there's a number of issues and discussions here and elsewhere that can be indirectly related)

Coarsely, I wonder if there'd be any value in distinguishing between "naive" and "enlightened" generic types and routines. For "naive" types, we'd assume an infallible, empty constructor (i.e., numerics, slices/maps/channels, structs exclusively composed thereof, etc.), and infallible (if allocating) construction with make() or new() is possible/preferred in code - there's no uniform way to get this now.

"Enlightened" types would be those that require particular methods or callbacks to do the equivalent thing. Significantly, however, there's no bound or general description on "enlightenment"; an enlightened T may require initialization beyond Duff's device, may require acquire/release semantics, or really infinite elaborations. "Enlightened" is a category, not a constraint.

If there were a way to say that T is constrained to be naive, type-checking could fail where an enlightened T is given. Naive T types can cover a lot of reasonable uses of generics, but enlightened types seem essential as well. Purely from an ergonomics standpoint, would it help clarify where and why some of the enlightened routines need constraints or callbacks? Or why there are a lot of wrinkles to think about with nils and zero values etc.?

atdiar commented 2 years ago

If I understand you well, the issue is that outside of generics, we somehow know whether the zero value is directly usable or not. For a type parameter however, we don't necessarily know; it may vary depending on the type argument?

Hmmh, I think you have a point. I hadn't thought about that.

cespare commented 1 year ago

Here's a simplified version of a real use case I just came across at work: https://go.dev/play/p/jLoFc0CAPdV.

earthboundkid commented 1 year ago

I have a helper library that can test if those functions are nil, but it uses reflect under the hood. I think it’s another argument for adding the universal zero built in.

ajzaff commented 1 year ago

One use case I came across in implementing a JSONConstraint:

Per json.Unmarshal documentation:

To unmarshal JSON into an interface value, Unmarshal stores one of these in
    the interface value:

        bool, for JSON booleans
        float64, for JSON numbers
        string, for JSON strings
        []interface{}, for JSON arrays
        map[string]interface{}, for JSON objects
        nil for JSON null

    ...

    The JSON null value unmarshals into an interface, map, pointer, or slice
    by setting that Go value to nil. Because null is often used in JSON to mean
    “not present,” unmarshaling a JSON null into any other Go type has no effect
    on the value and produces no error.

One could add a JSONConstraint like:

type JSONConstraint interface {
  bool | float64 | string | []any | map[string]any 
}

Which can be used like, for instance:

func NewJSONData[T JSONConstraint](x T) { ... }

But this leaves out null JSON value. Perhaps that's fine as when null equates to "key not present".

Another potential solution is adding a Null flag value to the constraint.

type Null any

type JSONConstraint interface {
  bool | float64 | string | []any | map[string]any | Null
}
gannett-ggreer commented 1 month ago

The JSON "key not present" case is similar to what we're having an issue with in using the https://github.com/graph-gophers/dataloader Loader interface. We can't detect a missing object by comparing with zero value because interface types don't have a zero value and we can't detect a missing object by comparing with nil because concrete types can't be that, and making everything a pointer internally means pointer-to-interface or pointer-to-pointer fun. We can't change the API because it isn't ours to change and also because the return value we use everywhere is a slice of the type values. We'd rather support interface types than value types so having a way to say nil is a valid value for the generic type with constraint (interface or pointer) would be useful.