golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.24k stars 17.7k forks source link

proposal: go/types: export API for unification #63982

Open dominikh opened 1 year ago

dominikh commented 1 year ago

go/types should offer API for type unification, to allow clients to unify types that the type-checker hasn't already unified while type-checking code.

Staticcheck has a use for this as part of the unused check, to see if types implement possible instantiations of generic interfaces. For example, given

type Iface[T any] interface {
    foo() []T
    bar() T
}

type T1 struct{}
func (T1) foo() []int
func (T1) bar() int

type T2 struct{}
func (T2) foo() []int
func (T2) bar() string

we want to find out that T1 implements Iface[int] and that T2 doesn't implement any possible instantiation of Iface. For that, we would try to unify T1.foo and T1.bar with Iface.foo and Iface.bar, under a shared mapping of type parameters, and similarly do the same for T2.

As such, the API should allow unifying multiple pairs of types using the same mapping of type parameters to types, report success/failure, and allow querying the mapping.

/cc @findleyr

findleyr commented 1 year ago

CC @griesemer @adonovan @mdempsky

I think Robert and I had discussed this, at some point. This would be a rather invasive API. An initial, naive thought: I wonder how much of this can currently be achieved with e.g. types.CheckExpr?

I feel like over the past 6 months we've run into 3-4 different new APIs ideas that could almost (but not quite) be implemented with CheckExpr (most recently during the inliner work). Perhaps we should be instead working on a new CheckExpr API that is more useful and reliable. For example: one thing we've wanted is to be able to supply a synthetic scope.

griesemer commented 1 year ago

I think the API here could be made relatively clean if it simply takes two types, a map of type parameters to type arguments, and then unifies for type identity. Such an API would ignore type constraints or assignability, nor would it do any renaming.

findleyr commented 1 year ago

@griesemer interesting. I guess the question is whether this is an interface for unification, or inference. (nobody suggested an API for inference, but that is what I was thinking about).

Unification is relatively clean, but inference is not. Is a unification API sufficient? It is for the use-case suggested by @dominikh.

findleyr commented 7 months ago

This came up during the tools call. If we can come up with a natural API that allows for future extension, I think we could add this for 1.23 or 1.24.

For discussion, here's are two possible shapes for the API:

  1. A Unifier that holds inferred arguments and (potentially) other configuration
    
    type Unifier struct {
    TypeMap map[*TypeParam]Type
    }

func (u *Unifier) Unify(x, y Type) bool


2. A standalone Unify function

func Unify(x, y Type, typeMap map[*TypeParam]Type) bool



I think I'd prefer (1), as it makes it easier to perform multiple unifications in the same context, and allows for more future extension. For example, we may want to add a field on Unifier to be used for reporting more detailed errors.

CC @adonovan 
timothy-king commented 7 months ago

IIUC the proposed use case of Unify (PR) is that Unify(...) == false proves that a callsite is not a specific callee (basically CHA). I don't think it necessarily follows that Unify(...) == true implies that there there exists some instantiation of the type parameters such that this call is possible.

A good minimum bar to clear is "would we want such call edges in codesearch?" Unify might clear that bar.

it makes it easier to perform multiple unifications in the same context

It would be helpful to have other candidate usages. The current candidate usage works on a pair of types at a time: can I be a T? This would not need to share work if one asks at the right granularity. It also probably does not use typeMap.

Would it be more helpful to give an error explaining why Unify failed if it does? Failures will be hard to explain outside of the type checker. Maybe explaining why a function cannot be called would be a helpful feature for an IDE?

dominikh commented 7 months ago

Just to make sure we're on the same page:

proves that a callsite is not a specific callee (basically CHA).

My use case (which is indeed related to the PR you've linked to) is similar to CHA, but operates entirely on types and doesn't depend on individual call sites. It aims to very conservatively answer whether methods of a given type can possibly contribute to an interface being implemented, while looking at a single package in isolation.

What follows is a more explicit example and explanation, feel free to skip it if we're on the same page.

To be more explicit: Given the code from my original comment, I would call Unify twice, once with the types of Iface.foo and T1.foo and once with the types of Iface.bar and T1.bar, using a shared type map, to test whether T1 is an implementation of any instantiation of Iface. Both calls to Unify returning true would tell me that T1 is an implementation of an instantiation of Iface and the mapping would tell me that the instantiation of Iface is Iface[int]. Doing the same with T2 instead of T1 would fail to unify, telling me that T2 is not an implementation of any instantiation of Iface.

This information is important to me because of this extended example:

type Iface[T any] interface {
    foo() []T
    bar() T
}

type T1 struct{}

func (T1) foo() []int
func (T1) bar() int

type T2 struct{}

func (T2) foo() []int
func (T2) bar() string

func Doer[T any](x Iface[T]) {
    x.foo()
}

The unused analysis in Staticcheck needs to determine that T1's methods could be relevant if the user of this package calls Doer[int](T1{}), but the same isn't possible for T2.

It is the generic variant of this case, which unused already handles:

type Iface interface {
    foo() []int
    bar() int
}

type T1 struct{}

func (T1) foo() []int { return nil }
func (T1) bar() int   { return 0 }

type T2 struct{}

func (T2) foo() []int  { return nil }
func (T2) bar() string { return "" }

func Doer[T any](x Iface) {
    x.foo()
}

Currently, for the non-generic version, Staticcheck reports

foo.go:15:11: func T2.foo is unused (U1000)
foo.go:16:11: func T2.bar is unused (U1000)

while for the generic version it incorrectly reports

foo.go:10:11: func T1.foo is unused (U1000)
foo.go:11:11: func T1.bar is unused (U1000)
foo.go:15:11: func T2.foo is unused (U1000)
foo.go:16:11: func T2.bar is unused (U1000)

The key difference between unused's approach and that of callgraph/*, as far as I understand, is that unused can operate on individual packages in isolation, at the cost of being even more conservative.

(We also consider Iface.bar to be used, despite not ever being called, because it might be a marker method to further restrict the set of types that implement the interface, a la an isExpr() method.)

timothy-king commented 7 months ago

I think we are on the same page. AFAIU the key difference with CHA and usage is mostly the set of type pairs (T,I) considered for implements(T, I). Not the implements predicate. (Though usage may be more conservative with implements in the presence of buildtags. I'm not sure.)

jacobzim-stl commented 1 day ago

I've been implementing type inference for generics in gopls. Because no public API was available I had to vendor the internal unification library. I'm leaving a comment here as a future reminder to delete gopls/internal/golang/completion/unify.go when this goes live.

It should only require updating a couple lines in gopls/internal/golang/completion/completion.go when finished.