golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.07k stars 17.55k forks source link

cmd/compile: avoid always escaping the receivers in interface method calls #62653

Open thepudds opened 1 year ago

thepudds commented 1 year ago

Background

Currently, interface arguments to functions frequently escape due to subsequent use of the interface in an interface method call.

This affects many APIs, including things like marshalling/unmarshalling APIs, logging APIs, things like fmt.Sprintf, and many other flavors of APIs that take interfaces.

For example, in Go 1.21, the val here escapes and is heap allocated for this reason (and other for reasons as well):

 val := 1000
 fmt.Sprintf("%d", val)

If we consider an extremely simplified implementation of Sprintf:

  func Print(input any) {
    if v, ok := input.(Stringer); ok {
       println(v.String())
    }
  }

When v.String is called as an interface method in that example, v might contain a type like Leaking:

var global any
type Leaking struct {a, b int}

 // The receiver l cannot be stack allocated because String leaks l to a global variable.
func (l *Leaking) String() string { global = l; return "" } 

The current compiler knows this is possible, and as a result, the input interface argument to our simple Print function is conservatively marked as escaping. This is part of the reason the real fmt.Sprintf causes val to be allocated above.

Frequently, though, v contains a type like Nice:

type Nice struct {a, b int}

// The receiver n could in theory be stack allocated in n := Nice{}; Print(n)
func (n Nice) String() string { return "something" } 

Suggestion

The suggestion is for escape analysis to propagate what it knows about the use of interfaces in method calls to then prove (when it can) whether it is dealing with a type like Leaking or Nice to then avoid allocating the n in n := Nice{}; Print(n).

In particular, escape analysis has had the concept of pseudo locations for things like the heap, and the escape analysis data-flow graph in the current compiler contains an edge leading to a heap pseudo location for each interface method call observed.

We could instead track the interface method use flows separately from other flows to the heap, including propagating across function call boundaries by tagging the associated function & method parameters. When a concrete value is later converted to an interface type (e.g., Print(n) in our simple example above), we look at what we know about the type and its methods to see if we can prove that it is incapable of leaking if used as a method receiver in an interface method call.

I sent CL 524945 with implementation of the idea, along with some related CLs like 524937 and 524944.

Those are a part of a larger stack targeting the fmt print functions. By the end of my stack (as of CL 528538), this no longer allocates the Point struct on the heap:

  type Point struct {x, y int}
  p := Point{1, 2}
  fmt.Sprintf("%v", p)

The primary CL 524945 is still WIP, but at least passes all.bash and the TryBots, including passing the more specific interface receiver tests I added. (It passes the older TryBots, but currently fails the new LUCI TryBots for LUCI-specific reasons). The first cut does not differentiate between a type that only has some methods with leaking receivers vs. the specific method in question, but CL 528539 is also a small refinement I had also been thinking about that attempts to address that. (That is not needed for many cases, and that CL is currently in much rougher shape & more exploratory).

In the initial discussion in Gerrit on CL 524945, Matthew Dempsky said he liked the idea. He also suggested opening an issue to help discussion, which is now this issue.

Matthew also later sent CL 526520 with an alternative implementation of the core idea.

CC @mdempsky, @ianlancetaylor

heschi commented 1 year ago

cc @golang/compiler

aclements commented 1 year ago

Nice work!

Also cc @dr2chase , who looked into escape analysis for receivers in interface calls a few years back.

dr2chase commented 1 year ago

Nice, I hadn't thought of the "does this type have ANY escaping methods" approach, d'oh. And yes for printing it is also necessary to tackle reflection; that was where my experiment left the rails and I decided (treating reflection as an intractable black box) that this was not going to be easy to solve. So, excellent, good work, thank you!

thepudds commented 1 year ago

Hi @aclements and @dr2chase thanks for the feedback!

And yes for printing it is also necessary to tackle reflection

FWIW, this issue here I think might be 1 of ~6 reasons the fmt print arguments allocate. Yesterday, I tried to document the ~6 reasons in https://github.com/golang/go/issues/8618#issuecomment-1720049443.

thepudds commented 10 months ago

FWIW, as part of working on this, I wanted to see if it would be reasonable to expand some of the ways escape analysis is tested in the hopes of allowing changes to be made to it with more confidence.

One piece is trying to actually execute almost all the escape analysis tests that live in the top-level 'test' directory, which is a treasure trove of escape analysis corner cases that today are mostly validated by comparing log output and not executed today. (Sent CL 543555, and I have some follow-on WIP pieces not yet sent).

Another piece that is intended to dovetail with that is trying to expand the dynamic checks that the runtime does to potentially better detect mistakes in escape analysis. (For example, sent CL 543935 and CL 543936, with 543935 mostly an updated version of @cherrymui's earlier unmerged CL 408827; 543936 is still exploratory).

One question I have is if it would be reasonable to define that GODEBUG=invalidptr=2 means additional checks, while the default remains today's checks with a default value of invalidptr=1? Is that something that could be considered an "internal" change, vs. is that something that would need to be publicly documented and/or go through a proposal review? (For cmd/compile, it seems somewhat common to add debug flags that enable extra checks or disable something, but I'm less sure of the protocol for the runtime).

cherrymui commented 10 months ago

With GODEBUG=invalidptr=2, I think we can enable it by default (assuming the run-time overhead is negligible). That gives users a chance to temporarily turn it back to invalidptr=1 to unblock updating Go toolchain and to debug.