cmd/compile: some closures can be optimized to be allocated on stack rather heap

choleraehyq commented 3 years ago

What version of Go are you using (`go version`)?

$ go version
go version go1.15.6 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/cholerae/Library/Caches/go-build"
GOENV="/Users/cholerae/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/cholerae/Documents/gopath/pkg/mod"
GONOPROXY="code.byted.org,git.byted.org"
GONOSUMDB="code.byted.org,git.byted.org"
GOOS="darwin"
GOPATH="/Users/cholerae/Documents/gopath"
GOPRIVATE="code.byted.org,git.byted.org"
GOPROXY="https://goproxy.cn,direct"
GOROOT="/Users/cholerae/Documents/gopath/go"
GOSUMDB="sum.golang.google.cn"
GOTMPDIR=""
GOTOOLDIR="/Users/cholerae/Documents/gopath/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/cholerae/Documents/gopath/go1.15.6/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/lg/ld5t5rss459241qtzmqfp0h80000gn/T/go-build623970395=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

https://play.golang.org/p/cbvFc7e9MbI

What did you expect to see?

functions Test1 and Test2 do totally the same thing, one with "functor" and the other with closure. They can be both zero-allocated.

What did you see instead?

Test2 will always allocate the closure on heap.

randall77 commented 3 years ago

The problem is that RunWithClosure needs to return a closure. It can't allocate it on the stack because its stack frame goes away before the function is executed. We do allocate closures on the stack when we can.

Inlining may fix this particular example, that's #28727 (and #10292). Did you run into this in real code? Is the body of RunWithClosure in the real example small enough to be inlineable?

choleraehyq commented 3 years ago

@randall77 Yes, I run into this in real code, which is very similar to RunWithClosure, its body is small enough. In the real code, RunWithClosure will return a reuseable object and a closure together, to let the caller of RunWithClosure handle the lifetime of the object and put it into the pool to reuse it. Currently, I rewrite RunWithClosure in real code to form like RunWithFunc by hand to prevent allocations.

choleraehyq commented 3 years ago

@randall77 Currently, the compiler will rewrite a closure like that to an anonymous struct, which has two fields. I think the reason why we need to allocate that struct on heap is when the compiler meets a function call like cleanup(), the compiler doesn't know whether cleanup is a function literal or a closure call or a method or something. Thus, it can't be simply rewritten to a method call as I do in RunWithFunc. Don't know if I have a wrong understanding.

randall77 commented 3 years ago

I don't think this really has to do with the call to cleanup. We can and do allocate closures (those two field anonymous structs) on the stack and pass pointers to them to called functions. The issue here is that that struct is allocated inside RunWithClosure, and RunWithClosure's frame is gone by the time we call cleanup. So we can't allocate the closure in RunWithClosure's frame. It has to be allocated on the heap.

If we inline RunWithClosure into its caller, then we can allocate the closure in the caller, because that stack frame lives long enough.

the compiler doesn't know whether cleanup is a function literal or a closure call or a method or something

The compiler does know that it is a closure call.

Yes, I run into this in real code, which is very similar to RunWithClosure, its body is small enough.

So should we close this as a dup of #28727 ?

choleraehyq commented 3 years ago

I mean we cannot return a copy of closures (those two field anonymous structs) rather than a pointer, is because the compiler cannot figure out whether it's a normal function pointer or a copied closure struct and handle it specially. But I fully understand your meaning. I'll close this issue.

randall77 commented 3 years ago

Indeed. Furthermore, we can't return closures by value because they aren't constant sized. The caller has no way to know how big the closure allocated by the callee was, so it can't allocate space in its stack frame for it.

func f(x, y int) func() int {
    if ... {
        return func() int { return x }
    }
    return func() int { return x + y }
}
func g() int {
    x := f(3, 5)
    return x()
}

If f returned its closure by value, how would g know how big it is?

The former closure is 2 words in size (function entry point, the value of x), the latter is 3 words in size (function entry point, the values of x and y).

(I guess we could do interprocedural analysis to figure out when it is constant sized, and pass by value in that case. But even then it is tricky because the caller needs to know the pointerness of each entry.)

golang / go