golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.28k stars 17.7k forks source link

cmd/compile: duplicate compilation of methods of generic types causes extraordinarily slow build times and high RAM usage #70511

Open arvidfm opened 4 days ago

arvidfm commented 4 days ago

Go version

go version go1.23.3 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/[...]/go-build'
GOENV='/home/[...]/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/[...]/go/pkg/mod'
GONOPROXY='[...]'
GONOSUMDB='[...]'
GOOS='linux'
GOPATH='/home/[...]/go'
GOPRIVATE='[...]'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/[...]/sdk/go1.23.3'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/[...]/sdk/go1.23.3/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.3'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/[...]/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/[...]/mwe_slow_generics/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build89512579=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Compile the following code with go build -a -gcflags 'play.ground/...=-d=ssa/trim/time':

package main

import "play.ground/baz"

func main() {
    baz.Baz{}.BazFunc()
}
-- go.mod --
module play.ground
-- foo/foo.go --
package foo

type Foo[A, B any] struct{}

func (Foo[A, B]) FooFunc() {}
-- bar/bar.go --
package bar

import "play.ground/foo"

type Bar struct{}

func (Bar) BarFunc()                     {}
func (Bar) BarFooFunc(foo.Foo[Bar, int]) {}
-- baz/baz.go --
package baz

import "play.ground/bar"

type Baz struct{}

func (Baz) BazFunc()           {}
func (Baz) BazBarFunc(bar.Bar) {}

I also have a more complex MWE showing how e.g. embedding can exacerbate the issue.

What did you see happen?

All methods of Foo[Bar, int] (i.e.FooFunc and its autogenerated wrapper methods) get instantiated/recompiled in both bar, baz and main, even though they were already compiled in bar, and neither baz or main directly reference Foo, nor do they call methods that make use of the type:

# play.ground/bar
<autogenerated>:1:      trim    TIME(ns)        112     (*Foo[play.ground/bar.Bar,int]).FooFunc
<autogenerated>:1:      trim    TIME(ns)        105     (*Bar).BarFooFunc
<autogenerated>:1:      trim    TIME(ns)        95      (*Foo[go.shape.struct {},go.shape.int]).FooFunc
<autogenerated>:1:      trim    TIME(ns)        99      (*Bar).BarFunc
bar/bar.go:7:43:        trim    TIME(ns)        86      Bar.BarFunc
bar/bar.go:8:43:        trim    TIME(ns)        87      Bar.BarFooFunc
./foo/foo.go:5:6:       trim    TIME(ns)        102     Foo[play.ground/bar.Bar,int].FooFunc
./foo/foo.go:5:29:      trim    TIME(ns)        128     Foo[go.shape.struct {},go.shape.int].FooFunc
# play.ground/baz
<autogenerated>:1:      trim    TIME(ns)        128     (*Baz).BazBarFunc
<autogenerated>:1:      trim    TIME(ns)        35      (*Foo[go.shape.struct {},go.shape.int]).FooFunc
<autogenerated>:1:      trim    TIME(ns)        49      (*Baz).BazFunc
baz/baz.go:7:33:        trim    TIME(ns)        55      Baz.BazFunc
baz/baz.go:8:33:        trim    TIME(ns)        28      Baz.BazBarFunc
./foo/foo.go:5:6:       trim    TIME(ns)        64      Foo[play.ground/bar.Bar,int].FooFunc
./foo/foo.go:5:29:      trim    TIME(ns)        27      Foo[go.shape.struct {},go.shape.int].FooFunc
# play.ground
<autogenerated>:1:      trim    TIME(ns)        122     (*Foo[go.shape.struct {},go.shape.int]).FooFunc
./main.go:9:1:  trim    TIME(ns)        58      main
./foo/foo.go:5:6:       trim    TIME(ns)        58      Foo[play.ground/bar.Bar,int].FooFunc
./foo/foo.go:5:29:      trim    TIME(ns)        25      Foo[go.shape.struct {},go.shape.int].FooFunc

As a result of this, after the introduction of generics to our codebase, we are seeing build times up from a few seconds to over four minutes on clean builds or after changes to core code, with unit tests that previously took a few minutes sometimes taking up to an hour to compile and run. The way the generic types "infect" any packages that indirectly depend on them means that adding a single new method to a generic type results in an O(nm) build time increase for n instantiations of the type and m packages that indirectly depend on them, which can easily be roughly quadratic - even if the method is never actually called!

In our case some methods are getting recompiled 40-50 times across as many packages for the same type parameters. As an example, for one method I count 28,790 invocations of ssa.Compile, but only 4,434 unique symbol names. (Note that these numbers also include associated autogenerated wrapper functions that seem to comprise about 90% of the symbols, inflated due to struct and interface embedding; there are only about 300-400 unique instantiations of the type that owns the methods.)

If it's helpful, I've also collected profiling data from running the compiler on a few packages that are particularly slow to compile:

In general, it seems that we are hitting a pathological case, where a number of factors contribute to how many functions (and autogenerated wrappers) are generated for each instantiation, and thus how severely we are hit by this issue:

What did you expect to see?

Foo[Bar, int] to only be instantiated once, presumably in the bar package:

# play.ground/bar
<autogenerated>:1:      trim    TIME(ns)        150     (*Foo[play.ground/bar.Bar,int]).FooFunc
<autogenerated>:1:      trim    TIME(ns)        73      (*Bar).BarFooFunc
<autogenerated>:1:      trim    TIME(ns)        60      (*Foo[go.shape.struct {},go.shape.int]).FooFunc
<autogenerated>:1:      trim    TIME(ns)        39      (*Bar).BarFunc
bar/bar.go:7:43:        trim    TIME(ns)        51      Bar.BarFunc
bar/bar.go:8:43:        trim    TIME(ns)        42      Bar.BarFooFunc
./foo/foo.go:5:6:       trim    TIME(ns)        63      Foo[play.ground/bar.Bar,int].FooFunc
./foo/foo.go:5:29:      trim    TIME(ns)        28      Foo[go.shape.struct {},go.shape.int].FooFunc
# play.ground/baz
<autogenerated>:1:      trim    TIME(ns)        133     (*Baz).BazBarFunc
<autogenerated>:1:      trim    TIME(ns)        60      (*Baz).BazFunc
baz/baz.go:7:33:        trim    TIME(ns)        57      Baz.BazFunc
baz/baz.go:8:33:        trim    TIME(ns)        27      Baz.BazBarFunc
# play.ground
./main.go:9:1:  trim    TIME(ns)        55      main
gabyhelp commented 4 days ago

Related Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

arvidfm commented 4 days ago

Note that this issue is related to #56718, but I figured that there was value in having an issue framed from a user's point of view, to show what the actual consequences of the issue is. (I also wanted somewhere to describe my findings.)

It may be the same as #65605, but that issue is less specific. Perhaps this issue can supersede it.

See also #50438, which this issue exacerbates.

timothy-king commented 4 days ago

Thank you for the detailed report. I suspect this is a manifestation of #56718.