golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/compile: profile-guided optimization #55022

Closed cherrymui closed 1 year ago

cherrymui commented 2 years ago

We propose adding support for profile-guided optimization (PGO) to the Go gc toolchain. PGO will enable the toolchain to perform application- and workload-specific optimizations based on run-time information. Unlike many compiler optimizations, PGO requires user involvement to collect profiles and feed them back into the build process. Hence, we propose a design that centers user experience and ease of deployment and fits naturally into the broader Go build ecosystem.

Detailed design can be found here.

In summary, we propose

Input welcome. Beyond input on the general approach, we're particularly looking for input on whether PGO should be enabled by default in 1.20, and on flag and file naming conventions.

If accepted, we plan to implement a preview of PGO in Go 1.20. Raj Barik and Jin Lin plan to contribute their work on the compiler implementation (initial prototype in CL 429863, which does not yet follow the proposed API).

See also previous discussions at #28262. Filing a new issue to make it clearer what we are proposing. Thanks.

cc @aclements @prattmic @rajbarik

gopherbot commented 1 year ago

Change https://go.dev/cl/444557 mentions this issue: sweet: add mode for PGO benchmarking

gopherbot commented 1 year ago

Change https://go.dev/cl/444895 mentions this issue: sweet: rebuild cmd/compile and cmd/link for go-build benchmark

gopherbot commented 1 year ago

Change https://go.dev/cl/446300 mentions this issue: cmd/compile/internal/test: clean up TestPGOIntendedInlining

gopherbot commented 1 year ago

Change https://go.dev/cl/446302 mentions this issue: cmd/compile/internal/pgo: remove ConvertLine2Int

gopherbot commented 1 year ago

Change https://go.dev/cl/446303 mentions this issue: cmd/compile/internal/pgo: remove most global state

gopherbot commented 1 year ago

Change https://go.dev/cl/446755 mentions this issue: cmd/compile/internal/pgo: remove ListOfHotCallSites

gopherbot commented 1 year ago

Change https://go.dev/cl/447016 mentions this issue: cmd/compile: use CDF to determine PGO inline threshold

gopherbot commented 1 year ago

Change https://go.dev/cl/447015 mentions this issue: cmd/compile: use edge weights to decide inlineability in PGO

gopherbot commented 1 year ago

Change https://go.dev/cl/446977 mentions this issue: cmd/compile/internal/pgo: unexport local types and fields

gopherbot commented 1 year ago

Change https://go.dev/cl/446976 mentions this issue: cmd/compile/internal/inline,cmd/compile/internal/pgo: remove candHotNodeMap and candHotEdgeMap

gopherbot commented 1 year ago

Change https://go.dev/cl/447135 mentions this issue: runtime: yield in goschedIfBusy if gp.preempt

gopherbot commented 1 year ago

Change https://go.dev/cl/447315 mentions this issue: cmd/compile/internal/pgo: match on call line offsets

gopherbot commented 1 year ago

Change https://go.dev/cl/447997 mentions this issue: DO NOT SUBMIT: tool for parallel PGO testing on gomotes

gopherbot commented 1 year ago

Change https://go.dev/cl/449500 mentions this issue: DO NOT SUBMIT: experiemntal pprof trimming tool

lkarlslund commented 1 year ago

Excellent discussion here, and this initiative is really great overall. Reading the above, I think @klauspost has some very valid concerns, and I know how much effort he puts into making things perform really well.

I think that some pragmas (forced inlining hints) would be a huge help for package maintainers. Looking at older discussions, the idea was rejected with the argument that you could just "copy and paste code" to do the inlining yourself, which to me sounds absurd. Defining functions and methods is about clarity and human understanding, a layer that is removed at compile time to make efficient execution the primary concern. There is a clash here, which could be improved upon.

My use case is primarily as an app developer, but I have one or two modules that others use. I often do profiling of my primary application, which does some heavy reading, decompression, interning and correlations ... but that's just one profile. Another run might be to create data, not to read it. So are multiple profiles supported, and how are they prioritized?

For module developers, there is no reason not to think that they are actually the specialists on what performs or not, and not allowing them to "chip in" with knowledge (either by manual inline pragma hints or by supplying profiling data to be merged at compile time) would be a shame.

cherrymui commented 1 year ago

> So are multiple profiles supported, and how are they prioritized?

Sorry, I'm not sure I really understand your question. I'll try to answer as I understand it. When there are multiple profiles for the same program, the user can specify which profile to use on the command line (go build -pgo=<profile>). If a program has different workloads, only the user knows which workloads to optimize for in a particular build. Or, if one wants to build a single binary that works reasonably well for different workloads, they can merge the profiles before the build.

> For module developers, there is no reason not to think that they are actually the specialists on what performs or not, and not allowing them to "chip in" with knowledge (either by manual inline pragma hints or by supplying profiling data to be merged at compile time) would be a shame.

Module/library profiles are on our road map, and we definitely want to support them in the future. They still require more thinking and design work, so they were not done in Go 1.20. Thanks for understanding.

gopherbot commented 1 year ago

Change https://go.dev/cl/453636 mentions this issue: doc/go1.20: add release notes for PGO

gopherbot commented 1 year ago

Change https://go.dev/cl/463684 mentions this issue: _content/doc: add PGO usage guide

aclements commented 1 year ago

A preview of PGO support will be available in Go 1.20 (to be released very soon). :tada: The user guide is available here.

We have a few more problems to tackle before we're ready to call PGO generally available, which we're planning to work on for 1.21:

Then the next priority is tackling the long list of further PGO-based optimizations.

gopherbot commented 1 year ago

Change https://go.dev/cl/464477 mentions this issue: _content/doc/go1.20: add link to PGO user guide

gopherbot commented 1 year ago

Change https://go.dev/cl/474236 mentions this issue: cmd/go: enable -pgo=auto by default

aclements commented 1 year ago

I've filed a new umbrella issue to track PGO optimization opportunities: https://github.com/golang/go/issues/62463

alexanius commented 7 months ago

I read the discussion of some optimizations that need basic block counters (unrolling, PGO register allocation, basic block ordering), but currently I do not see any issues, PRs, or discussion of this feature. Are there any proposals or plans for implementing it?

aclements commented 7 months ago

It's in the list of PGO optimization opportunities we'd like to pursue, but we don't yet have specific plans. We likely need discriminators in profiles (#59612) before we can do basic block-level PGO optimizations.