Open aclements opened 4 years ago
Change https://golang.org/cl/308929 mentions this issue: cmd/compile/abi-internal: declare X15 scratch in function bodies
Change https://golang.org/cl/309009 mentions this issue: cmd/compile/abi-internal: declare R14 completely fixed
Change https://golang.org/cl/309034 mentions this issue: runtime: move zero-sized frame check from newproc to newproc1
Change https://golang.org/cl/309110 mentions this issue: cmd/compile: don't modify underlying type when creating bitmap for bodyless function
Change https://golang.org/cl/309169 mentions this issue: runtime: update debug call protocol for register ABI
Change https://golang.org/cl/309340 mentions this issue: cmd/link: build dynexp symbol list directly
Change https://golang.org/cl/309339 mentions this issue: cmd/link: move cgo export map from loadcgo to setCgoAttr
Change https://golang.org/cl/309341 mentions this issue: cmd/compile,cmd/link: resolve cgo symbols to the correct Go ABI
Change https://golang.org/cl/309338 mentions this issue: cmd/link: refactor setCgoAttr
Change https://golang.org/cl/309649 mentions this issue: cmd/compile: rescue stmt boundaries from OpArgXXXReg and OpSelectN.
Change https://golang.org/cl/309634 mentions this issue: runtime: eliminate externalthreadhandler
Change https://golang.org/cl/309789 mentions this issue: cmd/asm: require NOSPLIT for ABIInternal asm functions
Change https://golang.org/cl/309790 mentions this issue: cmd/internal/obj: don't emit args_stackmap for ABIInternal asm funcs
Change https://golang.org/cl/310171 mentions this issue: cmd/internal/objabi,test: use correct GOEXPERIMENT build tags in test/run.go
Change https://golang.org/cl/310172 mentions this issue: cmd/internal/objabi: enable regabiwrappers by default
Change https://golang.org/cl/310174 mentions this issue: cmd/internal/objabi: enable regabireflect by default
Change https://golang.org/cl/310175 mentions this issue: cmd/internal/objabi: enable regabidefer by default
Change https://golang.org/cl/310173 mentions this issue: cmd/internal/objabi: enable regabig by default
Change https://golang.org/cl/310176 mentions this issue: cmd/internal/objabi: enable regabiargs by default
Change https://golang.org/cl/310077 mentions this issue: runtime: fix data race in abi finalizer test
Change https://golang.org/cl/310184 mentions this issue: internal/bytealg: port more performance-critical functions to ABIInternal
Change https://golang.org/cl/310331 mentions this issue: math: avoid assembly stubs
Change https://golang.org/cl/310609 mentions this issue: cmd/internal/objabi: make regabi an alias for regabi sub-experiments
Change https://golang.org/cl/310649 mentions this issue: test/abi: reenable test on windows
Change https://golang.org/cl/310630 mentions this issue: runtime: fix race in TestFinalizerRegisterABI
Change https://golang.org/cl/310729 mentions this issue: cmd/link: fix file-local checks in xcoff
Change https://golang.org/cl/310733 mentions this issue: runtime: mark stdcallN functions cgo_unsafe_args
Change https://golang.org/cl/310690 mentions this issue: cmd/compile: spill all the parameters around morestack
Change https://golang.org/cl/310849 mentions this issue: internal/buildcfg: make regabi enable regabiargs
all.bash is now passing with all regabi experiments enabled on linux/amd64, darwin/amd64, and windows/amd64. We're testing on large code bases now. The results of that will ultimately determine whether we launch this in 1.17, but things appear to be in pretty good shape now. There's still some work to be done. In particular, we know we've regressed debug quality and need to address that. However, we believe it's solid enough to enable by default on these three platforms, and plan to trickle out the CLs to enable each piece over the next couple days, so that if it does cause issues on the dashboard we can better isolate which piece was responsible.
The results are looking phenomenal. On a set of application benchmarks we've been using, geomean improvement is 6%:
                                    base           regabi
                                    sec/op         sec/op         vs base
BiogoIgor                           15.66 ± 1%     15.09 ± 1%     -3.65% (p=0.000 n=20)
BiogoKrishna                        17.95 ± 1%     17.52 ± 2%     -2.42% (p=0.001 n=20)
BleveIndexBatch100                  5.911 ± 1%     5.511 ± 2%     -6.76% (p=0.000 n=20)
BleveQuery                          6.481 ± 2%     5.546 ± 1%     -14.43% (p=0.000 n=20)
FoglemanFauxGLRenderRotateBoat      8.673 ± 0%     8.411 ± 0%     -3.02% (p=0.000 n=20)
FoglemanPathTraceRenderGopherIter1  20.05 ± 1%     20.28 ± 2%     ~ (p=0.052 n=20)
GopherLuaKNucleotide                30.25 ± 1%     25.51 ± 1%     -15.67% (p=0.000 n=20)
MarkdownRenderXHTML                 246.8m ± 2%    240.6m ± 2%    -2.53% (p=0.001 n=20)
Tile38WithinCircle100kmRequest      804.8µ ± 2%    754.0µ ± 2%    -6.32% (p=0.000 n=20)
Tile38IntersectsCircle100kmRequest  1010.2µ ± 2%   908.7µ ± 2%    -10.04% (p=0.000 n=20)
Tile38KNearestLimit100Request       1.019m ± 2%    1.015m ± 1%    ~ (p=0.314 n=20)
geomean                             666.8m         627.0m         -5.98%
And the Bent benchmarks are 6.5% faster geomean, with many seeing much larger improvements. 3 of the 94 benchmarks see non-trivial slowdowns, which we're looking into, though we don't consider this blocking.
I'm going to be away for the next few months, so I'm super excited we were able to get to this point. Everyone who's been working on this has done a phenomenal job. These last few weeks in particular have been an amazing push for the finish line and I hope everyone gets some well-deserved time to relax soon.
@aclements I've been looking for Go version benchmarks. Could you add a link to where/how you've found these? Are there any plans for adding the register ABI to darwin/arm64?
@andig, @mknyszek put together the benchmark suite. He's been working on open-sourcing it, but I'm not sure what the current status is.
Are there any plans for adding the register ABI to darwin/arm64?
Yes. ARM64 is the next architecture we plan to work on, and Darwin will definitely be included in that.
Change https://golang.org/cl/311689 mentions this issue: cmd/compile: preserve/translate names for parameters
Change https://golang.org/cl/312429 mentions this issue: runtime: fix runtimeQPC ABIInternal call
Change https://golang.org/cl/312650 mentions this issue: runtime: make nanotime1 ABIInternal on Windows
Change https://golang.org/cl/312869 mentions this issue: cmd/compile: fix bug in defer wrapping
Do we have any numbers or guesses as to how this will end up affecting binary sizes? I originally imagined it wouldn't be a significant change either way, but I just noticed https://golang.org/cl/304470 talking about a 1.5% size increase for the sake of stack traces, and it got me thinking.
Change https://golang.org/cl/312989 mentions this issue: cmd/compile: spos handling fixes to improve prolog debuggability
Binary sizes are decreasing, even with that CL. @dr2chase may have better numbers.
Change https://golang.org/cl/313071 mentions this issue: dashboard: convert regabi builders to noregabi
Change https://golang.org/cl/313212 mentions this issue: cmd/compile: make the stack allocator more careful about register args.
Change https://golang.org/cl/314431 mentions this issue: cmd/compile: regabi support for DWARF location expressions
Change https://golang.org/cl/315071 mentions this issue: cmd/compile: handle field padding for register-passed structs
Change https://golang.org/cl/315390 mentions this issue: internal/buildcfg: enable regabi for Android
Change https://golang.org/cl/315610 mentions this issue: cmd/compile: fix abbrev selection for output params
@andig, @mknyszek put together the benchmark suite. He's been working on open-sourcing it, but I'm not sure what the current status is.
@mknyszek I would really be interested in how Go measured performance evolved over time and what to expect of the next major version. It would be great if there was a benchmark corpus and maybe a repo/gist with collected data.
@andig, yeah, that's the plan going forward, but I haven't had the time yet. I'm hoping to do more around this in the second half of 2021.
FWIW, there is some work being done here with microbenchmarks, e.g. golang.org/cl/290431, but it will take time to get something complete up and running.
This work will continue into Go 1.18, but at this point I don't think any major new things will be happening here for the Go 1.17 release. Kicking it to the next milestone.
@mknyszek so does it mean that register-based calling convention is delayed until Go 1.18?
Go 1.17 will enable the register abi for GOARCH=amd64 systems. The 1.18 activity to which @mknyszek is referring is the work to enable the register abi for other architectures (for example, arm64). The new 1.18 milestone for this issue is for that work.
I propose that we switch the Go internal ABI (used between Go functions) from stack-based to register-based argument and result passing for Go ~1.16~ 1.17.
I lay out the details of our proposal and the work required in this document.
The ABI specification can be found here.
This was previously proposed in #18597, where @dr2chase did some excellent prototyping work that our new proposal builds on. With multiple ABI support, we’re now in a much better position to execute on this without breaking compatibility with the existing body of Go assembly code. I’ve opened this new issue to focus on discussion of our new proposal.
/cc @dr2chase @danscales @thanm @cherrymui @mknyszek @prattmic @randall77
An incomplete and evolving list of tasks:
defer recover() (@cherrymui, CL)
RegEntryTmp registers in prologue (@mknyszek, CL)
(WriteFuncMap) (@dr2chase, CL)
cgo_unsafe_args generate an ABI0 function (or an ABIInternal-with-no-registers function) (@cherrymui, CL)

High-priority non-critical path
funcPC always return the "native" PC of a function (maybe also introduce ABIOther) (@cherrymui, CL)
funcPC is in
args_stackmap in ABIInternal functions (because it's for ABI0) (CL)
reflect.ValueOf(fn).Pointer() to get the PC of an assembly function will now return the PC of the ABI wrapper

Enabling steps

Testing
reflect.{ValueOf(target),MakeFunc(x, target),Method(x)}.{Call,Interface}, called from defer, called from go, called as a finalizer
defer in a test function (check arguments, modify results)
MoveStackOnNextCall, assertions for pointer-to-stack/pointer-to-heap, assertions for live/dead (@aclements, CL)

Post-MVP

Cleanup (can be done later)
go in runtime (context)
ir.Func to function ir.Names (context)
firstpos in ssa.go – sometimes it's late.
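
The task list notes that using reflect.ValueOf(fn).Pointer() to get the PC of an assembly function will now return the PC of the ABI wrapper. As a rough illustration of the pattern that note is about, the sketch below resolves the reported PC back to a symbol name with runtime.FuncForPC. The function `add` and the helper `funcNameOf` are hypothetical names introduced here, and `add` is a pure-Go function, so this only shows the lookup itself and not the wrapper behavior for assembly functions.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// add is a hypothetical pure-Go function used only for illustration.
func add(a, b int) int { return a + b }

// funcNameOf returns the symbol name that the runtime associates with the
// code pointer reflect reports for fn. fn must be a func value.
func funcNameOf(fn interface{}) string {
	pc := reflect.ValueOf(fn).Pointer()
	f := runtime.FuncForPC(pc)
	if f == nil {
		return ""
	}
	return f.Name()
}

func main() {
	// For an assembly function reached through an ABI wrapper, the PC
	// (and so the lookup result) may describe the wrapper instead.
	fmt.Println(funcNameOf(add))
}
```

Code that compares such PCs against known entry points, or that assumes the PC is the first instruction of the assembly body, is the kind of code this task-list item is warning about.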