Open aclements opened 4 years ago
Once upon a time, I was one of several people whose primary work activity for I think two days was tracking down a weird kernel crash, which was ultimately caused by callee-saved registers. Well, the crash was caused by a [FOO_MAX] array being overrun. But the thing where the result of that overrun was to overwrite a single automatic variable, which never had its address taken, five call frames away? That was caused by callee-saved registers.
Which is to say:
Platform ABIs typically define callee-save registers, which place substantial additional requirements on a garbage collector. There are alternatives to callee-save registers that share many of their benefits, while being much better suited to Go.
+1 +1 +1 +1 +1 +1 [...]
If I'm understanding the proposal correctly, assembler code will continue to be written with the current stack based calling conventions.
I understand the excellent reasoning behind doing this, I'll just note that this is surely going to frustrate assembly code writers not being able to pass and receive args in registers as often assembly code fragments are tiny and the overhead of calling them is massive.
@ncw This restriction to ABI0
for assembly writers isn't necessarily permanent. See https://go.googlesource.com/proposal/+/master/design/27539-internal-abi.md#proposal.
Right now writeBarrier is a special builtin that allows arguments to be passed in registers. Couldn't that same concept be extended to others like memmove and memcpy to avoid call overhead even though they are asm?
@laboger , yes, we're considering special-casing just a few assembly functions. The hottest ones by far are memmove
, memclrNoHeapPointers
, and possibly syscall.Syscall
. The write barrier calling convention is very specialized because its context is so unusual and performance-sensitive. We'll probably just switch the others to use the register ABI rather than anything more specialized (and teach the toolchain that those are special even though they're assembly).
A comment on the "stack growth" section of the doc, specifically about spilling register state for morestack
into the guard space for a goroutine (copied from Gerrit):
I worry a little bit about cases where functions marked nosplit
eventually call into a non-nosplit
function. If the nosplit
frames are large (but don't exceed the guard space) we might find out that we don't have any space left. What should we do in this case? Throwing doesn't seem quite right since this works just fine today (though, you have to be careful). Should we have "guard space" and a subset of that as "no, really, this is very guarded" space? That would require extending the guard space a bit further which isn't great...
One alternative is to manage per-goroutine space for this, but that's not great either, since its more memory used per goroutine.
Good point. This is especially true for architectures that have a lot registers. Maybe we could decide that some registers are never live across a call?
That's an excellent point. A few possible solutions come to mind:
It would be really sad if we had to make more per-goroutine space for this since you almost always have space on the stack. One possibility is to spill to the stack if there's room, and otherwise we keep a pool of allocated "spill space" objects. For that, we could keep just one pre-allocated per P and if the goroutine needs it, it can pull it off the P, spill into it, and then once it's entered the runtime it can allocate a new one (probably getting it from a global allocation pool) and attach it to the P. On return, it can return it to the allocation pool.
A similar option would be to have just one spill space per P. Again, use the stack if you can, otherwise spill into the P, enter the runtime, grow the stack even if it's just a preemption, and then dump the spilled registers back onto the stack in the right place.
We could revisit the nosplit rules. It's often (though not always) a mistake to call from a nosplit function to a non-nosplit function. I'm not sure exactly what this would look like. Maybe a non-nosplit function called from a nosplit function has to have a special prologue? Related: https://github.com/golang/go/issues/21314#issuecomment-346480985. I think the only cases I enumerated in that comment that intentionally call from a nosplit to a non-nosplit function aren't running on a user G stack and thus can't grow it anyway.
Change https://golang.org/cl/252258 mentions this issue: cmd/asm: define a macro for GOEXPERIMENT=regabi
Change https://golang.org/cl/252257 mentions this issue: cmd/internal/objabi: add regabi GOEXPERIMENT
Change https://golang.org/cl/258938 mentions this issue: runtime,cmd/cgo: simplify C -> Go call path
@alexbrainman , I just ran into a bit of a snag on Windows that maybe you can better inform me on. It looks like syscall.NewCallback
is going to also need an implementation of the Go internal ABI rules and I'm trying to figure out exactly what that needs to look like. (I pinged you because it looks like you last substantially touched that code, but feel free to pass the ping.)
The documentation says "The argument is expected to be a function with one uintptr-sized result. The function must not have arguments with size larger than the size of uintptr." Does this actually mean the arguments (and result) must be an integral, floating-point, or pointer type? Or does it mean, say, struct { x, y uint16 }
or struct { x [4]byte }
is allowed, since those are technically no larger than the size of a uintptr? The current implementation permits the latter, though I don't know if it actually works.
If they must be int/float/pointer, then mapping from the Windows ABI to the Go register ABI is relatively straightforward. If compounds are allowed, then this gets much harder.
cc @jstarks re the ABI question.
@aclements I reckon there are very few instances of syscall.NewCallback usage. But they are used a lot.
If you want to build Windows GUI, you have to use syscall.NewCallback, because Windows GUI API requires syscall.NewCallback.
Similarly, if you want to build Windows service, you have to use syscall.NewCallback.
These two pretty much cover syscall.NewCallback usage.
For Windows service, you can look in golang.org/x/sys/windows/svc. It uses this RegisterServiceCtrlHandlerEx WIndows API
https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-registerservicectrlhandlerexw
which requires LPHANDLER_FUNCTION_EX
https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nc-winsvc-lphandler_function_ex
so LPHANDLER_FUNCTION_EX returns DWORD (uint32).
For GUI grep in golang.org/x/exp/shiny - it uses RegisterClass
https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-registerclassw
which uses WNDCLASSW.lpfnWndProc
https://docs.microsoft.com/en-us/windows/win32/api/winuser/ns-winuser-wndclassw
which is WNDPROC
https://docs.microsoft.com/en-us/previous-versions/windows/desktop/legacy/ms633573(v=vs.85)
which returns LRESULT which is LONG_PTR or uintptr (in Go)
https://docs.microsoft.com/en-us/windows/win32/winprog/windows-data-types
Also I recommend you look at github.com/lxn/walk for their syscall.NewCallback usage. This is an ultimate package to create GUI apps. But, I doubt, you will find different usage than golang.org/x/exp/shine. (I did not look myself)
I reckon, that is about it. For Microsoft APIs (maybe google code for syscall.NewCallback or windows.NewCallback).
But some people might use it in their custom / proprietary interface. I am not sure how these are supported by Go. You will be the judge.
Alex
IIUC this example in the microsoft docs shows that compounds are allowed:
struct Struct2 {
int j, k; // Struct2 fits in 64 bits, and meets requirements for return by value.
};
Struct2 func4(int a, double b, int c, float d);
// Caller passes a in RCX, b in XMM1, c in R8, and d in XMM3;
// callee returns Struct2 result by value in RAX.
Change https://golang.org/cl/262197 mentions this issue: go/analysis/passes/asmdecl: add support for ABI selector clauses
Change https://golang.org/cl/262319 mentions this issue: reflect,runtime: use internal ABI for selected ASM routines
Change https://golang.org/cl/262317 mentions this issue: cmd/dist,cmd/go: broaden use of asm macro GOEXPERIMENT_REGABI
Change https://golang.org/cl/259445 mentions this issue: cmd/compile,cmd/link: initial support for ABI wrappers
Change https://golang.org/cl/260477 mentions this issue: cmd/asm: allow def/ref of func ABI when compiling runtime
Change https://golang.org/cl/262318 mentions this issue: cmd: go get golang.org/x/tools@d1624618 && go mod vendor
Change https://golang.org/cl/262117 mentions this issue: cmd/compile: delay expansion of OpArg until expand_calls
Change https://golang.org/cl/249458 mentions this issue: cmd/compile: avoid generating CSEs; do all aggregates; maintain debug names
Change https://golang.org/cl/263271 mentions this issue: runtime: fix sub-uintptr-sized Windows callback arguments
Just an update on this: we were hoping to land this for 1.16, but that was always going to be a bit of a stretch, and at this point we won't be able to make 1.16. Since we're making good progress and have a lot of work in flight, the current plan is to branch when the freeze comes and finish up the work there. Landing this at the beginning of 1.17 isn't the worst thing since that will give it lots of soak time. There's some concern about the risk of landing both this and generics in 1.17, but I believe they will be fairly orthogonal and the nature of the register ABI work is such that we should be able to test it pretty thoroughly and things are likely to fail loudly if there are bugs.
There's some concern about the risk of landing both this and generics in 1.17
1.17 -> 2.0?
It sounds like everyone involved is on board for doing this in Go 1.17. Adding to proposal minutes for wider visiblity, but seems headed for likely accept.
An incomplete and evolving list of tasks:
@aclements Consider adding a bullet point tracking the implementation of DWARF locations for arguments and results (contained in registers). Having good support for Go calling conventions in GDB and Delve is well worth a tracking bullet point :)
This work is already outlined in https://go.googlesource.com/proposal/+/master/design/40724-register-calling.md
DWARF locations: The compiler will need to generate DWARF location lists for arguments and results. It already has this ability for local variables, and we should reuse that as much as possible. We will need to ensure Delve and GDB are compatible with this. Both already support location lists in general, so this is unlikely to require much (if any) work in these debuggers.
@mewmew , thanks, I've added some bullets to the list for this. Hopefully there's nothing to actually do here, but you're right it's good to track that we need to make sure this works.
Change https://golang.org/cl/258137 mentions this issue: reflect,runtime: support amd64 register ABI for reflect.Call
Change https://golang.org/cl/264437 mentions this issue: internal/abi: add new internal/abi package for ABI constants
Change https://golang.org/cl/264438 mentions this issue: runtime: support register ABI for finalizers
Based on the discussion above, this seems like a likely accept.
Change https://golang.org/cl/266638 mentions this issue: reflect,runtime: use internal ABI for selected ASM routines, attempt 2
No change in consensus, so accepted.
Change https://golang.org/cl/270077 mentions this issue: runtime: simplify C -> Go callbacks on windows/arm
Change https://golang.org/cl/270863 mentions this issue: [dev.regabi] cmd/compile,cmd/link: initial support for ABI wrappers
Change https://golang.org/cl/271178 mentions this issue: runtime: support new callbackasm1 calling convention on windows/arm
Change https://golang.org/cl/272570 mentions this issue: runtime: support register ABI for finalizers
Change https://golang.org/cl/272567 mentions this issue: internal/abi: add new internal/abi package for ABI constants
Change https://golang.org/cl/272568 mentions this issue: reflect,runtime: support amd64 register ABI for reflect.Call
Change https://golang.org/cl/279514 mentions this issue: dashboard: add linux-amd64-regabi builder
Change https://golang.org/cl/279515 mentions this issue: dashboard: add linux-amd64-regabi builder
Just want to confirm - this will be feature flagged in go 1.16?
@grantseltzer No, this is scheduled for Go 1.17, it won't be in 1.16.
It was originally planned for 1.16, but that was always going to be tight and we missed the deadline, so now it's for 1.17. I'll update the top post to say 1.17.
Change https://golang.org/cl/279673 mentions this issue: dashboard: add linux-amd64-regabi builder
Got it thanks for update. Regardless, sounds like it'll be feature flagged.
Change https://golang.org/cl/285292 mentions this issue: [dev.regabi] doc: add internal ABI specification
Re @mknyszek's comment:
A comment on the "stack growth" section of the doc, specifically about spilling register state for
morestack
into the guard space for a goroutine (copied from Gerrit):
@thanm had a good idea about eliminating the spill space we're currently reserving in the argument frame: we could charge the necessary spill space to the caller even if it doesn't reserve it. Let's say F calls G. F's prologue would check that there's enough stack space for F's frame as well as G's spill space, even though F would only open its own frame, which wouldn't include G's spill space. When F calls G, if G doesn't have enough space to open its frame (plus the spill space of anything it calls), it would enter the morestack path, and that could safely grow the stack enough to spill G's argument registers because F ensured there would be enough space.
We would have to make sure stack shrinking didn't shrink away this spill space, but it could make conservative assumptions about how much space must be reserved.
I'm not sure how this interacts with whether we have a compiler-generated spill path around the call to morestack or a generic spill path in morestack. It would be nice if the spill space was only as large as necessary for a function, since the maximal spill space is fairly large compared to the typical spill space. It's easy to have a compact spill space if the compiler generates the spill path. If we want a generic spill path with a compact spill space, I think we'd need morestack to switch to g0 using only scratch registers, temporarily spill everything there, lookup the function metadata, and copy the actual argument registers back to the g stack. That's pretty messy. I think we'll want to see how much code space actually goes to the compiler-generated spill paths.
I propose that we switch the Go internal ABI (used between Go functions) from stack-based to register-based argument and result passing for Go ~1.16~ 1.17.
I lay out the details of our proposal and the work required in this document.
The ABI specification can be found here.
This was previously proposed in #18597, where @dr2chase did some excellent prototyping work that our new proposal builds on. With multiple ABI support, we’re now in a much better position to execute on this without breaking compatibility with the existing body of Go assembly code. I’ve opened this new issue to focus on discussion of our new proposal.
/cc @dr2chase @danscales @thanm @cherrymui @mknyszek @prattmic @randall77
An incomplete and evolving list of tasks:
defer recover()
(@cherrymui, CL)RegEntryTmp
registers in prologue (@mknyszek, CL)WriteFuncMap
) (@dr2chase, CL)cgo_unsafe_args
generate an ABI0 function (or an ABIInternal-with-no-registers function) (@cherrymui, CL)High-priority non-critical path
funcPC
always return the "native" PC of a function (maybe also introduceABIOther
) (@cherrymui, CL)funcPC
is inargs_stackmap
in ABIInternal functions (because it's for ABI0) (CL)reflect.ValueOf(fn).Pointer()
to get the PC of an assembly function will now return the PC of the ABI wrapperEnabling steps
Testing
reflect.{ValueOf(target),MakeFunc(x, target),Method(x)}.{Call,Interface}
, called fromdefer
, called fromgo
, called as a finalizerdefer
in a test function (check arguments, modify results)MoveStackOnNextCall
, assertions for pointer-to-stack/pointer-to-heap, assertions for live/dead (@aclements, CL)Post-MVP
Cleanup (can be done later)
go
in runtime (context)ir.Func
to functionir.Name
s (context)firstpos
inssa.go
– sometimes it's late.