all: port to Windows/ARM32

jordanrh1 commented 6 years ago

Hi everyone, we will be submitting a patch in the near future that adds Windows/ARM32 support to GO. All but a few tests are passing, and this implementation has been used to compile GO itself and run Docker containers on Windows/ARM32. We look forward to working with the community to iron out the last remaining issues and get this merged!

cznic commented 6 years ago

I'm curious why was the port developed out-of-tree? I'm asking because I'm afraid it's much more complicated to properly review such a big change, as an port to new architecture is, in a single CL.

In any case, please check https://github.com/golang/go/blob/master/CONTRIBUTING.md and the linked from there https://golang.org/doc/contribute.html for more details.

(Personal wish: Please spell the name of the language as 'Go'.)

alexbrainman commented 6 years ago

@jordanrh1 you should read https://github.com/golang/go/wiki/PortingPolicy

Alex

jordanrh1 commented 6 years ago

@cznic We were under a tight deadline and did not know if it was feasible. This is the earliest I could engage the community. We can break down the change into smaller chunks if necessary.

@alexbrainman Thanks for the link. We are committed to meeting the requirements outlined in the doc.

alexbrainman commented 6 years ago

We are committed to meeting the requirements outlined in the doc.

SGTM.

Alex

gopherbot commented 5 years ago

Change https://golang.org/cl/125643 mentions this issue: dashboard: add windows-arm builder

gopherbot commented 5 years ago

Change https://golang.org/cl/125648 mentions this issue: cmd/link: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/125649 mentions this issue: debug/pe: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/125645 mentions this issue: runtime: add definitions for windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/125647 mentions this issue: cmd/api: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/125646 mentions this issue: cmd/dist: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/128715 mentions this issue: runtime: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/128716 mentions this issue: syscall: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/130056 mentions this issue: cmd/dist: support windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/131282 mentions this issue: cmd/vet: remove exclusions for callbackasm

gopherbot commented 5 years ago

Change https://golang.org/cl/127665 mentions this issue: windows: add support for windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/137015 mentions this issue: cmd/vendor: update golang.org/x/sys/windows for windows/arm support

bcmills commented 5 years ago

Sounds like the builder is still failing (#28854). @jordanrh1, what work remains to have this port stable for the release?

jordanrh1 commented 5 years ago

These are the failing runtime tests:

TestPanicInlined
TestAbort
TestArenaCollision (out of memory)
TestSmhasherTwoNonzero (out of memory)
TestSmhasherSparse
TestSmhasherPermutation
TestStackWrapperStackPanic/sigpanic/CallersFrames
TestNetpollDeadlock (out of memory)

Some are due to out of memory, and the rest I believe are due to issues with traceback.

It looks like there are some new failures in html/template.

alexbrainman commented 5 years ago

Some are due to out of memory,

I am pretty sure it is OK to skip tests that take too much memory on RPI.

and the rest I believe are due to issues with traceback.

Please, see if you can debug and fix it. Feel free to ask questions - everyone will help.

It looks like there are some new failures in html/template.

I can see from your comment https://github.com/golang/go/issues/28854#issuecomment-440520454 it could be same bug.

Alex

gopherbot commented 5 years ago

Change https://golang.org/cl/153518 mentions this issue: builders: skip writing snapshop for windows/arm builder

gopherbot commented 5 years ago

Change https://golang.org/cl/153719 mentions this issue: runtime/pprof: fix TestCPUProfileWithFork on Windows/ARM

gopherbot commented 5 years ago

Change https://golang.org/cl/153718 mentions this issue: runtime: fix profiling on windows/ARM

gopherbot commented 5 years ago

Change https://golang.org/cl/153839 mentions this issue: runtime: pass LR to sigprof on windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/153844 mentions this issue: link/internal/ld: fix TestRuntimeTypeAttrInternal on windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/153830 mentions this issue: cmd/link: skip TestRuntimeTypeAttrInternal on windows/arm

gopherbot commented 5 years ago

Change https://golang.org/cl/154357 mentions this issue: cmd/nm: report windows/arm as relocatable in TestGoExec

jordanrh1 commented 5 years ago

@hyangah Does the order of regions matter in TestAnalyzeAnnotations? This test is failing on windows/arm because the order of regions is different from what is expected.

On Linux:

        task 1: task0
                start: 33721 end: 62861 complete: true
                2 goroutines
                4 regions:
                        task0.region0(goid=25)
                        (goid=26)
                        task0.region1(goid=26)
                        task0.region2(goid=26)
                1 children:
                        task1

On Windows:

task 1: task0
            start: 1006000 end: 673010992 complete: true
            2 goroutines
            4 regions:
                task0.region0(goid=43)
                (goid=44)
                task0.region2(goid=44)
                task0.region1(goid=44)
            1 children:
                task1

On Linux, the order of regions is 0, (unnamed), 1, 2. On Windows, the order of regions is 0, (unnamed), 2, 1. Is ordering significant or should the test be changed to accept an arbitrary ordering of regions?

gopherbot commented 5 years ago

Change https://golang.org/cl/154560 mentions this issue: windows: use netevent.dll in TestFormatMessage for windows/arm

hyangah commented 5 years ago

@jordanrh1 the regions are supposed to be sorted based on the timestamp so I think it needs investigation.

https://github.com/golang/go/blob/a1aafd8b28ada0d40e2cb25fb0762ae171eec558/src/cmd/trace/annotations.go#L349

Can you capture the trace with the command and share it? go test -run=TestAnalyzeAnnotations runtime/trace -savetraces

jordanrh1 commented 5 years ago

@hyangah, Here is the trace, thank you for looking :) TestAnalyzeAnnotations.trace.zip

gopherbot commented 5 years ago

Change https://golang.org/cl/154777 mentions this issue: cmd/trace: force regions order in TestAnalyzeAnnotations

hyangah commented 5 years ago

@jordanrh1, thanks you for the trace. According to the trace, the system didn't seem to provide high-enough time resolution required for this testing or for the execution tracer to be useful. All the traced events were timestamped either 0 or 1013008ns. So, sorting based on the timestamp doesn't guarantee a unique ordering this test assumes.

I sent the cl/154777 to slow the traced program (hoping that 1ms gap between events is sufficient). Alternatively, we can just skip the test on windows.

jordanrh1 commented 5 years ago

@hyangah Thanks for looking. The high resolution clock runs at 1Mhz on this machine, which I think would be fast enough to get meaningful timestamps. There could be a bug in the timestamp code. I will investigate.

hyangah commented 5 years ago

@jordanrh1 I see. The traced code is just creating the events back to back, so requiring them to have distinct timestamps is unreasonable. They all can happen within 1usec and end up with the same timestamp. So, I think the assumption the test makes is wrong. I will just sort the region list based on the name.

gopherbot commented 5 years ago

Change https://golang.org/cl/154817 mentions this issue: windows/svc: use wevtutil.exe instead of powershell for windows/arm

jordanrh1 commented 5 years ago

@hyangah I'm seeing a resolution of about 1ms in nanotime() on windows/arm. I think there is a problem with the implementation.

gopherbot commented 5 years ago

Change https://golang.org/cl/154758 mentions this issue: runtime: use QPC for nanotime and time.now on windows/arm

jordanrh1 commented 5 years ago

@hyangah FYI, the test is passing after fixing the time functions.

gopherbot commented 5 years ago

Change https://golang.org/cl/154761 mentions this issue: Revert "runtime: use QPC for nanotime and time.now on windows/arm"

gopherbot commented 5 years ago

Change https://golang.org/cl/154762 mentions this issue: runtime: use QPC to implement cputicks() on windows/arm

jordanrh1 commented 5 years ago

Getting closer:

The failing test is TestCallbackPanic(). I'm suspecting stack corruption.

I also want to modify cputicks() to call onosstack(_QueryPerformanceCounter, ...), and I want to modify onosstack() to not modify any global data. These routines are called deep in the scheduler and I am concerned they could be corrupting state.

jordanrh1 commented 5 years ago

Here is the issue with TestCallbackPanic().

TestCallbackPanic() tests what happens when Go code running in a callback panics. Go loads a native dll (user32.dll), calls a function that is expected to invoke a callback (EnumWindows), and then panics within the callback. It registers deferrals at several points, and verifies that the deferrals run and that the panic can be recovered from. In the Go function that calls EnumWindows, it registers a deferral to unload user32.dll when the function returns. That way, user32.dll will be unloaded no matter what, even if EnumWindows panics. This function looks like this:

// nestedCall calls into Windows, back into Go, and finally to f.
func nestedCall(t *testing.T, f func()) {
        c := syscall.NewCallback(callback)
        d := GetDLL(t, "user32.dll")
        defer d.Release()
        d.Proc("EnumWindows").Call(c, uintptr(*(*unsafe.Pointer)(unsafe.Pointer(&f))))
}

The problem arises in the interaction between EnumWindows and Go's panic unwinding logic. When Go encounters a panic in a callback, it unwinds the g0 stack to the point before the outgoing cgo call that invoked the callback was made. That means in this case it will reset the g0 stack pointer to what it was just before the call to EnumWindows. If you look at the stack pointer on entry of EnumWindows, and on entry of FreeLibrary (which is called by defer d.Release()), they are the same value. It is my understanding that this is by design. The logic that restores the stack pointer is in func unwindm:

                 // Restore sp saved by cgocallback during
                // unwind of g's stack (see comment at top of file).
                mp := acquirem()
                sched := &mp.g0.sched
                switch GOARCH {
                default:
                        throw("unwindm not implemented")
                case "386", "amd64", "arm", "ppc64", "ppc64le", "mips64", "mips64le", "s390x", "mips", "mipsle":
                        sched.sp = *(*uintptr)(unsafe.Pointer(sched.sp + sys.MinFrameSize)) // <-- restore sp from saved value on the stack
                case "arm64":
                        sched.sp = *(*uintptr)(unsafe.Pointer(sched.sp + 16))
                }

It restores the g0 stack pointer from the value saved on the stack by cgocallback. cgocallback saves m->g0->sched.sp to the stack, which was previously written by cgocall when EnumWindows was called.

havem:
        // Now there's a valid m, and we're running on its m->g0.
        // Save current m->g0->sched.sp on stack and then set it to SP.
        // Save current sp in m->g0->sched.sp in preparation for
        // switch back to m->curg stack.
        // NOTE: unwindm knows that the saved g->sched.sp is at 4(R13) aka savedsp-8(SP).
        MOVW    m_g0(R8), R3
        MOVW    (g_sched+gobuf_sp)(R3), R4     <--- load last saved value of g0's SP (when EnumWindows was invoked)
        MOVW    R4, savedsp-8(SP)              <--- save this value to the stack, where unwindm knows where to find it
        MOVW    R13, (g_sched+gobuf_sp)(R3)

So, the fact that we see FreeLibrary execute with the same stack pointer as EnumWindows seems to be by design. EnumWindows was not allowed to complete since a panic occurred while it was calling back into Go code, and it's stack was deallocated. This shouldn't be an issue, since it is generally an error to maintain references to stack local memory that outlive the stack frame.

This is where things get weird. EnumWindows isn't as straightforward as you might think. It allocates memory from the heap before invoking the callback, and frees this memory before it returns. It uses two mechanisms to ensure this memory gets cleaned up in case the callback experiences an error:

it tracks the allocations with a heap tracking mechanism. Bookkeeping for this allocation is stored on the stack
it has an SEH cleanup handler that frees the allocations and unlinks the bookkeeping information from the global tracking database

When the user32 module is unloaded, its garbage collector runs and scans for outstanding resources, which include outgoing calls that have not been completed, and their associated resources. Since the SEH handler never ran (because an explicit call to panic() does not trigger SEH handlers), the global tracking database still has a reference to the bookkeeping information that was stored on the stack. However, this stack space was deallocated, and new code is running over it, which corrupts the data. The bookkeeping code faults when it loads a bad pointer from the stack.

As you can see, this error occurs due to very specific interaction between Go's deferral mechanism, Go's callback mechanism, and user32's resource tracking mechanism. In order for C code to be go-panic-safe, it should not allocate any resources, and should not maintain any references to stack memory.

In terms of fixes, I'm not sure there's a way to ensure that C code which relies on SEH or C++ exceptions has a chance to clean up in case of a go panic. It seems best to recover() in the callback, and not panic across callback boundaries. Another option is to find a replacement for EnumWindows that is safe to panic across and use that for the test instead. @alexbrainman @ianlancetaylor , what do you think?

ianlancetaylor commented 5 years ago

Any idea why the test passes on x86? What's the key difference?

jordanrh1 commented 5 years ago

I think it has to do with how subsequent code overwrites the stack. If the data structure on the stack doesn't get overwritten, the error will not manifest. amd64 and x86 have the same underlying issue, but they must not corrupt the bookkeeping datastructure.

alexbrainman commented 5 years ago

I'm not sure there's a way to ensure that C code which relies on SEH or C++ exceptions has a chance to clean up in case of a go panic.

I agree. This is not implemented, and it won't be simple to implement.

It seems best to recover() in the callback, and not panic across callback boundaries.

I agree in general, but I don't think it is acceptable for our runtime tests. I think Go users write code that panics from callback and into call site function. We should try and make it work, if possible.

Another option is to find a replacement for EnumWindows that is safe to panic across and use that for the test instead

That sounds good. I don't remember who came up with EnumWindows to test it all, but anything else should be fine too. As long as it works and easily available on any Windows computer that is supported by Go.

Alex

jordanrh1 commented 5 years ago

@ianlancetaylor One of the reasons it doesn't readily manifest on amd64/386 is that amd64/386 tests run on Windows Desktop, which has a different implementation of user32 than IoT Core, which is where the ARM tests run. Iot Core uses 'minuser', which is a completely rewritten window manager. I was able to reproduce the exact crash on AMD64 by running the test on IoT Core. In the deferral, I called a function that used a large chunk of g0 stack (in a legal way), to ensure that any stack based data structures still referenced would be overwritten with garbage data. Later, when the user32 garbage collector crashed, it crashed in exactly the same place as ARM, because the stack-based bookkeeping structure had been overwritten in the deferral.

On ARM, to further confirm that stack corruption was the issue, I inserted code in callbackasm1 to save the current g0 SP so that when unwindm restores the g0 SP, it will restore it to a value that is safely below the stack for EnumWindows. With this change, the crash went away because EnumWindow's stack was no longer being overwritten in the deferral. The test ran many times successfully.

Given these findings I have increased confidence that this is the issue.

gopherbot commented 5 years ago

Change https://golang.org/cl/155923 mentions this issue: runtime: use EnumTimeFormatsEx instead of EnumWindows in callback tests

jordanrh1 commented 5 years ago

Exciting stuff:

petemoore commented 5 years ago

@jordanrh1 Is there an issue for porting to Windows/ARM64 too? Thanks.

jordanrh1 commented 5 years ago

@petemoore There is not.

golang / go

all: port to Windows/ARM32 #26148