golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.86k stars 17.65k forks source link

runtime, x/mobile: SIGABRT ABORT, runtime.raise_trampoline.abi0 + 462 (sys_darwin_arm64.s:462) for Mac Catalyst targets running on arm64 #52299

Open CatStudioApp opened 2 years ago

CatStudioApp commented 2 years ago

What version of Go are you using (go version)?

$ go version
go version go1.18 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/[user]/Library/Caches/go-build"
GOENV="/Users/[user]/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/[user]/golang/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/[user]/golang"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.18/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.18/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.18"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/cd/j79005393f7_zbyjw3zpkrlc0000gn/T/go-build3399080997=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I delivered my app continuously using Gomobile to MacCatalyst target. Starting from Go 1.18, my app started to have such a crash:

Crashed: Thread #1
SIGABRT ABORT 0x0000000186370db8
Crashed: Thread
0  libsystem_kernel.dylib         0x9db8 __pthread_kill + 8
1  libsystem_pthread.dylib        0x6ee0 pthread_kill + 288
2  libsystem_c.dylib              0x42674 raise + 32
3  [my_app_target_with_gomobile]                      0x189104 runtime.raise_trampoline.abi0 + 462 (sys_darwin_arm64.s:462)
4  [my_app_target_with_gomobile]                      0x187ee8 runtime.asmcgocall.abi0 + 1094 (asm_arm64.s:1094)
5  ???                            0xed65f03c0 (Missing)

What did you expect to see?

No crash

What did you see instead?

Crashes

cherrymui commented 2 years ago

Catalyst isn't a supported target for Go yet. See https://github.com/golang/go/issues/47228#issuecomment-922047077

CatStudioApp commented 2 years ago

@cherrymui we have actually been using Gomobile for Catalyst for over six months and this crash only appeared after Go 1.18…

jiangzhiguo1992 commented 1 year ago

@CatStudioApp I also encountered the same problem, any follow-up?

stffabi commented 1 year ago

I've seen the exact same problem for our app when upgrading from Go 1.17 to 1.18 and building for catalyst. Our app has run without any problem for about 2 years on catalyst.

Bisecting Go 1.17 and Go 1.18 releases showed the problem is caused by the commit a4b2c579e92c001287fd7b549bc9f67bcf2bfe7b that was used to fix this issue #49288

Would be interesting if someone of the Go team could maybe bring some light up into this. For catalyst is GOOS=darwin, so it seems like somehow macOS does not like the async preemption signal handling when running in catalyst mode.

cherrymui commented 1 year ago

If you want to disable async preemption, you could try setting the environment variable GODEBUG=asyncpreemptoff=1.

Is the preemption signal the only issue, or signals in general? Could you try sending some signals to you program, and see if that works? It would be good to know why preemption signals don't work in catalyst mode.

stffabi commented 1 year ago

Thanks so much @cherrymui for your reply.

If you want to disable async preemption, you could try setting the environment variable GODEBUG=asyncpreemptoff=1.

This is pretty hard to achieve because in my case the Go code runs in a network extension started by the network extension framework. But I patched Go to internally set debug.asyncpreemptoff = 1 and that fixed the issue.

Is the preemption signal the only issue, or signals in general? Could you try sending some signals to you program, and see if that works?

I tried sending SIGURG (which is used for preemption according to signal_unix.go) and that did not crash the app, I also tried SIGWINCH which also did not crash the app. (Edit: when asyncpreemptoff was off)

Commit a4b2c579e92c001287fd7b549bc9f67bcf2bfe7b seems to only install the sighandler also for const sigPreempt = _SIGURG. I don't really know the runtime code, but from the brief look I took into it I would suspect the preempt signal is send regardless of that commit. But without that commit it is just ignored and not handled because sighandler is not installed.

stffabi commented 1 year ago

So after an in-depth investigation here are the results and what is going on in this case.

First the crash is due to a throw during signal handling signal 16 received on thread with no signal stack. non-Go code disabled sigaltstack (noSignalStack). Unfortunately the console output never showed up, maybe related to being used in an Apple NetworkExtension. The problem only occurred when building with gomobile for MacCatalyst (arm64) gomobile bind -target maccatalyst and not for iOS gomobile bind -target ios.

Further investigation showed that on iOS sigaltstack handling is never used and there's ASM code to do all this signal stack changes directly in Go ASM. Due to the fact that sigaltstack was buggy on older iOS versions < 13. So that's why there's no problem with the same app when running on iOS because sigaltstack is never used and the noSignalStack() can never be reached.

https://github.com/golang/go/blob/581603cb7d02019bbf4ff508014038f3120a3dcb/src/runtime/os_darwin.go#L317-L321

https://github.com/golang/go/blob/581603cb7d02019bbf4ff508014038f3120a3dcb/src/runtime/sys_darwin_arm64.s#L210-L220

It was quite interesting to find out why this is a problem when using MacCatalyst (darwin) where sigaltstack is obviously supported by Apple. Printing out the sigaltstack configuration when running a goroutine revealed that the st.ss_flags&_SS_DISABLE was always true, despite it should never be due to the call to minitSignalStack which should be called for a darwin build:

https://github.com/golang/go/blob/581603cb7d02019bbf4ff508014038f3120a3dcb/src/runtime/os_darwin.go#L317-L321

I found out that minitSignalStack is really not called for the MacCatalyst build and this revealed an interesting inconsistency between runtime.GOOS and the ASM define #ifdef GOOS_ depending on the GOOS env and the used build tags.

First let's take a look what gomobile is going to use for building a MacCatalyst app, it is going to use GOOS=darwin and build tags ios, macos, maccatalyst. The build tag ios is used due to another issue #47228.

https://github.com/golang/mobile/blob/7088062f872dd0678a87e8986c67992e9c8855a5/cmd/gomobile/env.go#L56-L99

This GOOS and build-tag configuration now has the following consequences. Using GOOS=darwin and build tags ios, macos, maccatalyst results in the following final set of build tags used for compilation darwin, ios, macos, maccatalyst, macos and maccatalyst could be ignored they are never used in the Go source tree. So for our investigation only the build tags darwin, ios and GOOS=darwin is relevant.

GOOS=darwin leads to a Go ASM define #define GOOS_darwin which is fine and as a result sys_darwin_arm64.s does not include the above mentioned manual signal stack handling as it would with GOOS=ios, so this is prepared for sigaltstack handling. Now let's take a look at the call to minitSignalStack in os_darwin.go which is constraint with if !(GOOS == "ios" && GOARCH == "arm64") { so this should be called for our build because we use GOOS=darwin. But if we take a look how runtime.GOOS is defined we see that with the buid tags darwin, ios the resulting runtime.GOOS is ios and not darwin.

https://github.com/golang/go/blob/581603cb7d02019bbf4ff508014038f3120a3dcb/src/internal/goos/zgoos_ios.go#L3-L7

Now we have a build that on the ASM side is built for GOOS=darwin but on the runtime side (all code that uses runtime.GOOS) behaves in the way as if one would have used GOOS=ios. As a result sys_darwin_arm64.s behaves as if sigaltstack is available and usable, but on the other side os_darwin.go does not setup the os-threads for sigaltstack due to not calling minitSignalStack.

This inconsistency now leads to the mentioned throw (noSignalStack) signal 16 received on thread with no signal stack. non-Go code disabled sigaltstack when an sigPreempt signal comes in. User signals send with kill -s seem not be an issue in my testings, because they are handled by a os-thread that has no G and therefore never reaches noSignalStack.

So as summary a gomobile MacCatalyst build uses GOOS=darwin and build-tags ios and in that case we have the following GOOS sources and values, leading to the crash during async preemption.

GOOS=darwin with build-tag ios leads to:

There are several possibilities to fix this issue:

  1. Swiching from GOOS=darwin to GOOS=ios in GoMobile for a MacCatalyst build. The resulting binary would then be a full ios binary, basically like it is now except the ASM code is also for ios and consistent.
  2. Removing the ios build-tag from GoMobile when building against MacCatalyst, this would mean #47228 needs to be fixed. So the binary is then really a darwin build and not an ios build but with a darwin ASM base.
  3. Making runtime.GOOS consistent with ASM defines #define GOOS_ in the Go compiler. IMHO this should probably discussed with the Go team regardless of how this issue is going to be fixed in GoMobile. Having those two ways to react on the GOOS should be consistent IMHO.
cherrymui commented 1 year ago

Thanks for the investigation. As you found, GOOS=darwin with build tag ios is a confusing and unsound setting and will result in broken builds. So solution 1 or 2 would be the way to Go. I'm not familiar with Catalyst. If Catalyst is more like iOS, we probably want to go with solution 1. If Catalyst is more like macOS, then we would go with solution 2.

We could probably guard the sigaltstack assembly code with build tags instead of macros, so it would also be consistent with runtime.GOOS. But I'm not sure it is worth doing. Maybe we could just make the build fail with GOOS=darwin and build tag ios, so the error is clear.

stffabi commented 1 year ago

For the time being it might be the easiest to switch from GOOS=darwin to GOOS=ios in gomobile, although Catalyst is more like macOS. Currently all people that used catalyst did use a build that behaved like GOOS=ios, because all differences depend on runtime.GOOS and only this sigaltstack handling depends on the macro. So maybe option 1 is the best for now, because that's what all people really used in the past when using a Catalyst build.

In a future release one could go with option 2 and fix #47228.

sbruens commented 8 months ago

We ran into this recently https://github.com/Jigsaw-Code/outline-client/issues/1828, also in a network extension. Thank you @stffabi for the wonderful and detailed write-up!

I tried a few workarounds based on the earlier comments but have been unsuccessful. Was anyone able to get around this issue or should we abandon gomobile for catalyst builds?

stffabi commented 8 months ago

@sbruens did you try my patched gomobile version https://github.com/stffabi/go-mobile/commit/e31230fb8e5d45adb0be5b69854b6b9bafb8151f? Install gomobile and gobind command from that commit and it should be fine.

I'm using that for more than half a year now, and it works fine in production.

gabemeola commented 4 months ago

Thanks @stffabi ! This worked great. Merged your commit into my fork.

Seems like gomobile could use some maintainer support — there are a few helpful features in PRs which haven't been reviewed in some time (tvos, etc)