getlantern / lantern-client

Lantern Client code
GNU General Public License v3.0

Fix signal handler #1130

Closed: atavism closed this 4 months ago

atavism commented 4 months ago

Fix: desktop app panics because a signal handler is installed without the SA_ONSTACK flag

Resolves https://github.com/getlantern/engineering/issues/1504

Based on https://github.com/wailsapp/wails/pull/2152

jigar-f commented 4 months ago

Wooo, thanks! I spent days searching for a fix.

jigar-f commented 4 months ago

@atavism It seems the issue persists for me. I am still getting crashes:

signal 16 received but handler not on signal stack
mp.gsignal stack [0x14000084000 0x1400008c000], mp.g0 stack [0x16b7c0000 0x16b9c3000], sp=0x14000051078
fatal error: non-Go code set up signal handler without SA_ONSTACK flag

runtime stack:
runtime.throw({0x164965784?, 0x0?})
    runtime/panic.go:1023 +0x40 fp=0x14000050fd0 sp=0x14000050fa0 pc=0x163c3ecc0
runtime.sigNotOnStack(0x10, 0x14000051078, 0x14000080008)
    runtime/signal_unix.go:1065 +0x118 fp=0x14000051000 sp=0x14000050fd0 pc=0x163c59728
runtime.adjustSignalStack(0x10, 0x14000080008, 0x140000510a8)
    runtime/signal_unix.go:592 +0x25c fp=0x14000051070 sp=0x14000051000 pc=0x163c5858c
runtime.sigtrampgo(0x10, 0x14000051210, 0x14000051278)
    runtime/signal_unix.go:480 +0x8c fp=0x140000510f0 sp=0x14000051070 pc=0x163c580cc
runtime.sigtrampgo(0x10, 0x14000051210, 0x14000051278)
    <autogenerated>:1 +0x1c fp=0x14000051120 sp=0x140000510f0 pc=0x163c7e3fc
runtime.sigtramp()
    runtime/sys_darwin_arm64.s:227 +0x4c fp=0x140000511e0 sp=0x14000051120 pc=0x163c7cecc

atavism commented 4 months ago

@jigar-f Interesting, thanks. Is this after running the following?

make darwin ffigen && flutter run -d macOS

atavism commented 4 months ago

@jigar-f Can you check if the app is still crashing for you with the latest changes I just pushed?

jigar-f commented 4 months ago

@atavism, still crashing. I generated new bindings as well. If you want to reproduce the crash, try running the application several times: run the app, stop it, then run and stop it again. After 2 or 3 such cycles you might get the crash.

atavism commented 4 months ago

@jigar-f Could you please let me know the contents of your settings.yaml? Also, any luck deleting the Lantern home directory (~/Library/Application\ Support/Lantern) first?

atavism commented 4 months ago

@jigar-f I think I found another bug. It looks like we weren't checking for missing dialers when initializing bandit the first time and may be attempting to dial with a nil Dialer:

sync.(*Mutex).Lock(...)
    sync/mutex.go:90
github.com/getlantern/cmux/v2.(*dialer).Dial(0x14001abaa80, {0x1592d1490, 0x14001356230}, {0x0, 0x0}, {0x0, 0x0})
    github.com/getlantern/cmux/v2@v2.0.0-20230301223233-dac79088a4c0/dialer.go:53 +0xc0 fp=0x14003f110d0 sp=0x14003f10fd0 pc=0x1587d0d10
github.com/getlantern/cmux/v2.(*dialer).Dial-fm({0x1592d1490?, 0x14001356230?}, {0x0?, 0x15913ce80?}, {0x0?, 0x158d1f56c?})
    <autogenerated>:1 +0x54 fp=0x14003f11120 sp=0x14003f110d0 pc=0x1587d1fc4
github.com/getlantern/flashlight/v7/chained.(*multiplexedImpl).dialServer(0x159bc6dc8?, 0x1591164e0?, {0x1592d1490?, 0x14001356230?})
    github.com/getlantern/flashlight/v7@v7.6.90/chained/multiplexed_impl.go:51 +0x44 fp=0x14003f11160 sp=0x14003f11120 pc=0x158843014
github.com/getlantern/flashlight/v7/chained.dialOrigin.func1(0x140013bd700?)
    github.com/getlantern/flashlight/v7@v7.6.90/chained/dialer.go:181 +0x3c fp=0x14003f11190 sp=0x14003f11160 pc=0x15883f2fc
github.com/getlantern/flashlight/v7/chained.(*proxy).reportedDial(0x140004ec000, 0x14003f11388)
    github.com/getlantern/flashlight/v7@v7.6.90/chained/proxy.go:522 +0xf0 fp=0x14003f11280 sp=0x14003f11190 pc=0x158849320
github.com/getlantern/flashlight/v7/chained.dialOrigin(0x14003f11470, {0x1592d1490?, 0x14001356230?}, 0x140004ec000, {0x158d0ce44, 0x7}, {0x14003e1f527, 0x19})
    github.com/getlantern/flashlight/v7@v7.6.90/chained/dialer.go:180 +0x6c fp=0x14003f113b0 sp=0x14003f11280 pc=0x15883ed9c
github.com/getlantern/flashlight/v7/chained.(*proxy).DialContext(0x140004ec000, {0x1592d1490, 0x14001356230}, {0x158d0ce44, 0x7}, {0x14003e1f527, 0x19})
    github.com/getlantern/flashlight/v7@v7.6.90/chained/dialer.go:137 +0x264 fp=0x14003f114b0 sp=0x14003f113b0 pc=0x15883e554
github.com/getlantern/flashlight/v7/bandit.(*BanditDialer).DialContext(0x14000170fc0, {0x1592d1490, 0x14001356230}, {0x158d0ce44, 0x7}, {0x14003e1f527, 0x19})
    github.com/getlantern/flashlight/v7@v7.6.90/bandit/bandit.go:130 +0x260 fp=0x14003f116b0 sp=0x14003f114b0 pc=0x158534180

I'm wondering if you have any luck with the changes I just pushed (in particular the update to flashlight)? Thanks

jigar-f commented 4 months ago

That's great that we found another bug, but it's still crashing for me. I also feel it has something to do with code other than Go. Here is the thread, but I am not so sure though.

Also, here is the settings.yaml:

addr: 127.0.0.1:57724
autoLaunch: true
autoReport: true
country: IN
googleAds: true
lang: en_us
localHTTPToken: ""
migratedDeviceIDForUserID: 0
proxyAll: false
salt: ""
socksAddr: 127.0.0.1:57726
systemProxy: true
uiAddr: ""
userFirstVisit: false
userID: 0
userLoggedIn: false
userPro: false
userToken: ""

jigar-f commented 4 months ago

Here is the method:

This can only happen if non-Go code called sigaction without setting the SS_ONSTACK flag.

// This is called if we receive a signal when there is a signal stack
// but we are not on it. This can only happen if non-Go code called
// sigaction without setting the SS_ONSTACK flag.
func sigNotOnStack(sig uint32, sp uintptr, mp *m) {
    println("signal", sig, "received but handler not on signal stack")
    print("mp.gsignal stack [", hex(mp.gsignal.stack.lo), " ", hex(mp.gsignal.stack.hi), "], ")
    print("mp.g0 stack [", hex(mp.g0.stack.lo), " ", hex(mp.g0.stack.hi), "], sp=", hex(sp), "\n")
    throw("non-Go code set up signal handler without SA_ONSTACK flag")
}

atavism commented 4 months ago

That's great that we found another bug, but it's still crashing for me. I also feel it has something to do with code other than Go. Here is the thread, but I am not so sure though.

Ok, thanks for checking

atavism commented 4 months ago

OK, I'm finally getting some useful information by catching and handling the panic with panicwrap:

goroutine 209 gp=0x14000abbdc0 m=nil [select]:
runtime.gopark(0x14000a8af88?, 0x2?, 0x2?, 0x0?, 0x14000a8af54?)
    runtime/proc.go:402 +0xc8 fp=0x14000a8ae00 sp=0x14000a8ade0 pc=0x1680435b8
runtime.selectgo(0x14000a8af88, 0x14000a8af50, 0x1692d9118?, 0x0, 0x14000c81788?, 0x1)
    runtime/select.go:327 +0x614 fp=0x14000a8af10 sp=0x14000a8ae00 pc=0x168056b34
github.com/getlantern/lantern-client/desktop/ws.(*wsconn).write(0x14001525dd0)
    github.com/getlantern/lantern-client/desktop/ws/ws.go:197 +0x7c fp=0x14000a8afb0 sp=0x14000a8af10 pc=0x168b809fc
github.com/getlantern/lantern-client/desktop/ws.(*clientChannels).ServeHTTP.gowrap1()
    github.com/getlantern/lantern-client/desktop/ws/ws.go:107 +0x28 fp=0x14000a8afd0 sp=0x14000a8afb0 pc=0x168b7fd78
runtime.goexit({})
    runtime/asm_arm64.s:1222 +0x4 fp=0x14000a8afd0 sp=0x14000a8afd0 pc=0x16807dc34
created by github.com/getlantern/lantern-client/desktop/ws.(*clientChannels).ServeHTTP in goroutine 184
    github.com/getlantern/lantern-client/desktop/ws/ws.go:107 +0x3b4
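
A fatal runtime error such as the one earlier in this thread cannot be caught with `recover`, which is why a wrapper process is useful here. As far as I understand it, panicwrap re-executes the program and watches the child's stderr for crash output. The sketch below shows that general idea using only the standard library; it does not use panicwrap's API, and `crashLine` and `runMonitored` are hypothetical names.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
	"strings"
)

// crashLine returns the first line of output that looks like the start
// of a Go crash report ("panic: ..." or "fatal error: ...").
func crashLine(output string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "panic: ") || strings.HasPrefix(line, "fatal error: ") {
			return line, true
		}
	}
	return "", false
}

// runMonitored runs a command, mirrors its stderr to ours, and reports
// any crash line the child printed before it exited.
func runMonitored(name string, args ...string) (string, bool, error) {
	var captured bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = io.MultiWriter(os.Stderr, &captured)
	err := cmd.Run()
	line, crashed := crashLine(captured.String())
	return line, crashed, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: monitor <command> [args...]")
		return
	}
	line, crashed, err := runMonitored(os.Args[1], os.Args[2:]...)
	fmt.Println("crashed:", crashed, "line:", line, "err:", err)
}
```

Because the monitoring happens in a separate process, the full goroutine dump survives even when the child is killed by the runtime.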

atavism commented 4 months ago

I have a new PR for this: https://github.com/getlantern/lantern-client/pull/1132