migueldeicaza / SwiftTerm

Xterm/VT100 Terminal emulator in Swift
MIT License
993 stars 146 forks source link

Rapid input causes LocalProcessTerminalView hosting REPL processes to hang #294

Closed nkleemann closed 1 year ago

nkleemann commented 1 year ago

I'm using your package in SwiftUI on macOS to host a GHCI process. When rapidly sending large amounts of text to LocalProcessTerminalView, the view becomes unresponsive.

Steps to Reproduce

  1. Create an instance of LocalProcessTerminalView and start a REPL process like GHCI.
  2. Rapidly send large chunks of text to the instance.

Expected Behavior

The REPL process within the LocalProcessTerminalView should properly handle the incoming data and remain responsive, regardless of the input rate or size.

Actual Behavior

After rapidly sending large amounts of text, the LocalProcessTerminalView freezes and the hosted REPL process becomes unresponsive to further text input (manual or via send).


I've noticed the implementation of the send method uses DispatchIO. I'm wondering if it might be beneficial to provide a mechanism to notify the caller when the send operation is complete. This might potentially enable us to ensure that send is not called while it's already running, particularly in the context of a SwiftUI @MainActor situation. This could be one route towards a resolution, although of course I understand there may be complexities that I'm not aware of.

migueldeicaza commented 1 year ago

I am unable to reproduce this. I suspect there is a race condition and this is why you are able to reproduce and not me.

Would you be so kind to stop the process when it hangs and get the stack traces for all the threads involved?

nkleemann commented 1 year ago

Your are most certainly right about the problem being a race condition. I'm building an editor for music live coding. Users can, at any rate, send multiline strings of any size into a GHCI process I'm hosting with LocalProcessTerminalView. As it's send method uses DispatchIO, and the hang occurs every time when I'm calling send too often with large code blocks to send into GHCI, I think this is most certainly a threading problem.

Here's the function I'm using that calls .send:

func runBlock() async {
    guard await bootTidal() else {
        log("tidal session was stopped")
        return
    }
    let blockContent = self.getCurrentBlockContent()        
    let formattedBlockContent = ":{\n\(blockContent)\n:}\n"

    // todo: find a way to get a callback when this finishes, or queue calls here
    self.ghciSessionView?.send(txt: formattedBlockContent)
}

One potential solution from my end could be to implement a queue, but it would greatly simplify things if callers of send could be notified upon its completion or if they could pass a completion handler along with the text.

In your opinion, would this be a useful feature for other situations as well, and do you think it could justify a change on your end? Or perhaps you could recommend another way to address this issue?

migueldeicaza commented 1 year ago

oh the issue is you are sending the data as opposed to the reading of data?

That could explain why I didn’t reproduce. How much data is being sent?

This might be even be simpler, just a deadlock based on buffer sizes. Annoying to address, but easier to reason about.

nkleemann commented 1 year ago

I ran a quick test, with this multiline string (tidal cycles expression):

let block = """
:{
do
    setcps (150/60/4)
    d1 $ every 3 rev $ fast 8 $ note "<a2 d3> <c d ds g a>" # s "plaits"
        # engine (slow 8 $ run 16)
        # sustain (slow 8 $ sine)
        # cut 1
        # level (slow 8 $ range 0.1 0.2 sine)
        # lpf "[300|800|400]"
        # mu 3
        # delay 0.2
        # delaytime "0.3333"
        # delayfeedback 0.1
    d2 $ n "[[~ ~ ~ 0 ~ ~ 0 ~], [4*4|4*16]]" # s "gpu2"
        # speed 0.86
        # lpf "1000 1200 1400"
        # gain 1.1
    d3 $ every 3 (rev) $ loopAt 2 $ splice 8 "0 <1 6 [3|2]> 2 [3|5] 7 0 5 [1|2]" $ s "break:9"
        # pan 0.4
        # hpf 200
        # room 0.2
        # sz 0.91
        # gain 0.7
        # cut 1
    d4 $ degrade $ note "{a5 d6}%2"
        # s "hs1:6"
        # gverb 1
        # gsize 200
        # gdelay 20
        # gfdbk 0.92
        # gain 0.5
        # lpf "{3000 2000 1000}%3"
        # hpf 300
        |-| note "{12 7 0}%3"
:}
"""

When I called .send(txt: block) 4 times, with a short delay between the calls (50-200ms) with this string its enough to make the REPL process unresponsive. I think it's more than a rendering issue because not only does the LocalProcessTerminalView UI doesn't respond anymore, the connection to the spawned process (GHCI in my case) also seems to be lost. I can see the GHCI process running in activity monitor, but I can't send input to it anymore.

It's important to note that any size of input is okay, as long as you don't try to call send n times in quick succession. n seems to be directly proportional to the input size though 😁

migueldeicaza commented 1 year ago

Thanks, let me try to reproduce

migueldeicaza commented 1 year ago

Ok, I tried something like that, but can not reproduce.

I think the best thing to do is get a trace of all the threads once it get stuck. In Xcode, pause the program from the debug menu and then on the debugger console type the bt all command:

(lldb) bt all
nkleemann commented 1 year ago

Here's the stack trace just after the terminal view freezes:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x0000000191dabf14 libsystem_kernel.dylib`mach_msg2_trap + 8
    frame #1: 0x0000000191dbe240 libsystem_kernel.dylib`mach_msg2_internal + 80
    frame #2: 0x0000000191db4b78 libsystem_kernel.dylib`mach_msg_overwrite + 604
    frame #3: 0x0000000191dac290 libsystem_kernel.dylib`mach_msg + 24
    frame #4: 0x0000000191eca8b8 CoreFoundation`__CFRunLoopServiceMachPort + 160
    frame #5: 0x0000000191ec9198 CoreFoundation`__CFRunLoopRun + 1208
    frame #6: 0x0000000191ec858c CoreFoundation`CFRunLoopRunSpecific + 612
    frame #7: 0x000000019b6fddf4 HIToolbox`RunCurrentEventLoopInMode + 292
    frame #8: 0x000000019b6fdc30 HIToolbox`ReceiveNextEventCommon + 648
    frame #9: 0x000000019b6fd988 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 76
    frame #10: 0x00000001950e7f58 AppKit`_DPSNextEvent + 636
    frame #11: 0x00000001950e70f4 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 716
    frame #12: 0x00000001950db558 AppKit`-[NSApplication run] + 464
    frame #13: 0x00000001950b29a8 AppKit`NSApplicationMain + 880
    frame #14: 0x00000001ba121974 SwiftUI`___lldb_unnamed_symbol82468 + 148
    frame #15: 0x00000001bb280fa0 SwiftUI`___lldb_unnamed_symbol207338 + 164
    frame #16: 0x00000001bab0243c SwiftUI`static SwiftUI.App.main() -> () + 128
    frame #17: 0x0000000104b93ea8 TidalConduit`static TCApp.$main(self=TidalConduit.TCApp) at TCApp.swift:10:1
    frame #18: 0x0000000104b94308 TidalConduit`main at TCApp.swift:0
    frame #19: 0x0000000191a93f28 dyld`start + 2236
  thread #4, name = 'gputools.smt_poll.0x600002e3edc0'
    frame #0: 0x0000000191daf50c libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x0000000191c902d0 libsystem_c.dylib`nanosleep + 220
    frame #2: 0x0000000191c901e8 libsystem_c.dylib`usleep + 68
    frame #3: 0x00000001094439dc GPUToolsCapture`smt_poll_thread_entry(void*) + 108
    frame #4: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #7, name = 'caulk.messenger.shared:17'
    frame #0: 0x0000000191dabe90 libsystem_kernel.dylib`semaphore_wait_trap + 8
    frame #1: 0x000000019b40a024 caulk`caulk::semaphore::timed_wait(double) + 212
    frame #2: 0x000000019b409ed8 caulk`caulk::concurrent::details::worker_thread::run() + 36
    frame #3: 0x000000019b409bc8 caulk`void* caulk::thread_proxy<std::__1::tuple<caulk::thread::attributes, void (caulk::concurrent::details::worker_thread::*)(), std::__1::tuple<caulk::concurrent::details::worker_thread*> > >(void*) + 96
    frame #4: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #8, name = 'com.apple.audio.toolbox.AUScheduledParameterRefresher'
    frame #0: 0x0000000191dabe90 libsystem_kernel.dylib`semaphore_wait_trap + 8
    frame #1: 0x000000019b40a024 caulk`caulk::semaphore::timed_wait(double) + 212
    frame #2: 0x000000019b409ed8 caulk`caulk::concurrent::details::worker_thread::run() + 36
    frame #3: 0x000000019b409bc8 caulk`void* caulk::thread_proxy<std::__1::tuple<caulk::thread::attributes, void (caulk::concurrent::details::worker_thread::*)(), std::__1::tuple<caulk::concurrent::details::worker_thread*> > >(void*) + 96
    frame #4: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #9, name = 'com.apple.NSEventThread'
    frame #0: 0x0000000191dabf14 libsystem_kernel.dylib`mach_msg2_trap + 8
    frame #1: 0x0000000191dbe240 libsystem_kernel.dylib`mach_msg2_internal + 80
    frame #2: 0x0000000191db4b78 libsystem_kernel.dylib`mach_msg_overwrite + 604
    frame #3: 0x0000000191dac290 libsystem_kernel.dylib`mach_msg + 24
    frame #4: 0x0000000191eca8b8 CoreFoundation`__CFRunLoopServiceMachPort + 160
    frame #5: 0x0000000191ec9198 CoreFoundation`__CFRunLoopRun + 1208
    frame #6: 0x0000000191ec858c CoreFoundation`CFRunLoopRunSpecific + 612
    frame #7: 0x0000000195212508 AppKit`_NSEventThread + 172
    frame #8: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #14
    frame #0: 0x0000000108c6fa64 libsystem_pthread.dylib`start_wqthread
  thread #15
    frame #0: 0x0000000191dadbc8 libsystem_kernel.dylib`__workq_kernreturn + 8
  thread #16
    frame #0: 0x0000000108c6fa64 libsystem_pthread.dylib`start_wqthread
  thread #17
    frame #0: 0x0000000000000000

If this doesn't help I'll try to come up with a minimal example we can both run and verify in the coming days. Do you have GHC installed? If not I'll try it with a Python REPL.

migueldeicaza commented 1 year ago

Mhm. This stack trace does not show a deadlock.

nkleemann commented 1 year ago

I've come up with this quick&dirty example using a python REPL. On my machine I can reproduce the issue every time after pressing the button. You can tweak send call delay time and numbers in the Constants struct if you can't.

import SwiftUI
import SwiftTerm

struct Constants {

    static let python3Path = "/usr/bin/python3"
    // delay between .send calls in seconds
    static let sendDelay = 0.05
    // number of times to trigger .send
    static let sendCount = 4
    // dummy python code to send into repl
    static let blockContent = """
    import time

    def is_prime(n):
        if n <= 1:
            return False
        elif n <= 3:
            return True
        elif n % 2 == 0 or n % 3 == 0:
            return False
        i = 5
        while i * i <= n:
            if n % i == 0 or n % (i + 2) == 0:
                return False
            i += 6
        return True

    def print_first_n_primes(n):
        count = 0
        num = 2
        while count < n:
            if is_prime(num):
                print(num)
                count += 1
            num += 1

    start = time.time()
    print_first_n_primes(10000)
    end = time.time()
    print(f'Time taken: {end - start} seconds')
    """
}

@main
struct FreezingREPLApp: App {

    @StateObject var replViewManager = REPLViewManager()

    var body: some Scene {
        WindowGroup {
            ContentView()
                .environmentObject(replViewManager)
        }
    }
}

struct ContentView: View {

    @EnvironmentObject var replViewManager: REPLViewManager

    var body: some View {

        VStack {
            REPLView()

            Button(action: {
                replViewManager.sendTextToREPL()
            }) {
                Text("freeze the repl")
            }
        }
        .padding()
    }
}

class REPLViewManager: ObservableObject {

    private var replView: LocalProcessTerminalView?

    func setupREPLView(_ view: LocalProcessTerminalView) {
        view.font = .monospacedSystemFont(ofSize: 10, weight: .regular)
        replView = view
        replView?.startProcess(executable: Constants.python3Path)
    }

    func sendTextToREPL() {
        guard let repl = replView else {
            print("REPLView is not initialized")
            return
        }

        for i in 0..<Constants.sendCount {
            DispatchQueue.main.asyncAfter(deadline: .now() + Constants.sendDelay * Double(i)) {
                repl.send(txt: Constants.blockContent)
            }
        }
    }
}

struct REPLView: NSViewRepresentable {

    @EnvironmentObject var replViewManager: REPLViewManager

    func makeCoordinator() -> Coordinator {
        Coordinator(self)
    }

    func makeNSView(context: Context) -> LocalProcessTerminalView {
        let view = LocalProcessTerminalView(frame: CGRect(x: 0, y: 0, width: 200, height: 200))
        replViewManager.setupREPLView(view)
        return view
    }

    func updateNSView(_ nsView: LocalProcessTerminalView, context: Context) {}

    class Coordinator: NSObject {
        var parent: REPLView

        init(_ parent: REPLView) {
            self.parent = parent
        }
    }
}

After pressing the button I can see some output in the terminal - until it quickly stops. Then it becomes unresponsive to further input. If you watch the Activity Monitor before and after clicking the button you'll see that the python process doesn't crash - but somehow the LocalProcessTerminalView looses access to it (I guess).

migueldeicaza commented 1 year ago

This is fabulous! Thank you, I’ll take a look!

migueldeicaza commented 1 year ago

Fixed: your test case was invaluable, thank you so much for taking the time to put it together.

nkleemann commented 1 year ago

I can only give that back, thank you for the quick fix!