kwhat / libuiohook

A multi-platform C library to provide global keyboard and mouse hooks from userland.

Deadlock when used with NodeJS NAPI module #188

Open 0x11-dev opened 6 months ago

0x11-dev commented 6 months ago

The dispatch_sync_f(dispatch_get_main_queue(), XX, XX) call here, at line 280, causes a deadlock. To resolve this, we need to make a copy of the CGEventRef with CGEventRef eventCopy = CGEventCreateCopy(event); instead of using a getter from the main thread.

Related issue: https://github.com/SnosMe/uiohook-napi/issues/23#issuecomment-2016526280

The call graph is as follows:

Call graph:
    2562 Thread_21998699   DispatchQueue_1: com.apple.main-thread  (serial)
    + 2562 start  (in dyld) + 1942  [0x7ff8001b93a6]
    +   2562 node::Start(int, char**)  (in node) + 239  [0x10a9abfcf]
    +     2562 node::NodeMainInstance::Run()  (in node) + 115  [0x10aa206d3]
    +       2562 node::NodeMainInstance::Run(int*, node::Environment*)  (in node) + 97  [0x10aa20a51]
    +         2562 node::SpinEventLoop(node::Environment*)  (in node) + 291  [0x10a918743]
    +           2562 uv_run  (in node) + 417  [0x10b385ba1]
    +             2562 uv__io_poll  (in node) + 932  [0x10b398994]
    +               2562 kevent  (in libsystem_kernel.dylib) + 10  [0x7ff80050772e]
    2562 Thread_21998739
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Run()  (in node) + 361  [0x10aa49529]
    +       2562 uv_run  (in node) + 417  [0x10b385ba1]
    +         2562 uv__io_poll  (in node) + 932  [0x10b398994]
    +           2562 kevent  (in libsystem_kernel.dylib) + 10  [0x7ff80050772e]
    2562 Thread_21998740
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 node::(anonymous namespace)::PlatformWorkerThread(void*)  (in node) + 379  [0x10aa4668b]
    +       2562 node::TaskQueue<v8::Task>::BlockingPop()  (in node) + 72  [0x10aa49748]
    +         2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +           2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +             2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998741
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 node::(anonymous namespace)::PlatformWorkerThread(void*)  (in node) + 379  [0x10aa4668b]
    +       2562 node::TaskQueue<v8::Task>::BlockingPop()  (in node) + 72  [0x10aa49748]
    +         2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +           2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +             2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998742
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 node::(anonymous namespace)::PlatformWorkerThread(void*)  (in node) + 379  [0x10aa4668b]
    +       2562 node::TaskQueue<v8::Task>::BlockingPop()  (in node) + 72  [0x10aa49748]
    +         2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +           2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +             2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998743
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 node::(anonymous namespace)::PlatformWorkerThread(void*)  (in node) + 379  [0x10aa4668b]
    +       2562 node::TaskQueue<v8::Task>::BlockingPop()  (in node) + 72  [0x10aa49748]
    +         2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +           2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +             2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998756
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 node::inspector::(anonymous namespace)::StartIoThreadMain(void*)  (in node) + 19  [0x10aab25a3]
    +       2562 uv_sem_wait  (in node) + 16  [0x10b3939e0]
    +         2562 semaphore_wait_trap  (in libsystem_kernel.dylib) + 10  [0x7ff8005029ea]
    2562 Thread_21998757
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 worker  (in node) + 89  [0x10b381549]
    +       2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +         2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +           2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998758
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 worker  (in node) + 89  [0x10b381549]
    +       2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +         2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +           2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998759
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 worker  (in node) + 89  [0x10b381549]
    +       2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +         2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +           2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998760
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 worker  (in node) + 89  [0x10b381549]
    +       2562 uv_cond_wait  (in node) + 9  [0x10b3933e9]
    +         2562 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
    +           2562 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7ff80050560e]
    2562 Thread_21998821
    + 2562 thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
    +   2562 _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
    +     2562 hook_thread_proc  (in node.napi.node) + 109  [0x10f81194d]  uiohook_worker.c:108
    +       2562 hook_run  (in node.napi.node) + 1048  [0x10f813a68]  input_hook.c:1389
    +         2562 CFRunLoopRun  (in CoreFoundation) + 40  [0x7ff800696d9e]
    +           2562 CFRunLoopRunSpecific  (in CoreFoundation) + 557  [0x7ff80061b352]
    +             2562 __CFRunLoopRun  (in CoreFoundation) + 2700  [0x7ff80061c3a6]
    +               2562 __CFRunLoopDoSource1  (in CoreFoundation) + 534  [0x7ff80061d72e]
    +                 2562 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__  (in CoreFoundation) + 41  [0x7ff80061d7f7]
    +                   2562 __CFMachPortPerform  (in CoreFoundation) + 238  [0x7ff80064bb1c]
    +                     2562 eventTapMessageHandler(__CFMachPort*, void*, long, void*)  (in SkyLight) + 151  [0x7ff805cd3795]
    +                       2562 _XPostEventTapData  (in SkyLight) + 290  [0x7ff805f0daf9]
    +                         2562 processEventTapData(void*, unsigned int, unsigned int, unsigned int, unsigned char*, unsigned int)  (in SkyLight) + 598  [0x7ff805cd3a49]
    +                           2562 hook_event_proc  (in node.napi.node) + 2469  [0x10f812a05]  input_hook.c:0
    +                             2562 process_key_pressed  (in node.napi.node) + 418  [0x10f812eb2]  input_hook.c:286
    +                               2562 _dispatch_sync_f_slow  (in libdispatch.dylib) + 175  [0x7ff8003a72ef]
    +                                 2562 __DISPATCH_WAIT_FOR_QUEUE__  (in libdispatch.dylib) + 307  [0x7ff8003a76c3]
    +                                   2562 _dispatch_thread_event_wait_slow  (in libdispatch.dylib) + 40  [0x7ff80039ac3a]
    +                                     2562 _dlock_wait  (in libdispatch.dylib) + 46  [0x7ff80039adb2]
    +                                       2562 __ulock_wait  (in libsystem_kernel.dylib) + 10  [0x7ff800504222]
    2562 Thread_21998851
      2562 start_wqthread  (in libsystem_pthread.dylib) + 15  [0x7ff80053db97]
        2562 _pthread_wqthread  (in libsystem_pthread.dylib) + 416  [0x7ff80053eca0]
          2562 __workq_kernreturn  (in libsystem_kernel.dylib) + 10  [0x7ff800504192]

Total number in stack (recursive counted multiple, when >=5):
        11       _pthread_start  (in libsystem_pthread.dylib) + 99  [0x7ff800542202]
        11       thread_start  (in libsystem_pthread.dylib) + 15  [0x7ff80053dbab]
        8       __psynch_cvwait  (in libsystem_kernel.dylib) + 0  [0x7ff800505604]
        8       _pthread_cond_wait  (in libsystem_pthread.dylib) + 1211  [0x7ff80054276b]
        8       uv_cond_wait  (in node) + 9  [0x10b3933e9]

Sort by top of stack, same collapsed (when >= 5):
        __psynch_cvwait  (in libsystem_kernel.dylib)        20496
        kevent  (in libsystem_kernel.dylib)        5124
        __ulock_wait  (in libsystem_kernel.dylib)        2562
        __workq_kernreturn  (in libsystem_kernel.dylib)        2562
        semaphore_wait_trap  (in libsystem_kernel.dylib)        2562

kwhat commented 4 months ago

To resolve this, we need to make a copy of the CGEventRef with CGEventRef eventCopy = CGEventCreateCopy(event); instead of using a getter from the main thread.

I am probably never going to be able to reproduce this issue, so I will need to rely on you both to provide a solution and to test that solution to ensure it works. When you say I need to make a copy of the event, are we talking about the event_ref passed to the function and assigned via tis_keycode_message->event = event_ref;? Let me know and I can put the fix in.

rohitsangwan01 commented 4 months ago

@kwhat I am trying to port this library to Dart using version 1.3.0 and am getting the same issue; however, it works fine after commenting out this and this line.

GeorgeBarlow commented 2 months ago

@kwhat I am trying to port this library to Dart using version 1.3.0 and am getting the same issue; however, it works fine after commenting out this and this line.

This works for me, but can we get some idea of the status of this? Is it something that can be looked into?

kwhat commented 2 months ago

We definitely can't comment out those two lines and expect key typed events to work on macOS. It's running into a deadlock because it's trying to run that code on the primary runloop, and evidently these other languages are using that runloop for something else...

Did either of the other solutions mentioned in this ticket work? We should probably figure out why dispatch_sync_f is blocking indefinitely on the main run loop. I suspect Node, and probably Dart, are doing something that blocks execution on this thread, and unfortunately Apple's TIS functions can only be run from that thread.
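One plausible reading of the stack trace above is that the main thread is parked in uv_run / kevent and never services the main dispatch queue, so the hook thread's synchronous dispatch never returns. The sketch below is a portable pthreads model of that situation, not GCD code and not libuiohook's implementation: every name in it (main_slot, slot_drain, slot_call_with_timeout, mark_ran) is hypothetical. It shows a caller handing a job to a "main thread" slot and giving up after a timeout if nobody drains it, instead of waiting forever. It assumes a single calling thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

typedef void (*job_fn)(void *ctx);

/* Hypothetical one-job mailbox standing in for "the main queue". */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t done_cond;
    job_fn fn;   /* pending job, or NULL when the slot is empty */
    void *ctx;
    bool done;   /* set once the pending job has run */
} main_slot;

void slot_init(main_slot *s) {
    assert(s != NULL);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->done_cond, NULL);
    s->fn = NULL;
    s->ctx = NULL;
    s->done = false;
}

/* Called by the main thread from its own loop. Runs the pending job,
 * if any, while holding the lock. Returns true if a job ran. */
bool slot_drain(main_slot *s) {
    bool ran = false;
    pthread_mutex_lock(&s->lock);
    if (s->fn != NULL) {
        s->fn(s->ctx);
        s->fn = NULL;
        s->done = true;
        ran = true;
        pthread_cond_signal(&s->done_cond);
    }
    pthread_mutex_unlock(&s->lock);
    return ran;
}

/* Called by the hook thread. Waits up to timeout_ms for the main thread
 * to run the job. On timeout the job is withdrawn, so ctx is never
 * touched after this function returns false. */
bool slot_call_with_timeout(main_slot *s, job_fn fn, void *ctx, long timeout_ms) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->lock);
    s->fn = fn;
    s->ctx = ctx;
    s->done = false;
    int rc = 0;
    while (!s->done && rc == 0) {
        rc = pthread_cond_timedwait(&s->done_cond, &s->lock, &deadline);
    }
    bool ran = s->done;
    if (!ran) {
        s->fn = NULL; /* withdraw the job: nobody serviced the slot in time */
    }
    pthread_mutex_unlock(&s->lock);
    return ran;
}

/* Demo job: records that it ran. */
void mark_ran(void *ctx) {
    *(bool *)ctx = true;
}
```

If nothing ever calls slot_drain, slot_call_with_timeout returns false after the timeout, which is the analogue of the hook thread being stuck in _dispatch_sync_f_slow, except that it gets to fall back. Whether a fallback is acceptable for key typed events is a separate question that this sketch does not answer.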

GeorgeBarlow commented 2 months ago

We definitely can't comment out those two lines and expect key typed events to work on macOS. It's running into a deadlock because it's trying to run that code on the primary runloop, and evidently these other languages are using that runloop for something else...

Did either of the other solutions mentioned in this ticket work? We should probably figure out why dispatch_sync_f is blocking indefinitely on the main run loop. I suspect Node, and probably Dart, are doing something that blocks execution on this thread, and unfortunately Apple's TIS functions can only be run from that thread.

I'll try out the other solutions now, but for context, I'm using C++ and I found this issue still arises in a sandbox-like environment. However, since the C demo is working, I'll have to re-check my implementation to ensure it's not an issue on my end.

rohitsangwan01 commented 2 months ago

I have built a Dart wrapper for this library, and with the solution I mentioned above it works fine in Dart as well as Flutter: https://github.com/rohitsangwan01/uiohook_dart/tree/main/uiohook_dart The issue happens on macOS only.

GeorgeBarlow commented 2 months ago

@kwhat I rewrote my implementation of libuiohook and it seems to run perfectly now. I tried to keep it quite basic, and that seems to have done the trick, so I have no more issues (I think I must've been doing something dodgy with threading).

However, and I'm happy to start another issue if need be, I want to ask whether you have had any issues debugging libuiohook on Windows. To reproduce: using Visual Studio 2022 on Windows, I attached the debugger to the demo_hook example and hit 'break all' during libuiohook's event loop, which brought my computer to a standstill. Is this something you've witnessed at all?

kwhat commented 2 months ago

I rewrote my implementation of libuiohook and it seems to run perfectly now. I tried to keep it quite basic, and that seems to have done the trick, so I have no more issues (I think I must've been doing something dodgy with threading).

That's good to hear. I have tried to keep as much of the threading out of this library as possible for a number of reasons, and as a result the event callback always executes on the thread you started the hook from. If you need to do work in the callback, it's probably best to copy the memory to another thread and do the work there. The only other threading happens on macOS, because the TIS function calls have to happen on the main runloop for undocumented reasons from Apple.

There are a number of upstream bugs about debugging on Windows: #412 #264 #232 and #137. One of these has a possible solution in it where, on Windows, a debugger attachment can be detected and the hook could be stopped. The same problem should exist on macOS in theory and will also present itself on Linux/Unix as soon as I get the Wayland code in place. I have not tested the proposed solution, but if you want to give it a try, please open a new bug and we can start with Windows.
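As a rough illustration of "copy the memory to another thread and do the work there", here is a minimal sketch of a bounded queue that a hook callback could push event copies into while a separate worker thread pops and processes them. It is an assumption-laden example, not part of libuiohook: demo_event is a hypothetical stand-in for the library's event struct, and event_queue, queue_init, queue_push and queue_pop are made-up names. In real use the push would happen inside whatever callback you register with the library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's event struct. */
typedef struct {
    int type;
    unsigned short keycode;
} demo_event;

#define QUEUE_CAPACITY 64

/* Fixed-size ring buffer guarded by a mutex and a condition variable. */
typedef struct {
    demo_event items[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest stored event */
    size_t count;  /* number of stored events */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
} event_queue;

void queue_init(event_queue *q) {
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Intended for the hook callback: copies the event by value and never
 * waits for the consumer. Returns false (event dropped) if the queue
 * is full. */
bool queue_push(event_queue *q, const demo_event *ev) {
    bool stored = false;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAPACITY) {
        q->items[(q->head + q->count) % QUEUE_CAPACITY] = *ev;
        q->count++;
        stored = true;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);
    return stored;
}

/* Intended for the worker thread: blocks until an event is available,
 * then copies it into *out. */
void queue_pop(event_queue *q, demo_event *out) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0) {
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    pthread_mutex_unlock(&q->lock);
}
```

Copying by value means the worker never holds a pointer into memory owned by the hook thread, and dropping events when the queue is full keeps the callback from stalling on a slow consumer. Both are design choices made for this sketch, not requirements stated in the thread.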