Hammerspoon / hammerspoon

Staggeringly powerful macOS desktop automation with Lua
http://www.hammerspoon.org
MIT License
12.02k stars 584 forks source link

Coroutines don't work in Hammerspon #2306

Closed asmagill closed 4 years ago

asmagill commented 4 years ago

At all.

With the following from a coroutine tutorial (https://riptutorial.com/lua/example/11751/create-and-use-a-coroutine):

thread2 = coroutine.create(function()
        for n = 1, 5 do
            print("honk "..n)
            coroutine.yield()
        end
    end)

as soon as I enter coroutine.resume(thread2), I get:

false   NSInvalidArgumentException: NSConcreteAttributedString initWithString:: nil value
(
    0   CoreFoundation                      0x00007fff4bcb7acd __exceptionPreprocess + 256
    1   libobjc.A.dylib                     0x00007fff763b9a17 objc_exception_throw + 48
    2   CoreFoundation                      0x00007fff4bcb78ff +[NSException raise:format:] + 201
    3   Foundation                          0x00007fff4dec59b6 -[NSConcreteAttributedString initWithString:] + 222
    4   Foundation                          0x00007fff4dee2392 -[NSConcreteAttributedString initWithString:attributes:] + 27
    5   internal.so                         0x0000000115fa369e console_printStyledText + 3198
    6   LuaSkin                             0x000000010de00107 luaD_precall + 5239
    7   LuaSkin                             0x000000010dde63fd luaV_execute + 142605
    8   LuaSkin                             0x000000010de06f19 resume + 1065
    9   LuaSkin                             0x000000010dcd91ac luai_objcttry + 428
    10  LuaSkin                             0x000000010ddf8ed3 luaD_rawrunprotected + 1155
    11  LuaSkin                             0x000000010de056ce lua_resume + 4334
    12  LuaSkin                             0x000000010dd7231a auxresume + 186
    13  LuaSkin                             0x000000010dd71ce9 luaB_coresume + 121
    14  LuaSkin                             0x000000010de00107 luaD_precall + 5239
    15  LuaSkin                             0x000000010dde6a57 luaV_execute + 144231
    16  LuaSkin                             0x000000010de0415c luaD_call + 236
    17  LuaSkin                             0x000000010de04532 luaD_callnoyield + 194
    18  LuaSkin                             0x000000010dd9d86a f_call + 410
    19  LuaSkin                             0x000000010dcd91ac luai_objcttry + 428
    20  LuaSkin                             0x000000010ddf8ed3 luaD_rawrunprotected + 1155
    21  LuaSkin                             0x000000010de0a031 luaD_pcall + 1089
    22  LuaSkin                             0x000000010dd9c0e8 lua_pcallk + 4872
    23  LuaSkin                             0x000000010ddf5ffd luaB_xpcall + 189
    24  LuaSkin                             0x000000010de00107 luaD_precall + 5239
    25  LuaSkin                             0x000000010dde63fd luaV_execute + 142605
    26  LuaSkin                             0x000000010de0415c luaD_call + 236
    27  LuaSkin                             0x000000010de04532 luaD_callnoyield + 194
    28  LuaSkin                             0x000000010dd9d86a f_call + 410
    29  LuaSkin                             0x000000010dcd91ac luai_objcttry + 428
    30  LuaSkin                             0x000000010ddf8ed3 luaD_rawrunprotected + 1155
    31  LuaSkin                             0x000000010de0a031 luaD_pcall + 1089
    32  LuaSkin                             0x000000010dd9c0e8 lua_pcallk + 4872
    33  LuaSkin                             0x000000010dca4de5 -[LuaSkin protectedCallAndTraceback:nresults:] + 709
    34  Hammerspoon                         0x000000010d63d21b MJLuaRunString + 1147
    35  Hammerspoon                         0x000000010d631889 -[MJConsoleWindowController run:] + 345
    36  Hammerspoon                         0x000000010d631d26 -[MJConsoleWindowController tryMessage:] + 1014
    37  AppKit                              0x00007fff494e8644 -[NSApplication(NSResponder) sendAction:to:from:] + 312
    38  AppKit                              0x00007fff49552992 -[NSControl sendAction:to:] + 86
    39  AppKit                              0x00007fff494e468f -[NSTextField textDidEndEditing:] + 924
    40  Hammerspoon                         0x000000010d636288 -[HSGrowingTextField textDidEndEditing:] + 584
    41  CoreFoundation                      0x00007fff4bc64346 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ + 12
    42  CoreFoundation                      0x00007fff4bc642c0 ___CFXRegistrationPost_block_invoke + 63
    43  CoreFoundation                      0x00007fff4bc6422a _CFXRegistrationPost + 404
    44  CoreFoundation                      0x00007fff4bc6c6d8 ___CFXNotificationPost_block_invoke + 87
    45  CoreFoundation                      0x00007fff4bbd5064 -[_CFXNotificationRegistrar find:object:observer:enumerator:] + 1642
    46  CoreFoundation                      0x00007fff4bbd4417 _CFXNotificationPost + 732
    47  Foundation                          0x00007fff4de5ba7b -[NSNotificationCenter postNotificationName:object:userInfo:] + 66
    48  AppKit                              0x00007fff495f4efb -[NSTextView(NSPrivate) _giveUpFirstResponder:] + 415
    49  AppKit                              0x00007fff49521ca6 -[NSTextView doCommandBySelector:] + 194
    50  AppKit                              0x00007fff49521bba -[NSTextInputContext(NSInputContext_WithCompletion) doCommandBySelector:completionHandler:] + 228
    51  AppKit                              0x00007fff49516e1a -[NSKeyBindingManager(NSKeyBindingManager_MultiClients) interpretEventAsCommand:forClient:] + 2972
    52  AppKit                              0x00007fff49be2fdf __84-[NSTextInputContext _handleEvent:options:allowingSyntheticEvent:completionHandler:]_block_invoke_5 + 341
    53  AppKit                              0x00007fff49be2e75 __84-[NSTextInputContext _handleEvent:options:allowingSyntheticEvent:completionHandler:]_block_invoke_3.784 + 74
    54  AppKit                              0x00007fff4951df23 -[NSTextInputContext tryHandleEvent_HasMarkedText_withDispatchCondition:dispatchWork:continuation:] + 87
    55  AppKit                              0x00007fff49be2dfb __84-[NSTextInputContext _handleEvent:options:allowingSyntheticEvent:completionHandler:]_block_invoke.781 + 237
    56  HIToolbox                           0x00007fff4aec7efb __TSMProcessRawKeyEventWithOptionsAndCompletionHandler_block_invoke_5 + 70
    57  HIToolbox                           0x00007fff4aec6db9 ___ZL23DispatchEventToHandlersP14EventTargetRecP14OpaqueEventRefP14HandlerCallRec_block_invoke + 110
    58  AppKit                              0x00007fff49bdd70e __55-[NSTextInputContext handleTSMEvent:completionHandler:]_block_invoke.265 + 575
    59  AppKit                              0x00007fff49518512 __55-[NSTextInputContext handleTSMEvent:completionHandler:]_block_invoke_2 + 74
    60  AppKit                              0x00007fff4951849a -[NSTextInputContext tryHandleTSMEvent_HasMarkedText_withDispatchCondition:dispatchWork:continuation:] + 87
    61  AppKit                              0x00007fff49517c92 -[NSTextInputContext handleTSMEvent:completionHandler:] + 1749
    62  AppKit                              0x00007fff49517545 _NSTSMEventHandler + 306
    63  HIToolbox                           0x00007fff4ae5d22e _ZL23DispatchEventToHandlersP14EventTargetRecP14OpaqueEventRefP14HandlerCallRec + 1422
    64  HIToolbox                           0x00007fff4ae5c5df _ZL30SendEventToEventTargetInternalP14OpaqueEventRefP20OpaqueEventTargetRefP14HandlerCallRec + 371
    65  HIToolbox                           0x00007fff4ae5c465 SendEventToEventTargetWithOptions + 45
    66  HIToolbox                           0x00007fff4aec3f8f SendTSMEvent_WithCompletionHandler + 383
    67  HIToolbox                           0x00007fff4aec43fa __SendUnicodeTextAEToUnicodeDoc_WithCompletionHandler_block_invoke + 387
    68  HIToolbox                           0x00007fff4aec4268 __SendFilterTextEvent_WithCompletionHandler_block_invoke + 221
    69  HIToolbox                           0x00007fff4aec3fde SendTSMEvent_WithCompletionHandler + 462
    70  HIToolbox                           0x00007fff4aec3de3 SendFilterTextEvent_WithCompletionHandler + 225
    71  HIToolbox                           0x00007fff4aec3aa4 SendUnicodeTextAEToUnicodeDoc_WithCompletionHandler + 280
    72  HIToolbox                           0x00007fff4aec384e __utDeliverTSMEvent_WithCompletionHandler_block_invoke_2 + 283
    73  HIToolbox                           0x00007fff4aec36ad __utDeliverTSMEvent_WithCompletionHandler_block_invoke + 355
    74  HIToolbox                           0x00007fff4aec34cb TSMKeyEvent_WithCompletionHandler + 598
    75  HIToolbox                           0x00007fff4aec325a __TSMProcessRawKeyEventWithOptionsAndCompletionHandler_block_invoke_4 + 250
    76  HIToolbox                           0x00007fff4aec3089 __TSMProcessRawKeyEventWithOptionsAndCompletionHandler_block_invoke_3 + 257
    77  HIToolbox                           0x00007fff4aec2dce __TSMProcessRawKeyEventWithOptionsAndCompletionHandler_block_invoke_2 + 282
    78  HIToolbox                           0x00007fff4aec2b32 __TSMProcessRawKeyEventWithOptionsAndCompletionHandler_block_invoke + 274
    79  HIToolbox                           0x00007fff4aec2127 TSMProcessRawKeyEventWithOptionsAndCompletionHandler + 3398
    80  AppKit                              0x00007fff49be2cd9 __84-[NSTextInputContext _handleEvent:options:allowingSyntheticEvent:completionHandler:]_block_invoke_3.779 + 110
    81  AppKit                              0x00007fff49be22d6 __204-[NSTextInputContext tryTSMProcessRawKeyEvent_orSubstitution:dispatchCondition:setupForDispatch:furtherCondition:doubleSpaceSubstitutionCondition:doubleSpaceSubstitutionWork:dispatchTSMWork:continuation:]_block_invoke.734 + 115
    82  AppKit                              0x00007fff49be21c7 -[NSTextInputContext tryTSMProcessRawKeyEvent_orSubstitution:dispatchCondition:setupForDispatch:furtherCondition:doubleSpaceSubstitutionCondition:doubleSpaceSubstitutionWork:dispatchTSMWork:continuation:] + 245
    83  AppKit                              0x00007fff49be2892 -[NSTextInputContext _handleEvent:options:allowingSyntheticEvent:completionHandler:] + 1286
    84  AppKit                              0x00007fff49be2086 -[NSTextInputContext _handleEvent:allowingSyntheticEvent:] + 107
    85  AppKit                              0x00007fff49516004 -[NSView interpretKeyEvents:] + 209
    86  AppKit                              0x00007fff49515e2d -[NSTextView keyDown:] + 726
    87  AppKit                              0x00007fff49363367 -[NSWindow(NSEventRouting) _reallySendEvent:isDelayedEvent:] + 6840
    88  AppKit                              0x00007fff49361667 -[NSWindow(NSEventRouting) sendEvent:] + 478
    89  AppKit                              0x00007fff49201889 -[NSApplication(NSEvent) sendEvent:] + 2953
    90  AppKit                              0x00007fff491ef5c0 -[NSApplication run] + 755
    91  AppKit                              0x00007fff491deac8 NSApplicationMain + 777
    92  Hammerspoon                         0x000000010d62b3d2 main + 34
    93  libdyld.dylib                       0x00007fff77b883d5 start + 1
)

I suspected this might happen if the coroutine actually invoked anything that used LuaSkin, but had forgotten that we no longer use lua_pcall anywhere by itself -- it's always invoked indirectly through LuaSkin's protectedCallAndTraceback:nresults:.

As I'm coming to understand it:

Which is half-way to making the whole thing support multi-threading (at the Application level, not the psuedo co-routine lua level) anyways (see point 3 at https://stackoverflow.com/a/54046050).

So while the fix might be a "good idea"™, implementing it will be a giant PITA.

Unless I'm missing something obvious (somebody smarter please tell me that I am! Please, please, please!)

asmagill commented 4 years ago

Ok, not quite "don't work at all"...

if the code which creates and uses the coroutine is in one block, and doesn't use anything which involves LuaSkin (i.e. it's pure lua, not even a single function we've added that uses checkArgs:), then it will work:

This works (when pasted into the console as one "line" of input):

a = ""
cothread2 = coroutine.create(function()
    for n = 1, 5 do
        a = a .. "honk "..n.."\n"
        coroutine.yield()
    end
end)
while coroutine.status(cothread2) ~= "dead" do coroutine.resume(cothread2) end
print(a)

This crashes (when pasted into the console as one "line" of input) -- yeah, it's a stupid example, but hs.canCheckForUpdates was the first thing I found that only uses LuaSkin for a single [skin checkArgs:LS_TBREAK];, about the simplest usage I could think of):

a = ""
cothread2 = coroutine.create(function()
    for n = 1, 5 do
        a = a .. tostring(hs.canCheckForUpdates()) .. "\n"
        coroutine.yield()
    end
end)
while coroutine.status(cothread2) ~= "dead" do coroutine.resume(cothread2) end
print(a)

edit: made it more clear that blocks should be pasted in as one entry if testing in the console

asmagill commented 4 years ago

Obviously few/none of our users currently utilize coroutines, or we'd have heard something by now, but it is a feature lua promotes, so its only a matter of time...

Possible short term band-aids that I can think of:

In any case, if we do end up adding #2304, hs_yield can't be invoked within a coroutine because we haven't implemented lua_lock and lua_unlock (yet); but I think I can easily make this detect such a situation by checking lua_isyieldable() so this issue isn't a show-stopper (though it does remove the need for one of the tests I was considering)

More thoughts if I think of them...

latenitefilms commented 4 years ago

I've tried to use coroutines in the past, but it just made my head hurt so I gave up. These crashes could explain some of the issues I was running into at the time.

asmagill commented 4 years ago

Ok, after some more pondering...

lua_lock and lua_unlock actually shouldn't be necessary, if we stay on a single application thread...

We'll still need to make LuaSkin multi lua_State* compatible for coroutine support (which is also a necessary step for supporting multiple application threads), but since only a single "lua thread" will ever be active at a given instant in a singe application threaded model, lua_lock/unlock don't help us.


@cmsj, some ideas I'd like your comments on about how to make LuaSkin support different lua_State values (these both assume adding sharedWithState:(lua_State *L) and changing functions in our modules -- but not in delegates or other callbacks -- to use it instead of the shared contructor):

  1. add (readonly?) property to store initial lua_State into during creation; when shared called, set self._L to this value; when sharedWithState:L called, set self._L to the L passed in. Pros: simple, Cons: may have to call sharedWithState:L multiple times within a function if it's possible the current value of skin._L might have changed -- currently (I think) this only affects the proposed hs_yield, but maybe also others, if we ever write C functions which utilize/participate in coroutines; won't be sufficient if we ever add true multi application thread support.

  2. create (and store to avoid re-creating?) new LuaSkin instance for each new lua_State seen. Pros: muti application thread safe (if/when we decide to go there), as long as no single lua_State is allowed to run on more than one thread at a time; Cons: more changes to LuaSkin because registered type helpers, etc. have to be moved to class properties; how do we detect when lua_thread is garbage collected (so we can remove its LuaSkin instance, if we saved it)? How do we handle a reload if multiple instances still exist (if we can attach a __gc to a lua_thread type, this might be a non-issue, but I don't think we can)

Other possibilities I'm missing?


Re multiple application threads -- If we do end up adding hs.yield, and maybe even consider attaching it to a debug hook so it runs after every n lua calls, the extra effort to support multiple application threads becomes less attractive to me. IMHO the times when yielding isn't sufficient might be better served by moving something into Objective-C itself which is a lot faster and most things become easy (well, easier) to multithread and trigger a callback upon completion.

Just my (current) 2 cents... ask me again in a month and I'll probably think something totally different 😜

asmagill commented 4 years ago
  1. modify the lua5.3 source for coroutine.yield and coroutine.resume so that they change LuaSkin's internal property for lua_State; Pros: simplest of all requiring no (or very minimal) changes to existing modules; Cons: patching lua source further, which means may have to revisit with 5.4, not sure yet how nested coroutines can be, may need our own stack like structure to remember what to set the property back to. Potentially screwing up something that none of us are that familiar with, so will we really notice subtle changes/errors? Would complicate supporting multiple application threads, if that is something we decide to do later.
cmsj commented 4 years ago

If I’m reading this right, it seems like we’ve broken coroutines by building LuaSkin (and really everything else) to assume that there is only ever one lua_State object. That being the case, is it actually a good idea to try and remove such a core assumption?

asmagill commented 4 years ago

In short, yes... though it's still probably a good idea to keep the main lua thread's state, maybe as a class property, so callbacks/delegates/other places where no lua state is passed in don't have to jump through too many hoops. As long as we make sure that the primary state always runs on the main application thread (even if we decide to allow application threading at a later date), they'll run one at a time and shouldn't pose a problem.

But otherwise, yes, we really should support multiple states everywhere else.

cmsj commented 4 years ago

I'm pretty wary of making this kind of change for a Lua feature that's apparently so unpopular that nobody has noticed it's been broken for over 5 years.

For the pure C entrypoints (ie callbacks/delegates), how would we do anything other than fetch the main state object? As-in, how would the C entrypoint even know which state object it should try and use? If it could find the right state object, how would it know whether or not the co-routine that created it, still exists?

Even assuming we could figure that sort of stuff out, this would also mean some fairly significant changes to LuaSkin - not only do we only have one lua_State object at the moment, we intentionally only have one LuaSkin object no matter how many times [LuaSkin shared] is called. Would each new co-routine state object need to inherit all of the registered modules/helpers?

I take the point that it's bad we've broken co-routines, but I'm honestly not sure I want to fix them again, if the cost is significantly higher complexity, and a big risk of breaking in all sorts of weird new ways.

latenitefilms commented 4 years ago

FWIW - I guess my question would be WHY do we need coroutines, and is there a better way to achieve the same outcome with another solution? Most of the things I'd use coroutines for can be implemented in other ways, such as running things on another thread in Objective-C-land and using a callback.

For now, I'd just make coroutines = nil and document it somewhere, so it's clear that coroutines aren't supported in Hammerspoon. If someone else NEEDS them in the future, we can always re-assess.

sharpobject commented 4 years ago

Hello,

Coroutines are a fundamental control flow construct that ships with the language. Removing them is comparable to removing the function keyword. I only recently learned that Hammerspoon exists, and I am willing to try to fix the mess that currently prevents it from working with coroutines and with moonjit.

asmagill commented 4 years ago

Callbacks should always run on the main lua_State (the lua docs call it a thread, which I've sometimes called a "lua thread" to be pedantic, but it's really primarily a separate isolated lua stack) --the initial lua_State created when the lua engine was initialized -- because they are a new code-block triggered by the external event, not the resuming of something a coder suspended.

That's why I recommend creating the new method sharedWithState: and applying it only to functions and methods we add, but not within the callback/delegate/event code that handles externally triggered application events. If we go with the first option I outline above, the internal changes to LuaSkin are manageable and there is still only one instance required.

As to why its taken so long to find out... well, most of our users aren't programmers per se, and those that are didn't start with Lua... coroutines are a fundamental part of the language, and the primary answer to the fact that it's by default single threaded. I've meant to learn about them for a long time, but since (1) potentially long running lua, no matter whether its on 1 "lua thread" or 1000 causes Hammerspoon to block, but with #2304 that might no longer be an issue, and (2) I personally can code relatively quickly in Objective-C and can use a background thread it if I want.

If a true lua developer wanted to do something, they'd expect coroutines to exist and wouldn't bother without them.

As I have come to understand the problem, the fix outlined as the first option above won't be difficult, just tedious.

I'm opposed to burying our head in the sand by removing that the library -- we won't be using Lua, then, but a subset of it. Subsets are fine in memory constrained environments, not on a modern desktop.

As a first step, I think adding the following after the NSThread check in the sharedWithDelegate: method of LuaSkin will at least stop the application crashes, though 90% of our module functions/methods still won't be usable within a coroutine:

    int isMainThread = lua_pushthread(self.L) ;
    lua_pop(self.L, 1) ; // remove thread we just pushed onto the stack
    if (!isMainThread) {
        luaL_error(self.L, "LuaSkin shared object incompatible with coroutines") ;
        return nil ;
    }

I'll test this in a few hours to be certain.

asmagill commented 4 years ago

Ok, no, that's stupid... we'd just be checking the saved state, which we already know is the main "thread"... let me think some more...

sharpobject commented 4 years ago

If the limitation is only that you cannot call Hammerspoon API functions from coroutines, you may be able to "fix" the problem from the Lua side by implementing asymmetric coroutines on top of symmetric coroutines on top of asymmetric coroutines, placing a trampoline inside of your new fake coroutine.resume, and wrapping Hammerspoon API functions to make sure they are called from the main lua_thread even when the user tries to use them from inside a coroutine.

A sketch of what this would look like, on the library side, which I've called co_wrap.lua over here:

local select = select
local table_insert = table.insert
local table_remove = table.remove
-- utility to save multiple returns or varargs for later use
local NIL = {}
local function drop_one(x, ...)
  return ...
end
local save_exprs
save_exprs = function(return_first_x, ...)
  if return_first_x == 0 then
    local ret = {}
    for i=1,select("#", ...) do
      local item = select(i, ...)
      if item == nil then
        item = NIL
      end
      ret[i] = item
    end
    return ret
  end
  return select(1, ...), save_exprs(return_first_x-1, drop_one(...))
end
local unsave_exprs
unsave_exprs = function(t, idx, len)
  if not idx then
    idx = 0
    len = #t
  end
  if idx == len then
    return
  end
  idx = idx + 1
  local ret = t[idx]
  if ret == NIL then
    ret = nil
  end
  return ret, unsave_exprs(t, idx, len)
end

local co_stack = {}
local co_running = {}
local n_cos = 0
local real_yield = coroutine.yield
local real_resume = coroutine.resume
local real_status = coroutine.status
local real_create = coroutine.create
function coroutine.resume(co, ...)
  -- can't have the same coroutine running twice in our fake stack of coroutines 
  if co_running[co] then
    return false, "cannot resume non-suspended coroutine"
  end
  -- if we are called from the main thread, resume and trampoline
  if n_cos == 0 then
    n_cos = n_cos + 1
    co_stack[n_cos] = co
    co_running[co] = true
    local ok, op
    local stuff = save_exprs(0, co, ...)
    while true do
      local prev_co = stuff[1]
      ok, op, stuff = save_exprs(2, real_resume(unsave_exprs(stuff)))
      if ok and op == "resume" then
        -- trampoline and pass stuff to nested co
        n_cos = n_cos + 1
        co_stack[n_cos] = stuff[1]
        co_running[stuff[1]] = true
      elseif ok and (op == "yield" or op == "die") then
        -- coroutine yielded or died
        -- trampoline and pass stuff to parent co
        co_running[prev_co] = nil
        co_stack[n_cos] = nil
        n_cos = n_cos - 1
        if n_cos == 0 then
          -- parent co is topmost lua thread, so return
          return true, unsave_exprs(stuff)
        else
          table_insert(stuff, 1, true)
          table_insert(stuff, 1, co_stack[n_cos])
        end
      elseif ok and op == "call" then
        -- coroutine wants to call some function from the main thread
        -- get the results and pass them back to same co
        local func = table_remove(stuff, 1)
        stuff = save_exprs(0, func(unsave_exprs(stuff)))
        table_insert(stuff, 1, co_stack[n_cos])
      elseif not ok then
        -- coroutine errored
        -- trampoline and pass stuff to parent co
        co_running[prev_co] = nil
        co_stack[n_cos] = nil
        n_cos = n_cos - 1
        if n_cos == 0 then
          -- parent co is topmost lua thread, so return
          return false, op, unsave_exprs(stuff)
        else
          table_insert(stuff, 1, op)
          table_insert(stuff, 1, false)
          table_insert(stuff, 1, co_stack[n_cos])
        end
      else
        error("Fake coroutine lib: Not sure what happened")
      end
    end
  else
    -- tell the loop on the main thread that I'd like to resume pls
    return real_yield("resume", co, ...)
  end
end

function coroutine.yield(...)
  -- tell the loop on the main thread that I'd like to yield pls
  return real_yield("yield", ...)
end

function coroutine.status(co)
  if co_running[co] then
    return "running"
  else
    return real_status(co)
  end
end

function coroutine.create(f)
  return real_create(function(...)
    -- tell the loop on the main thread that I'd like to die pls
    return "die", f(...)
  end)
end

function coroutine.call_from_main_thread(f)
  return function(...)
    if n_cos == 0 then
      return f(...)
    else
      -- tell the loop on the main thread that I'd like to call pls
      return real_yield("call", f, ...)
    end
  end
end

--TODO: also reimplement coroutine.wrap

--coroutine.running should work fine as is.

On the user side:


require"co_wrap"
get_main_thread = coroutine.call_from_main_thread(coroutine.running)
get_my_thread = coroutine.running
a = coroutine.create(function()
    local cos = {}
    local function print_some_numbers(n)
      for i=1,n do print("inner", i, coroutine.yield(i*i)) end
      print(get_main_thread())
      print(get_my_thread())
    end
    for i=1,3 do
      cos[i] = coroutine.create(print_some_numbers)
    end
    local n = 1
    while true do
      for i=1, #cos do
        if coroutine.status(cos[i]) ~= "dead" then
          print("outer", coroutine.resume(cos[i],n))
          n = n + 1
        end
      end
      print(coroutine.yield())
    end
  end)

for i=1,5 do print(coroutine.resume(a)) end

This prints

outer   true    1
outer   true    1
outer   true    1
true

inner   1   4
thread: 0x7f901b800608  true
thread: 0x7f901b408c08  false
outer   true
inner   1   5
outer   true    4
inner   1   6
outer   true    4
true

inner   2   7
thread: 0x7f901b800608  true
thread: 0x7f901b408f88  false
outer   true
inner   2   8
outer   true    9
true

inner   3   9
thread: 0x7f901b800608  true
thread: 0x7f901b408928  false
outer   true
true

true

If you remove lines 8 and 9, then you can get output that doesn't change based on the addresses of coroutines, and you can try running it with and without co_wrap. Over here it produces identical output with and without co_wrap on lua 5.3.5. This isn't a guarantee that I haven't written any bugs, of course.

Edit: While this may let users run code that uses coroutines, it won't be as fast as you might like. If you'd like to go fast, I think the right play is to fix the problem for real and to use some version of luajit.

asmagill commented 4 years ago

Maybe... If I'm understanding this correctly (can't test it right now, not at my computer, so this is all though experiments for me for the next couple of hours or so), it still puts the onus on the user to know what can and can't be run within a coroutine, and if they get it wrong, the Hammerspoon application crashes.

Long term, we need to implement sharedWithState: in LuaSkin.

Short term, I need a way to detect, without having access the the lua_State passed into a C function, whether or not the state stored in the LuaSkin properties matches it or not...

Let me make that clearer, cuz I only understand it because I'm thinking about it right now and it still sounds like mud to me...

For the following, the "active" lua_State is the one passed into the C function by the lua engine.

Lets say I have a C function accessible from lua:

static int my_func(lua_State *L) {
    LuaSkin *skin = [LuaSkin shared] ; // get the shared LuaSkin instance
    [skin checkArgs:LS_TNUMBER, LS_TBREAK] ; // check the args on the stack
    ... do our stuff
    return 1;
}

[skin checkArgs:...] does some type checking on the stack passed in... internally it uses a lot of lua_type(...) and lua_to...(...) function calls, but all of them are done with skin.L as the state, not the state that was passed into my_func. And this is what causes the application crash when skin.L != L.

We could add if (skin.L != L) { return luaL_error(L, "not coroutine safe") ; } after the first line, but we'd have to do it for every function and method we've added, and that's a crap-ton. Plus it doesn't make them work in coroutines, when simply changing skin.L for the duration of this C function would, so I'm willing to do the crap-ton once to switch [LuaSkin shared] to [LuaSkin sharedWithState:L] but not twice: once to make it safer (i.e. non-crashy), then again to make it work with coroutines, once I've made the necessary changes to LuaSkin itself.

It would be so much easier to test whether the "active" lua_State == skin.L within [LuaSkin shared], but as you can see, that's not passed L, and so far, rummaging through the source for lua, I'm not seeing a way to determine which luaState is the "active" one (i.e. the one that initiated the most recent `lua(p)call`).

To complicate matters, when handling a callback from the macOS, LuaSkin *skin = [LuaSkin shared] ; gets invoked when the lua engine is idle and NO lua_(p)call has been performed (so there is no "active" state)... now granted, in this case, a coroutine didn't cause the callback, so using skin.L on the main lua thread is fine, but it does complicate determining which lua_State is "active", assuming I can find a way at all...

sharpobject commented 4 years ago

Hammerspoon could wrap all the functions that must be wrapped before exposing them to the user. I'm not sure if you have a big list of those somewhere though.

sharpobject commented 4 years ago

I don't think there's an easy way to find the active state, but there is an easy way to find the main state. So it's too bad you're not asking the opposite question of "I have an active state, and can you tell me if it's the main one?"

It seems like to solve the problem for real, checkArgs should accept a lua_State.

asmagill commented 4 years ago

There are actually a lot of other helpers that LuaSkin provides, so I think the better solution is to change the first line to LuaSkin *skin = [LuaSkin sharedWithState:L] in our functions, but leave it LuaSkin *skin = [LuaSkin shared] in our callback handlers...

Internally, if withState is called, skin.L is set to that, and if not (or NULL is passed, which might be clearer in intent), it will set it to the main one which we can cache in a class property.

That requires the least changes (outside of LuaSkin itself) but still leaves open the possibility for a crash if one our developers uses the wrong version of shared in the wrong place, which is why I was hoping for a way to detect it...

Oh well, I think then moving all of them to sharedWithState: is the way to go, but allow it to be NULL; that way, using [LuaSkin shared] won't compile and the dev will have to make an explicit choice.

@cmsj, not sure if you've been following these last comments, but unless you have objections, I think this is what I'm going to start doing on a new branch tonight/tomorrow so we can put it out for testing hopefully in a couple of days.

cmsj commented 4 years ago

@asmagill I am following along. I have no objections to trying to fix this, but I have some fairly big concerns about the viability of this.

I hope I'm wrong, but my suspicion is that we're going to very quickly find that all sorts of things are going to break - e.g. we don't just have an assumption that there is only one lua_State object, we also assume there's only one LuaSkin object which is 100% coupled to the single Lua state. We assume that each library can have its own global refTable and that assumption is now on very shaky ground. Those are just the ones that come immediately to mind.

tl;dr I don't currently see how any of our C API can be safe when it's being interacted with by multiple Lua states.

asmagill commented 4 years ago

I think we’re OK as long as we stick to one application thread, since every function (with the exception of the proposed hs.yield and I think I know how to address that) runs to completion before another L could sneak in there.

A few of the functions which offer to run in the background but don’t return a userdata will have to be examined to make sure they ask for a new skin object rather than inherit their parents before the dispatch_async, but I’ve been meaning to re-evaluate these anyways because they are also prone to crashing if the callback doesn’t trigger till after a reload, so...

At any rate, the only way to know is to try it; we can keep it in a branch so only devs run it for a while if you like.

cmsj commented 4 years ago

@asmagill I think I'm mostly concerned about all the state in C that will now have multiple Lua environments mutating it.

I agree that the only way to be sure is to try, so maybe if we have a branch with the modifications to LuaSkin and then we need to hammer the HS API from the coroutines and the main thread and see what happens.

asmagill commented 4 years ago

Admittedly this is a sort of band-aid over a framework that should probably (eventually) be refactored now that we've realized that one of the core assumptions that went into making it was mistaken, but I think it should work, once we've addressed all dispatch_* blocks that assume the persistence of skin and don't bother setting it again...

I'll keep you guys posted and let you know when a fork is ready for testing.

asmagill commented 4 years ago

Closing as this has been addressed in master; reopen if you think I've missed some point that still needs to be addressed.