monome / norns

norns is many sound instruments.
http://monome.org
GNU General Public License v3.0

implement async os commands #779

Closed tehn closed 5 years ago

tehn commented 5 years ago

currently there are quite a few features that block. (wifi, file listings, etc).

UPDATE:

something like:

function callback_func(returned_txt)
  print(returned_txt)
end

util.cmd("ls ~", callback_func)

i can try this in c with a thread though i'm wondering how to parse the resulting txt as it may be long, ie not fit in an event data type. perhaps the txt gets sent line by line?

tehn commented 5 years ago

context: https://github.com/monome/norns/issues/779

catfact commented 5 years ago

i'm not sure what this issue is really about.

a coro is just a thing that can yield execution to its calling context, then be resumed an arbitrary later time with its old context (stack contents). by itself, i don't think a coro can actually execute things concurrently.

for example, the clocks lib uses coroutines to create a seamless asynchronous syntax in lua, but requires the backing of system threads to actually resume the clock coros that have yielded.

tehn commented 5 years ago

you're right, this is poorly worded.

one thing that would make sense is to create an os call command which can be asynchronous... provide a command and a callback upon completion. this would pretty much solve the use cases where blocking is a real problem.

edit: updated OP

tehn commented 5 years ago

ahead of an attempt to solve this, considering some options...

  1. should the os calls just be a queue? so that there's really only one user OS thread? otherwise it seems trickier to spawn threads for each call from lua, and track separate callback functions?

this however presents the issue of a time-consuming os process stalling later calls, but i guess that's exactly what happens now with lua... still, it could be more controlled in that it doesn't stall lua itself, so the behavior would be more predictable (and not stop metros, event processing, etc).

  2. for the callback function, should the data sent back be line-by-line, perhaps along with a command-finished flag? or should it wait til finished and throw a (potentially huge) chunk? lines would accommodate the event system better.

catfact commented 5 years ago

imho:

should the os calls just be a queue? so that there's really only one user OS thread?

yes. need OS worker thread with its own request queue.

should it wait til finished and throw a (potentially huge) chunk?

yes

i'd suggest something like (pseudocode):

//---- [from LVM main thread]

// called from lua; probably coroutine which yields after the call
function make_os_call_request: 
   allocate os_call_request_data:
       - string containing OS command, copied from lua stack
       - lua coroutine ID to resume when finished (cleaner than using callbacks directly) 
   add os_call_request_data pointer to work queue

// called from main event loop
function handle_os_call_response_event:
    copy os_call_response_data to lua stack (result string, coro ID)
    free os_call_response_data (pointer in main-loop event struct)
    call low-level lua handler, which will resume coro

//----- [from worker thread]

function handle_os_call_request:
   ... do blocking work (run the command in os_call_request_data) ...
   allocate os_call_response_data:
      - results string (copy from child process stdout)
      - coroutine id (copied from request data)
   free os_call_request_data
   push new event to main loop with os_call_response_data

in other words: requests use dedicated OS work queue. responses use the main event queue with dynamic allocation. we already do this for other event types. with reasonable amounts of data it's fine. (not many megabytes or whatever)

catfact commented 5 years ago

i guess since it's a one-shot thing, coroutines don't actually give any benefits here over plain functions.

so lua would be something like:

(in OS module or whatever)

local os = {}
os.callbacks = {}

function os.requestCall(cmd, cb)
   local id = os.getCallbackId()
   os.callbacks[id] = cb
   _os_request(cmd, id) -- weaver function
end

-- called from main loop -> weaver -> low-level handler
function os.handleResponse(results, id)
   local cb = os.callbacks[id]
   os.callbacks[id] = nil -- one-shot; release the reference
   if cb ~= nil then cb(results) end
end

-- helper; returns a fresh callback ID
os.current_id = 0
function os.getCallbackId()
   os.current_id = os.current_id + 1
   if os.current_id > 9999 then os.current_id = 1 end
   return os.current_id
end

return os

tehn commented 5 years ago

yes indeed this seems highly sensible. i'll re-acquaint myself with the event queue payload sizing...

catfact commented 5 years ago

me too...

ok. there are several event types which allocate extra payload memory and pass the pointer in the event_data envelope. this is totally fine as long as it doesn't happen constantly. (well, it's as fine as using dynamic allocation for event_data in the first place, which isn't great, it should use an object pool.)

payload allocation happens when posting a new event. AFAIK the only places this actually happens right now are:

1) when posting a REPL code line for evaluation: https://github.com/monome/norns/blob/master/matron/src/input.c#L47 (...which isn't 1000% awesome but i guess it's fine.)

2) when posting an incoming OSC message: https://github.com/monome/norns/blob/master/matron/src/osc.c#L92

... this is a little freakier. i guess the lo_ functions are performing sneaky allocation and expect the user to deallocate. (i vaguely remember looking this up some other time.) but frankly i'm a little leery of those lines in the OSC rx and would like to double-check that for leaks.

(the other event-with-payload types are "data" and "waveform" polls which aren't actually used right now.)

payload deallocation happens from the main event loop after handling the event, along with envelope deallocation: https://github.com/monome/norns/blob/master/matron/src/events.c#L284

using the appropriate case for the event type here: https://github.com/monome/norns/blob/master/matron/src/events.c#L108

... so, i'd suggest sticking to that pattern. it might be cleaner to make a smarter function for posting events that takes care of event-type-specific payload allocation in the event module itself, so all event memory mgmt is in one place. (i guess if liblo always allocates then that's a no-go, which is maybe why it's structured the way it is.)