mratsim / weave

A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead

Can Weave spawn a thread that sends a Channel message to the main channel without blocking the main thread? #198

Open PhilippMDoerner opened 6 months ago

PhilippMDoerner commented 6 months ago

Heyho, I am trying to get an example to work where I spawn a thread to perform a task in weave that eventually sends a Channel message, while the proc itself returns nothing.

I want to go for a Channel message as I want a setup with a thread running the event-loop of a GUI that regularly reads messages sent to it from possibly multiple other "Task"-threads.

It should not ever block waiting for any of the "Task"-threads to finish as it should run its own event-loop and stay responsive to user-input.

The code would look roughly like this (if spawn only spawned a single thread for that one proc call):

import weave
import std/os

var chan: Channel[string]
chan.open()

proc doThing() {.gcsafe.} =
  sleep(1000) # Very compute intensive task or one where you waitFor an async operation like an HTTP request
  chan.send("Sending")

proc guiLoop() =
  var counter: int

  while true: # A GUI loop
    let resp = chan.tryRecv
    counter.inc
    if resp.dataAvailable:
      echo resp.msg
      echo "Counter: ", counter
      break

    sleep(10)

proc main() =
  init(Weave)

  spawn doThing()

  guiLoop()

  exit(Weave)

main()

Now obviously the above does not work. I'm not entirely sure what it does, but I'm very sure it spawns more than one thread, as it maxes out my 16-core CPU and never even starts to execute doThing.

Is there a way to express that with weave?

I initially assumed that this fell under "task parallelism", since I would want to do 2 tasks in parallel (run the GUI thread, execute "doThing"), but I'm really not that knowledgeable about the terminology thrown around.

mratsim commented 6 months ago

Weave provides isReady, see https://github.com/nim-lang/RFCs/issues/347#check-if-a-result-has-been-computed-required.

That said, Weave's internals use "work-requesting" instead of "work-stealing", so load balancing requires cooperation between threads. If you offload something that blocks, for example something that does IO like writing to a file or stdout, you may block the whole runtime.
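
A minimal sketch of what that polling could look like for the original example. It assumes spawn returns a Flowvar and that isReady, sync and loadBalance(Weave) behave as described in Weave's documentation and the linked RFC; it is an illustration, not code from this issue:

import weave
import std/os

proc doThing(): int =
  # Stand-in for heavy work. Note the caveat above: a blocking call like
  # sleep occupies one Weave worker for its whole duration.
  sleep(1000)
  result = 42

proc main() =
  init(Weave)
  let fv = spawn doThing()

  var counter: int
  while true:              # the GUI loop
    loadBalance(Weave)     # cooperate with the scheduler while polling
    counter.inc
    if fv.isReady():
      echo "Result: ", sync(fv)   # already computed, so this does not block
      echo "Counter: ", counter
      break
    sleep(10)

  exit(Weave)

main()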

Also, I've added an experimental API called submit to allow submitting tasks from any thread, to improve interop with async and to ensure execution happens on a thread independent from async; see https://github.com/mratsim/weave/blob/master/rfcs/multithreading_apis.md#experimental-non-blocking-task-parallelism-api.

I do not think it's the best way forward (ideally we would have a threadpool built for IO), but it at least exists.
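
Roughly, the submit flow looks like the sketch below. The proc names (runInBackground, setupSubmitterThread, waitUntilReady, submit, waitFor) and the setup/teardown order are assumptions taken from the linked document; check rfcs/multithreading_apis.md for the exact signatures:

import weave
import std/os

proc doThing(): int =
  sleep(1000)              # stand-in for heavy work
  result = 42

proc main() =
  # Run the Weave runtime on its own background thread;
  # this thread only submits work and never executes tasks itself.
  var runtimeThread: Thread[void]
  runtimeThread.runInBackground(Weave)
  setupSubmitterThread(Weave)
  waitUntilReady(Weave)

  let pending = submit doThing()   # does not block this thread
  # ... the GUI event loop would keep running here ...
  echo waitFor(pending)            # only blocks when the value is actually needed

  teardownSubmitterThread(Weave)
  # Runtime shutdown elided; see the linked document for the exact sequence.

main()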

PhilippMDoerner commented 6 months ago

If I understand approaches using isReady correctly, that would require me to manage every task I create myself, throughout the multiple iterations of the while-loop that may occur on Thread A between a task being spawned on Thread A and it finishing on Thread B. E.g. I'd need a thread-local queue for Thread A into which it puts all tasks it generates, and at the end of each loop it checks every task in the queue to see if it's ready. If one is, it removes it from the queue, takes its result, and handles it with an appropriate proc.

That's doable, but sounds like considerable overhead. By having a task simply send messages back to the main thread via a channel, I circumvent a significant chunk of that. A main thread with an event loop like the one here (see guiLoop) is forced to check for messages on its channel regardless, since it could be receiving many messages not just from the task but also from other threads (thinking of it like an actor model).

mratsim commented 5 months ago

> If I understand approaches using isReady correctly, that would require me to manage every task I create myself, throughout the multiple iterations of the while-loop that may occur on Thread A between a task being spawned on Thread A and it finishing on Thread B. E.g. I'd need a thread-local queue for Thread A into which it puts all tasks it generates, and at the end of each loop it checks every task in the queue to see if it's ready. If one is, it removes it from the queue, takes its result, and handles it with an appropriate proc.
>
> That's doable, but sounds like considerable overhead. By having a task simply send messages back to the main thread via a channel, I circumvent a significant chunk of that. A main thread with an event loop like the one here (see guiLoop) is forced to check for messages on its channel regardless, since it could be receiving many messages not just from the task but also from other threads (thinking of it like an actor model).

Can't you do the following:

proc foo(ctx: ptr Context, arg0: T, arg1: U, arg2: V): bool =
  # ... heavy processing
  ctx.signalReady() # The context has the taskID
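
One hypothetical way to flesh that out for this issue: make the context carry a pointer to the channel the GUI loop already polls, so signalReady is just a send. Context, taskId and the message format below are made up for illustration:

type Context = object
  chan: ptr Channel[string]   # the channel guiLoop already polls
  taskId: int

proc signalReady(ctx: ptr Context) =
  # "Signalling" is simply a message on the GUI thread's channel.
  ctx.chan[].send("task " & $ctx.taskId & " done")

proc foo(ctx: ptr Context, arg0: int): bool =
  # ... heavy processing on arg0 ...
  ctx.signalReady()
  result = true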

Otherwise your problem is quite similar to the design constraint I had for implementing dataflow parallelism, where one event or many events trigger dependent spawns: https://github.com/mratsim/weave/blob/b6255afa5816ee431dbf2f59cc6bc605d8d657b8/weave/parallel_tasks.nim#L367-L397

You can read the implementation here: https://github.com/mratsim/weave/blob/master/weave/cross_thread_com/flow_events.nim. It's multithreading-runtime agnostic and only needs threadsafe MPSC queues.
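
For reference, the event-based pattern looks roughly like the sketch below. The names newFlowEvent, trigger and the event-aware spawn call are assumptions inferred from the linked files; verify them against parallel_tasks.nim and flow_events.nim:

import weave
import std/os

proc producer(ev: FlowEvent) =
  sleep(1000)      # heavy work
  ev.trigger()     # only now do tasks depending on ev become runnable

proc consumer() =
  echo "runs after the event was triggered"

proc main() =
  init(Weave)
  let ev = newFlowEvent()
  spawnOnEvent ev, consumer()   # deferred spawn; exact call name assumed
  spawn producer(ev)
  syncRoot(Weave)               # wait for all tasks before shutting down
  exit(Weave)

main()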