mratsim / weave

A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead

Is there a manual or tutorial for weave? #183

Closed dawnmy closed 1 year ago

dawnmy commented 1 year ago

I am learning Nim and found that Weave is great for task and data parallelism. But after looking at the examples in the GitHub README, I am still quite confused about how to use Weave. Is there a more detailed manual or tutorial for Weave? And will Weave replace the threadpool module in Nim in the near future?

mratsim commented 1 year ago

There is none at the moment.

The 2 main uses of Weave are Task Parallelism and Data Parallelism.

If you have used async: async APIs across languages were inspired by how Cilk did task parallelism in 1995. Nim's default threadpool also works similarly.

https://github.com/mratsim/weave#task-parallelism

import weave

proc fib(n: int): int =
  # int64 on x86-64
  if n < 2:
    return n

  let x = spawn fib(n-1)
  let y = fib(n-2)

  result = sync(x) + y

proc main() =
  var n = 20

  init(Weave)
  let f = fib(n)
  exit(Weave)

  echo f

main()

Basically, you use spawn to (maybe) delegate a function call to another thread, and you get a handle to its result. You call sync when you need the result; this "blocks" the current thread of execution until the result is available. While "blocked", that thread is not idle: it processes other tasks in the threadpool.
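
To make the spawn/sync pattern concrete on something other than fib, here is a minimal sketch that sums an array in two halves. `sumRange` and the data are made up for illustration; only `init`, `spawn`, `sync` and `exit` come from Weave. It assumes compilation with `--threads:on`, and it passes a raw pointer instead of the seq because the seq is managed by the GC.

```nim
import weave

proc sumRange(buf: ptr UncheckedArray[int], lo, hi: int): int =
  ## Hypothetical helper: sequentially sum buf[lo ..< hi].
  for i in lo ..< hi:
    result += buf[i]

proc main() =
  # Fill the data with 1..1000.
  var data = newSeq[int](1000)
  for i in 0 ..< data.len:
    data[i] = i + 1
  let buf = cast[ptr UncheckedArray[int]](data[0].addr)
  let mid = data.len div 2

  init(Weave)
  # The first half may run on another thread; `left` is a handle to its result.
  let left = spawn sumRange(buf, 0, mid)
  # The second half runs on the current thread in the meantime.
  let right = sumRange(buf, mid, data.len)
  # sync waits for the spawned task (while helping with other tasks).
  let total = sync(left) + right
  exit(Weave)

  doAssert total == 500500  # 1 + 2 + ... + 1000
  echo total

main()
```

The split into one spawned half and one local half mirrors the fib example above: spawn one branch, compute the other directly, then sync.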

The second API is data parallelism, also known as parallel-for. This is what is mainly used in scientific computing, for example to work on large vectors and matrices.

import weave

func initialize(buffer: ptr UncheckedArray[float32], len: int) =
  for i in 0 ..< len:
    buffer[i] = i.float32

proc transpose(M, N: int, bufIn, bufOut: ptr UncheckedArray[float32]) =
  ## Transpose a MxN matrix into a NxM matrix with nested for loops

  parallelFor j in 0 ..< N:
    captures: {M, N, bufIn, bufOut}
    parallelFor i in 0 ..< M:
      captures: {j, M, N, bufIn, bufOut}
      bufOut[j*M+i] = bufIn[i*N+j]

proc main() =
  let M = 200
  let N = 2000

  let input = newSeq[float32](M*N)
  # We can't work with seq directly as it's managed by GC, take a ptr to the buffer.
  let bufIn = cast[ptr UncheckedArray[float32]](input[0].unsafeAddr)
  bufIn.initialize(M*N)

  var output = newSeq[float32](N*M)
  let bufOut = cast[ptr UncheckedArray[float32]](output[0].addr)

  init(Weave)
  transpose(M, N, bufIn, bufOut)
  exit(Weave)

main()
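
The transpose example never reads its output. If you want to use the results on the main thread while the runtime is still running, a possible sketch is below. It assumes `syncRoot(Weave)`, the barrier the Weave README describes for the root task, is called from the same scope that called `init(Weave)`. `doubled` and the buffer contents are hypothetical, and the program is assumed to be compiled with `--threads:on`.

```nim
import weave

func doubled(i: int): float32 =
  ## Hypothetical per-element computation.
  float32(i) * 2

proc main() =
  let n = 1000
  var data = newSeq[float32](n)
  # As in the transpose example, hand Weave a raw pointer, not the seq.
  let buf = cast[ptr UncheckedArray[float32]](data[0].addr)

  init(Weave)
  parallelFor i in 0 ..< n:
    captures: {buf}
    buf[i] = doubled(i)
  # Wait for all outstanding tasks before reading the buffer.
  syncRoot(Weave)

  for i in 0 ..< n:
    doAssert data[i] == doubled(i)
  exit(Weave)

main()
```

Each iteration writes to a distinct index, so the loop body needs no locking.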

Weave is really tuned towards compute tasks. As explained in https://nim-lang.org/blog/2021/02/26/multithreading-flavors.html, there are many cases where Weave is not the proper tool, for example handling network requests. Those require fairness and low latency, not high throughput.

The library that comes closer to a replacement for the threadpool module is nim-taskpools: https://github.com/status-im/nim-taskpools

dawnmy commented 1 year ago

Thank you for your reply and the examples.