import std / macros

template opUS(i: int = 1) {.pragma.} # operations / microsecond

type
  Master* = object ## Masters can spawn new tasks inside an `awaitAll` block.
    ...
    budget: Atomic[int]

const
  ThreadBudget* {.intdefine.} = 1000 ## avg time in nanoseconds (~1 us) we lose to scheduling a task

macro getOpUs(fn: typed; cp: typed{nkSym}): int =
  # Look up the `opUS` annotation on the spawned proc; default to 1.
  for p in fn[0].getImpl().pragma:
    if p.len > 0 and p[0].kind == nnkSym and p[0] == cp:
      return p[1]
  return newLit(1)

func fnCost(opUs: int): int = 1000 div opUs

template spawnImplNoRes(master: var Master; fn: typed) =
  let cost = getOpUs(fn, opUS).fnCost
  let budget = master.budget.fetchSub(cost) - cost
  if budget > 0 or stillHaveTime(master):
    if budget < 0 and shouldSend(master):
      master.budget.store(ThreadBudget)
    taskCreated master
    send PoolTask(m: addr(master), t: toTask(fn), result: nil)
  else:
    fn
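To make the budget accounting in spawnImplNoRes concrete, here is a minimal standalone sketch of just the arithmetic (the stillHaveTime/shouldSend refill paths and the real task queue are left out; the constants mirror the POC above):

```nim
import std / atomics

const ThreadBudget = 1000  # ns we are willing to lose per scheduling window

# Same mapping as in the POC: a function rated at `opUs` operations per
# microsecond costs `1000 div opUs` budget units per spawn.
func fnCost(opUs: int): int = 1000 div opUs

var budget: Atomic[int]
budget.store ThreadBudget

doAssert fnCost(1) == 1000   # default rating: one spawn eats the whole budget
doAssert fnCost(10) == 100   # a 10 op/us function allows ~10 spawns per window

# fetchSub returns the *old* value, hence the extra `- cost`,
# matching the `let budget = ... - cost` line in spawnImplNoRes.
let cost = fnCost(10)
let remaining = budget.fetchSub(cost) - cost
doAssert remaining == 900

echo "remaining budget: ", remaining
```

So with the default rating, a single spawn drains the whole window, while cheaper (higher-throughput) functions get proportionally more spawns before execution falls back inline.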
The idea is that the developer specifies the cost, not that they change ThreadBudget. For benchmarking, however, compile with -d:ThreadBudget:1 (1 ns) to get more spawns, or -d:ThreadBudget:1000 (1000 ns) to get fewer spawns.
My local results:
-d:ThreadBudget:00001 ~ 1.400 sec (spawn after burning 001 ns of our budget)
-d:ThreadBudget:00010 ~ 0.700 sec (spawn after burning 010 ns of our budget)
-d:ThreadBudget:00100 ~ 0.100 sec (spawn after burning 100 ns of our budget)
-d:ThreadBudget:01000 ~ 0.030 sec (spawn after burning 001 us of our budget)
-d:ThreadBudget:10000 ~ 0.010 sec (spawn after burning 010 us of our budget)
TODO:
The budget should be per MasterHandle, not per Master; see the dfs example in the README.
The "operations per microsecond" (opUS) naming is confusing.
Review the stillHaveTime logic.
Pick sane defaults (according to an old local test, a spawn takes ~1 us).
Alternative:
The initial idea was to use the effect system to track 'IO' and only spawn if the function does IO...
Or use the effect system to sum up the costs (I have no idea if that is possible).
Let the user annotate the cost of a function to decide whether it makes sense to spawn or not.
Example:
POC: malebolgia.nim (full file)
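As a user-side sketch of the annotation idea: the proc body and its opUS rating below are hypothetical, and the surrounding calls (createMaster, awaitAll, spawn ... -> dest) follow malebolgia's public API, while the {.opUS.} pragma is the one proposed above and is assumed to be exported by the patched module.

```nim
import malebolgia

# Hypothetical compute kernel, rated at ~100 operations per microsecond:
# each spawn then costs 1000 div 100 = 10 budget units, so roughly 100 of
# these fit into one ThreadBudget window before execution falls back inline.
proc countPrimes(a, b: int): int {.opUS: 100.} =
  for n in a .. b:
    if n > 1:
      var isPrime = true
      for d in 2 .. n div 2:
        if n mod d == 0:
          isPrime = false
          break
      if isPrime: inc result

var results: array[4, int]
var m = createMaster()
m.awaitAll:
  for i in 0 ..< 4:
    # Each spawn first pays the annotated cost against the master's budget.
    m.spawn countPrimes(i * 1000 + 1, (i + 1) * 1000) -> results[i]
```

The point of the annotation is that the scheduling decision moves to the call site's metadata: cheap, high-throughput functions are spawned aggressively, expensive ones sparingly, without the user ever touching ThreadBudget.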