Changes many tasks to use the thread_stacksize::nostack stack size (i.e. no separate stack, and therefore no context switching or yielding). I've manually opted tasks in to this rather than making it the default: the worst case of applying nostack to a task that shouldn't have it is a deadlock, whereas the worst case of giving a stack to a task that doesn't need one is only bad performance. Tasks that must not use nostack are those that might yield, i.e. anything that calls barrier::arrive, semaphore::acquire, sync_wait, etc.
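
As a rough illustration of the opt-in rule (not the code in this PR), the sketch below models the decision with stand-in types. `stack_kind`, `task_info`, `choose_stack` and the task names are all hypothetical and exist only for this example; the real change sets thread_stacksize::nostack at each task's spawn site.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the library's stack-size setting; only the two
// cases relevant to this sketch are modelled.
enum class stack_kind { with_stack, nostack };

// Hypothetical task description: a name plus whether the task body may yield
// (e.g. because it calls barrier::arrive, semaphore::acquire or sync_wait).
struct task_info {
    std::string name;
    bool may_yield;
};

// Opt-in policy: a task gets nostack only when it is known never to yield.
// Anything that may yield keeps a stack, because a stackless task cannot be
// suspended and could deadlock; a needless stack only costs performance.
inline stack_kind choose_stack(const task_info& t) {
    return t.may_yield ? stack_kind::with_stack : stack_kind::nostack;
}
```

For example, a pure compute task such as a hypothetical `{"gemm_tile", false}` would be assigned `stack_kind::nostack`, while `{"wait_on_barrier", true}` keeps its stack. Keeping a stack is the safe fallback whenever it is unclear whether a task can yield.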
Full benchmarking results are available here: https://confluence.cscs.ch/display/SCISWDEV/2023-11+stackless+threads. GEVP results on LUMI MC and Piz Daint MC:
LUMI:
Daint: