Use stackless threads where appropriate

msimberg commented 11 months ago

Changes many tasks to use the thread_stacksize::nostack stack size, i.e. the task runs without a separate stack and therefore cannot context switch or yield. I've manually opted tasks in to this rather than making it the default because the two failure modes are not symmetric: applying nostack to a task that shouldn't have it can lead to deadlocks, whereas giving a stack to a task that doesn't need one only costs performance. Any task that might yield must not use nostack, which means anything that calls barrier::arrive, semaphore::acquire, sync_wait, etc.
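To make the opt-in reasoning concrete, here is a minimal sketch of the trade-off described above. All names in it (StackChoice, WorstCase, TaskConfig, worst_case, opt_in_nostack_if_safe) are hypothetical and exist only for illustration; they are not pika's or DLA-Future's API, and the actual changes in this PR set the stack size on individual tasks by hand.

```cpp
#include <cassert>

// Hypothetical stand-in for the two stack choices discussed above.
// This is NOT pika's thread_stacksize enum, only an illustration.
enum class StackChoice { with_stack, nostack };

// Worst outcome of a given (stack choice, task behaviour) combination.
enum class WorstCase { none, slower_than_needed, possible_deadlock };

// Hypothetical per-task description. The defaults are the conservative
// ones: assume the task may yield and give it a stack.
struct TaskConfig {
    bool may_yield = true;  // e.g. calls barrier::arrive, semaphore::acquire, sync_wait
    StackChoice stack = StackChoice::with_stack;
};

// Encodes the asymmetry from the description: a wrong nostack can
// deadlock, an unnecessary stack only costs performance.
constexpr WorstCase worst_case(TaskConfig t) {
    if (t.stack == StackChoice::nostack && t.may_yield)
        return WorstCase::possible_deadlock;
    if (t.stack == StackChoice::with_stack && !t.may_yield)
        return WorstCase::slower_than_needed;
    return WorstCase::none;
}

// Opt a task in to nostack only when it is known not to yield;
// everything else keeps its stack.
constexpr TaskConfig opt_in_nostack_if_safe(TaskConfig t) {
    if (!t.may_yield)
        t.stack = StackChoice::nostack;
    return t;
}
```

With these definitions, a default-constructed TaskConfig keeps its stack, and only a task explicitly marked as non-yielding is switched to nostack, so a task that has not been audited can never end up in the deadlock case.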

Full benchmarking results are available here: https://confluence.cscs.ch/display/SCISWDEV/2023-11+stackless+threads. GEVP results on LUMI MC and Piz Daint MC:

LUMI: gevp_strong_time_20480_lumi_mc_nostack

Daint: gevp_strong_time_20480

msimberg commented 11 months ago

cscs-ci run

eth-cscs / DLA-Future

Use stackless threads where appropriate #1037