PaRSEC is a generic framework for architecture-aware scheduling and management of micro-tasks on distributed, GPU-accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to the cores and GPU accelerators, overlaps communications and computations, and uses a dynamic, fully distributed scheduler based on architectural features such as NUMA nodes and algorithmic features such as data reuse.
Make sure the PaRSEC context is started only after the taskpool is created, because the taskpool registers active message handlers for the first time via the user-triggered option in PTG.
Correctly compute the number of local tasks instead of assuming a run at a given world size; compute that number on all ranks and check that every rank has the correct value.
Have all MPI ranks return the same return code for consistency.
Free the taskpool once it is no longer used, to eliminate an assertion failure.
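The local task count change above can be illustrated with a minimal sketch. It assumes a cyclic distribution (task i runs on rank i % world_size), which may not match the layout the test actually uses, and the helper names `local_task_count` and `check_task_counts` are hypothetical, not part of PaRSEC. The loop over ranks stands in for the reduction a real MPI run would perform.

```c
#include <assert.h>

/* Hypothetical helper: number of tasks owned by `rank` when `total_tasks`
 * tasks are dealt out cyclically (task i runs on rank i % world_size).
 * The cyclic layout is an assumption made for this sketch. */
static long local_task_count(long total_tasks, int rank, int world_size)
{
    assert(world_size > 0 && rank >= 0 && rank < world_size);
    long base  = total_tasks / world_size;               /* tasks every rank gets */
    long extra = (rank < total_tasks % world_size) ? 1 : 0; /* leftover tasks go to the lowest ranks */
    return base + extra;
}

/* Returns 0 when the per-rank counts add up to total_tasks, 1 otherwise.
 * In a distributed run the sum would come from a reduction across ranks;
 * here a plain loop plays that role so the sketch needs no MPI. */
static int check_task_counts(long total_tasks, int world_size)
{
    long sum = 0;
    for (int r = 0; r < world_size; r++)
        sum += local_task_count(total_tasks, r, world_size);
    return sum == total_tasks ? 0 : 1;
}
```

With 10 tasks on 4 ranks, this gives 3 tasks to ranks 0 and 1 and 2 tasks to ranks 2 and 3, so the count is derived from the actual world size rather than hard-coded for one configuration.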
This breaks existing capabilities. Termination detection should work in all cases, not only when the context is started after the taskpool is created.