PaRSEC is a generic framework for architecture-aware scheduling and management of micro-tasks on distributed, GPU-accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to cores and GPU accelerators, overlaps communication with computation, and uses a dynamic, fully distributed scheduler based on architectural features such as NUMA nodes and algorithmic features such as data reuse.
Using PaRSEC commit 7f81a1b (2023-01-19) and DPLASMA commit 75012ef3f (2023-01-23), the m, n, and k values are missing from the trace of a GEMM when it is run through the DTD interface.
To Reproduce
Add the --force-profile flag to PARSEC_PTGPP_FLAGS in parsec/CMakeLists.txt
../dplasma/configure --with-hwloc --with-mpi --with-blas=Intel10_64lp_seq --disable-debug -DPARSEC_PROF_TRACE=ON --prefix=$PWD/install
./testing_dgemm_dtd -N 3000