starpu-runtime / starpu

This is a mirror of https://gitlab.inria.fr/starpu/starpu where our development happens, but contributions are welcome here too!
https://starpu.gitlabpages.inria.fr/
GNU Lesser General Public License v2.1

Scalability of StarPU-MPI for LU decomposition #19

Closed by WwwwwYyyy 1 year ago

WwwwwYyyy commented 1 year ago

Hello Professor,

Here are the commands that we used:

export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export STARPU_NCPU=32
export STARPU_WORKERS_GETBIND=1
mpirun --bind-to socket -n 2 ./plu_example_double 8 -size 16384 -nblocks 16 -p 2 -q 1

(The first three commands are meant to lower the number of cores used by each MPI process.)

However, the scalability still did not improve, either on a laptop or on a dual-node machine. The program run with 2 or 4 processes is still slower than with 1 process. Is there anything wrong with the commands? Looking forward to your reply.
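
For reference, the slowdown described above can be expressed as speedup and parallel efficiency relative to the 1-process run. The following is only a minimal sketch in Python: the function name `speedup_and_efficiency` is illustrative, and the wall-clock times are hypothetical placeholders, not measurements from this issue.

```python
def speedup_and_efficiency(t_one_proc, t_n_procs, n_procs):
    """Return (speedup, parallel efficiency) relative to the 1-process run.

    speedup    = T(1) / T(n)
    efficiency = speedup / n
    A speedup below 1.0 means the n-process run is slower than 1 process.
    """
    speedup = t_one_proc / t_n_procs
    efficiency = speedup / n_procs
    return speedup, efficiency


if __name__ == "__main__":
    # HYPOTHETICAL wall-clock times in seconds, keyed by MPI process count.
    # Replace these placeholders with your own measurements.
    timings = {1: 10.0, 2: 12.5, 4: 16.0}

    for n_procs, elapsed in sorted(timings.items()):
        s, e = speedup_and_efficiency(timings[1], elapsed, n_procs)
        print(f"{n_procs} process(es): speedup={s:.2f}, efficiency={e:.2f}")
```

Reporting these two numbers for each process count, together with the matrix size and block count, makes it easier to tell a genuine scaling problem from a measurement or binding problem.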

nfurmento commented 1 year ago

Is this related to issue #14? If so, please close this new issue and copy your comments into the previous one.

Also, have you read https://files.inria.fr/starpu/doc/html/OfflinePerformanceTools.html#Off-linePerformanceFeedback to find out how to analyze the performance of a program?