Open abouteiller opened 9 months ago
Results still incorrect after #133 😠
PMIX_MCA_psec='' SLURM_TIMELIMIT=1 PARSEC_MCA_device_cuda_enabled=1 PARSEC_MCA_device_cuda_memory_use=10 OMPI_MCA_rmaps_base_oversubscribe=true salloc -wleconte -n 8 --gpus-per-task=1 /usr/bin/srun "-n" "4" "tests/testing_spotrf_dtd" "-N" "378" "-t" "19" "-x" "-v=5"
Describe the bug
MPI POTRF DTD with 1 GPU produces wrong results. The 1-node variant is correct.
It is not clear at the moment whether the issue is in the DTD testers for GEMM and POTRF or in PaRSEC itself.
Important note
After #114, this error no longer manifests in normal ctest/CI runs (because the test is forced to run on CPU only), but it can still be reproduced by hand. The fix PR should add a dedicated DTD+GPU test that explicitly covers this case.
Buggy output
Setup