Closed dmargala closed 3 years ago
It matches desi_proc, but is that necessarily optimal for gpu_specter? desi_proc/specter had some constraints of 1 rank = 1 bundle for historical reasons, which gpu_specter has the opportunity to differently if another split is more optimal. Or was the cpu path using a split that was optimal for GPU but sub-optimal for CPU, and making it match desi_spec improves the cpu path?
i.e. I'm fine with this PR if it makes things better for the cpu-path, but we shouldn't feel constrained to use the exact same rank bundling as desi_spec/specter if there is something better, e.g. to give more flexibility for packing work onto a node.
Sorry for the confusion, this PR only updates the comm splitting logic when using specter
, the comm splitting logic for cpu and gpu code paths of gpu_specter
remains the same.
Before this PR, running the benchmark script with specter
would use the same comm splitting logic as gpu_specter
.
Got it, thanks for the clarification. Merging now.
This PR updates the MPI comm splitting logic when using cpu specter in the benchmark script to match desi_proc.
I also moved a noisy log message to
debug
instead ofinfo
and "fixed" a deprecation warning withnp.int
->int
.