We ran a simple hello-world MPICH program in which each rank prints its rank ID and the hostname it is running on.
The program itself allocates no memory; all of the memory allocation comes from whatever MPICH is doing.
We scaled the run from 32 nodes to 768 nodes and measured how much memory was consumed.
The MPICH commit tag is 204f8cd.
This was observed on Aurora.
Memory consumption is the same whether using DDR or HBM. The data below was measured on DDR.
| Node count | Max DDR utilization (GB), mpich/opt/develop-git.204f8cd |
| -- | -- |
| 32 | 22.53 |
| 64 | 24.58 |
| 128 | 28.16 |
| 256 | 35.33 |
| 512 | 50.18 |
| 768 | 68.10 |
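
To make the scaling trend in the table easier to read, here is a rough sketch that computes how much the max DDR utilization grows per additional node between consecutive measurements. The numbers are copied from the table; the helper name `growth_per_node` and the conversion of 1 GB = 1000 MB are assumptions made only for this illustration.

```python
# Measurements copied from the table: (node count, max DDR utilization in GB).
measurements = [
    (32, 22.53),
    (64, 24.58),
    (128, 28.16),
    (256, 35.33),
    (512, 50.18),
    (768, 68.10),
]


def growth_per_node(points):
    """Return (start_nodes, end_nodes, GB per added node) for each adjacent pair.

    Hypothetical helper for this illustration; not part of MPICH.
    """
    return [
        (n0, n1, (gb1 - gb0) / (n1 - n0))
        for (n0, gb0), (n1, gb1) in zip(points, points[1:])
    ]


segments = growth_per_node(measurements)
for n0, n1, slope in segments:
    # Assumes 1 GB = 1000 MB for display purposes.
    print(f"{n0:>3} -> {n1:>3} nodes: {slope * 1000:.1f} MB per added node")

# Average slope over the whole measured range (32 to 768 nodes).
(first_n, first_gb), (last_n, last_gb) = measurements[0], measurements[-1]
overall = (last_gb - first_gb) / (last_n - first_n)
print(f"overall: {overall * 1000:.1f} MB per added node")
```

On these numbers the increment stays at several tens of MB per added node across the whole range, so the growth looks roughly linear in node count rather than flat.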