The weak scaling test in benchmarks can now be run automatically (tested on Archer2) using ReFrame and Spack. The Spack package has been upstreamed to Spack; the ReFrame test is currently in a branch of excalibur-tests that I'll try to merge soon.
Why are we not seeing much performance degradation when going from one full node to two or more full nodes? Performance actually improves from 256 cores (2 nodes) to 512 cores (4 nodes).
Why does thread scaling get so much worse going to 4 and 8 threads?
Why are we not seeing much performance degradation from filling up a node with MPI ranks?
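To put a number on the "degradation" these questions refer to, one option is the usual weak-scaling efficiency: the runtime at the smallest core count divided by the runtime at each larger core count, with work per core held fixed. The following is only a minimal sketch; the function name `weak_scaling_efficiency` and the timings in the example are hypothetical placeholders, not part of the benchmark or its measured results.

```python
def weak_scaling_efficiency(timings):
    """Map each core count to T_base / T_cores.

    `timings` maps core count -> wall-clock time in seconds, measured with
    the work per core held constant. The smallest core count is the baseline.
    An efficiency of 1.0 means no degradation; values above 1.0 mean the
    larger run was faster than the baseline.
    """
    base_time = timings[min(timings)]
    return {cores: base_time / t for cores, t in sorted(timings.items())}


if __name__ == "__main__":
    # Hypothetical placeholder timings, for illustration only (not measured).
    example = {1: 2.0, 2: 2.5, 4: 4.0}
    for cores, eff in weak_scaling_efficiency(example).items():
        print(f"{cores:>4} cores: efficiency {eff:.2f}")
```

Feeding the per-run timings from the ReFrame output into a helper like this would make it easier to compare the node-filling, multi-node, and thread-scaling cases on the same scale.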
To run the weak scaling test (on Archer2), follow the setup instructions on excalibur-tests, and launch the test with reframe using
Current result is below.
Questions raised by the results