Weak scaling instrumentation file

fdrmrc / Polydeal

C++ implementation of Polygonal Discontinuous Galerkin method within the deal.II Finite Element library.

https://fdrmrc.github.io/Polydeal/


Weak scaling instrumentation file #125

Open fdrmrc opened 3 months ago

fdrmrc commented 3 months ago

This PR adds an instrumentation file to show weak scaling. It solves a 3D elliptic problem with polynomial degree $p=2$ while simultaneously increasing:

- the size of the underlying grid, by one round of global refinement
- the number of agglomerates, by a factor of $8$ ($2^d$, $d=3$).
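As a rough illustration of the growth described above, here is a minimal sketch of how the counts evolve from one stage to the next. The helper name `stage_sizes` and its parameters are hypothetical and are not part of Polydeal or deal.II; the only assumption taken from the description is that each stage multiplies the count by $2^d$.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Polydeal or deal.II function).
// Returns the number of cells or agglomerates at each stage, assuming
// every stage multiplies the previous count by 2^dim.
std::vector<std::uint64_t> stage_sizes(const std::uint64_t initial_count,
                                       const unsigned int  dim,
                                       const unsigned int  n_stages)
{
  assert(dim >= 1 && dim <= 3);
  const std::uint64_t factor = std::uint64_t{1} << dim; // 2^dim

  std::vector<std::uint64_t> sizes;
  sizes.reserve(n_stages);

  std::uint64_t count = initial_count;
  for (unsigned int s = 0; s < n_stages; ++s)
    {
      sizes.push_back(count); // count at stage s
      count *= factor;        // next stage is 2^dim times larger
    }
  return sizes;
}
```

For example, `stage_sizes(8, 3, 4)` yields the counts 8, 64, 512, 4096 for $d=3$.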

Runs with 256 processors on CINECA are as follows. Data points to the right of the red vertical line correspond to more than $10^5$ DoFs per processor, and they reproduce the classical trend reported in the "Distributed Computing paper", Algorithms and Data Structures for Massively Parallel Generic Adaptive Finite Element Codes (fig. 10), for standard shapes. @luca-heltai

[Figure: weak_scaling]

luca-heltai commented 3 months ago

This is very nice, but it is actually a strong scaling example... :D

To do a weak scaling plot, you also have to increase the number of processors when you increase the degrees of freedom, i.e., run an 8 times bigger problem on 8 times more processors. If you ran this same test on 8, 64, 512, and 4096 procs, and put all of the results (for each stage) on the same plot, you'd get both strong and weak scaling.
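To make the distinction concrete, here is a minimal sketch of the two efficiency measures as they are commonly defined. The function names and the timing arguments are hypothetical; they do not come from the instrumentation file in this PR.

```cpp
#include <cassert>

// Hypothetical helpers, not part of Polydeal or deal.II.

// Strong scaling: the problem size is fixed and only the number of
// processors changes from p_ref to p. Ideal efficiency is 1, meaning
// the wall time shrinks in proportion to the processor count.
double strong_efficiency(const double       t_ref,
                         const unsigned int p_ref,
                         const double       t,
                         const unsigned int p)
{
  assert(t > 0.0 && p > 0);
  return (t_ref * p_ref) / (t * p);
}

// Weak scaling: problem size and processor count grow by the same
// factor, so the work per processor is constant. Ideal efficiency is 1,
// meaning the wall time stays flat.
double weak_efficiency(const double t_ref, const double t)
{
  assert(t > 0.0);
  return t_ref / t;
}
```

With made-up timings, `strong_efficiency(80.0, 8, 10.0, 64)` is 1 (perfect strong scaling from 8 to 64 procs), while `weak_efficiency(10.0, 12.5)` is about 0.8 (the 8 times bigger problem on 8 times more procs takes 25% longer).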

fdrmrc commented 3 months ago

Sure. Let me be more precise here: I was mimicking what was done in the paper (see the next figure, on the left) by fixing the number of processors and increasing the problem size incrementally, as done here. We already had a strong scaling example (such as the one on the right, with a fixed problem size), and I have not reported that in this PR.

fdrmrc commented 3 months ago

> If you ran this same test on 8, 64, 512, and 4096 procs, and put all of the results (for each stage) on the same plot, you'd get both strong and weak scaling.

The classical weak scaling plot (with 8, 64, and 512 procs) looks like this. All the new components (i.e., everything besides the AMG preconditioner) scale weakly.

[image]