code-saturne / code_saturne

code_saturne public mirror
https://www.code-saturne.org
GNU General Public License v2.0

Parallel partitioning of small case fails with PT-Scotch #6

Closed mathrack closed 6 years ago

mathrack commented 6 years ago

Dear Users, Developers,

Partitioning of a small case (64 cells) fails in parallel when the number of processors is too high (it works on 10 CPUs but not on 20). I suspect the issue is related to PT-Scotch (6.0.4 used here), but I cannot verify this as I do not have any other parallel partitioning tool at hand.

Code_Saturne rev. 11067

case_8x8.tar.gz

YvanFournier commented 6 years ago

Hello,

You can always use the "built-in" partitioning, based on space-filling curves (check in "Calculation management/Performance tuning/Partitioning" in the GUI).

Using PT-Scotch on your case, I also get the crash at 20 ranks. With a rank step of 2, partitioning with PT-Scotch works (i.e. PT-Scotch builds a partition for 20 ranks, but runs on only 1 rank out of 2, so 10 ranks, which probably allows it to work with larger local graph subsets).
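To make the rank step arithmetic concrete, here is a minimal sketch. It assumes that the ranks taking part in partitioning are simply every `rank_step`-th rank starting from 0; the helper names are hypothetical and this is not code_saturne's actual implementation.

```python
def partitioning_ranks(n_ranks, rank_step):
    """Ranks assumed to take part in partitioning: every rank_step-th rank.

    Assumption for illustration only; the real selection may differ.
    """
    return list(range(0, n_ranks, rank_step))


def cells_per_active_rank(n_cells, n_ranks, rank_step):
    """Average number of cells held by each rank that runs the partitioner."""
    return n_cells / len(partitioning_ranks(n_ranks, rank_step))


n_cells = 64  # size of the case in this report
for step in (1, 2):
    active = partitioning_ranks(20, step)
    # Print: rank step, number of active ranks, average cells per active rank
    print(step, len(active), cells_per_active_rank(n_cells, 20, step))
```

With a rank step of 2 on 20 ranks, the graph is distributed over 10 ranks, the same local graph size as the 10-rank run that works.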

ParMetis is usually not quite as good at load balancing, and fails for this case on 20 ranks even when using rank steps of 2 to 20 (i.e. partitioning on a single rank also fails), while it also works for 10 ranks.

The built-in partitioner (I tried Morton) works all the way up to 1 cell/rank (i.e. even for 64 MPI ranks).
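As an illustration of why a space-filling curve can go all the way down to 1 cell/rank, here is a minimal sketch of Morton-order partitioning. It assumes a 2D 8x8 structured grid (a guess based on the case name), and splits the sorted cells into contiguous chunks; the function names are hypothetical and this is not the built-in partitioner's code.

```python
from collections import Counter


def morton_code(x, y, bits=3):
    """Interleave the bits of x and y (x in even positions, y in odd ones)."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code


def sfc_partition(nx, ny, n_ranks, bits=3):
    """Map each cell (x, y) to a rank by cutting the Morton curve into chunks."""
    cells = sorted(
        ((x, y) for x in range(nx) for y in range(ny)),
        key=lambda c: morton_code(c[0], c[1], bits),
    )
    n_cells = len(cells)
    # The i-th cell along the curve goes to rank floor(i * n_ranks / n_cells),
    # so no rank is left empty as long as n_ranks <= n_cells.
    return {cell: i * n_ranks // n_cells for i, cell in enumerate(cells)}


for n_ranks in (10, 20, 64):
    counts = Counter(sfc_partition(8, 8, n_ranks).values())
    # Print: number of ranks, smallest and largest cell count per rank
    print(n_ranks, min(counts.values()), max(counts.values()))
```

Because the curve is cut purely by position in the sorted order, every rank receives at least one cell whenever there are at least as many cells as ranks, which a graph partitioner does not necessarily guarantee.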

This behavior has already been observed, so a safer default might be to switch to an SFC (space-filling curve, Morton or Hilbert here) partitioning when the cells/rank ratio becomes too small. But since such a ratio is useful only for debugging, not for performance, switching automatically would probably make testing trickier.
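Such a fallback could look like the sketch below. The function name and the threshold of 5 cells/rank are hypothetical: the threshold is chosen only because it separates the two runs reported here (6.4 cells/rank works, 3.2 cells/rank fails), not because it is a known safe limit.

```python
def choose_partitioner(n_cells, n_ranks, min_cells_per_rank=5.0):
    """Pick a partitioner family from the cells/rank ratio.

    Returns "sfc" (space-filling curve) when the ratio is below the
    hypothetical threshold, and "graph" (PT-Scotch, ParMetis) otherwise.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    ratio = n_cells / n_ranks
    return "sfc" if ratio < min_cells_per_rank else "graph"


for n_ranks in (10, 20, 64):
    # Print: number of ranks and the partitioner family that would be chosen
    print(n_ranks, choose_partitioner(64, n_ranks))
```

The drawback is the one already noted: with a silent switch, a test run at a small cells/rank ratio would no longer exercise the graph partitioner that was requested.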

So I guess I'll simply close this bug.