jsitaraman / tioga

Tioga is a library for overset grid assembly on parallel distributed systems
GNU Lesser General Public License v3.0

DCI dependent on number of processes #46

Open mennodeij opened 3 years ago

mennodeij commented 3 years ago

I have created a relatively simple 2D test case of two overset grids. The case can be found on the branch https://github.com/mennodeij/tioga/tree/grid2d_multi_process, in driver/grid2d_dist.F90.

When I run this case (with the parameter 'grouped' set to .TRUE.) for 1-20 processes, I get the domain connectivity information (DCI) outcome shown in the animation below. In the top half, the colors indicate which process owns which cells; the bottom half shows the DCI using the IBLANK vertex array. The issue is also apparent with 'grouped' set to .FALSE.

Is this expected behavior? If not, am I doing something wrong in my use of TIOGA? I use it such that each process has a part of both grids; maybe that is not correct? (See the sketch below for what I mean by the decomposition.)
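For reference, the decomposition is along these lines. This is a simplified sketch rather than the actual grid2d_dist.F90 code, and the cell counts nA/nB are purely illustrative:

```fortran
! Simplified sketch (not the actual grid2d_dist.F90 code): every rank gets
! a contiguous strip of cells from BOTH overset grids, so no rank owns a
! whole grid on its own.
program decompose_both_grids
  use mpi
  implicit none
  integer :: ierr, rank, nprocs
  integer :: nA, nB              ! illustrative global cell counts of the two grids
  integer :: iaS, iaE, ibS, ibE  ! this rank's cell ranges in grid A and grid B

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  nA = 4000; nB = 1000
  ! 1-D block decomposition of each grid over all ranks
  iaS = rank*nA/nprocs + 1;  iaE = (rank+1)*nA/nprocs
  ibS = rank*nB/nprocs + 1;  ibE = (rank+1)*nB/nprocs
  print '(a,i3,4(a,i6))', 'rank ', rank, ': grid A cells ', iaS, ' to ', iaE, &
        ', grid B cells ', ibS, ' to ', ibE
  call MPI_Finalize(ierr)
end program decompose_both_grids
```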

Please note that I have made some changes to the file output routines in MeshBlock to make it easier to compare outcomes for different numbers of processes. The animation was made in ParaView 5.8.

[animation: numprocs] https://user-images.githubusercontent.com/1713187/97282790-64ea1200-183f-11eb-946a-948142b4619e.gif

jsitaraman commented 3 years ago

Usually there should be no, or only minimal, changes in the connectivity pattern with the number of processors. Have you tried TIOGA's standard test, a 3-D problem of a sphere in a box, executed through ./test.sh? You can change the argument to run.sh to any number of cores. If I recall correctly, this test showed no changes with partitioning. I think that's because the sphere is a strand-type grid and doesn't get partitioned in the wall-normal direction, so all painting algorithms work fine.

In your case, I think the mexclude used on the foreground mesh is the problem. Since mexclude uses a painting strategy, its result appears to differ depending on where the partition boundaries fall. This is somewhat expected because TIOGA doesn't paint across partition boundaries, i.e. it has only a local view of the data and doesn't run a global painting algorithm. The same holds true for reduce_fringes, if you are using it, since it again uses local painting to satisfy the fringe depth. For instance, I see that numprocs <= 2 satisfies mexclude of 3, but beyond that it does not. Perhaps you can try mexclude=0 and see what happens. That will create minimal overlap, but I would expect the pattern not to change then.
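Here is a toy 1-D illustration of what I mean. It is not TIOGA's actual painting code, just a sketch of how painting that cannot cross a partition boundary misses layers that a global sweep would have painted:

```fortran
! Toy sketch (not TIOGA code): paint mexclude=3 layers outward from one
! seed cell on a row of 20 cells, once with a global sweep and once with
! painting confined to two partitions split after cell 10.
program local_vs_global_painting
  implicit none
  integer, parameter :: n = 20, mexclude = 3, seed = 9, cut = 10
  integer :: paint_global(n), paint_local(n), i, layer

  paint_global = 0; paint_local = 0
  paint_global(seed) = 1; paint_local(seed) = 1

  do layer = 1, mexclude
     do i = 2, n-1
        ! global painting: a painted cell spreads to both neighbours
        if (paint_global(i) == layer) then
           if (paint_global(i-1) == 0) paint_global(i-1) = layer + 1
           if (paint_global(i+1) == 0) paint_global(i+1) = layer + 1
        end if
        ! local painting: spreading stops at the partition boundary
        if (paint_local(i) == layer) then
           if (same_part(i, i-1) .and. paint_local(i-1) == 0) paint_local(i-1) = layer + 1
           if (same_part(i, i+1) .and. paint_local(i+1) == 0) paint_local(i+1) = layer + 1
        end if
     end do
  end do

  print '(a,20i3)', 'global:', paint_global   ! cells 6..12 painted
  print '(a,20i3)', 'local :', paint_local    ! cells 11..12 missed

contains
  logical function same_part(i, j)
    integer, intent(in) :: i, j
    same_part = ((i <= cut) .eqv. (j <= cut))
  end function same_part
end program local_vs_global_painting
```

With the cut after cell 10, the local sweep paints only cells 6-10 while the global sweep paints 6-12, which is the same kind of partition-dependent mexclude shortfall you are seeing.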

It's an interesting find anyway. I have to look into fixing it more cleanly. Thank you.


mennodeij commented 3 years ago

Thanks for explaining that the painting algorithms are local-only; it makes sense that there are differences depending on the number of processes. I suppose that for larger cases, where the number of cells per process is much greater than in these test cases, the differences become smaller, as they only affect the process boundaries.
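A quick back-of-the-envelope check of that scaling, assuming roughly square partitions where only the cells along partition edges can be painted differently:

```fortran
! For a square-ish partition of n cells, roughly 4*sqrt(n) cells sit on its
! edges, so the fraction of cells that can differ shrinks like 4/sqrt(n).
program boundary_fraction
  implicit none
  integer :: k, n
  do k = 2, 6
     n = 10**k   ! cells per process
     print '(a,i8,a,f8.4)', 'cells/rank =', n, '  edge fraction ~', 4.0/sqrt(real(n))
  end do
end program boundary_fraction
```

So at a million cells per process only about 0.4% of the cells sit next to a partition boundary, against 40% at 100 cells per process.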

I have tried my test case with mexclude set to zero, and there are indeed far fewer changes in the grid2d_dist test case.

I have also taken a quick look at the test.sh case you mentioned: there are subtle differences in the sphere-in-cube test case as well. When I compare runs with 4, 6, 9, 12 and 16 processes (which all have exactly the same number of cells in the background cube grid), the number of FRINGE cells differs in each case, on the order of 10-20 cells out of roughly 7,000 FRINGE cells. There are similar changes in the vertices, i.e. between 6,400 and 6,420 vertices having FRINGE status.
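For completeness, I tallied the counts above along these lines; this assumes TIOGA's usual IBLANK convention (1 = field, -1 = fringe, 0 = hole), and the routine name is just illustrative:

```fortran
! Illustrative tally of IBLANK statuses over this rank's n nodes, assuming
! the convention 1 = field, -1 = fringe, 0 = hole.
subroutine count_iblank(iblank, n, nfield, nfringe, nhole)
  implicit none
  integer, intent(in)  :: n, iblank(n)
  integer, intent(out) :: nfield, nfringe, nhole
  nfield  = count(iblank ==  1)
  nfringe = count(iblank == -1)
  nhole   = count(iblank ==  0)
end subroutine count_iblank
```

(The per-rank counts still need an MPI_Allreduce to get the global totals I quoted.)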