OPM / opm-grid

DUNE module supporting grids in a corner-point format
http://www.opm-project.org
GNU General Public License v3.0

Add fully-interior LGRs on a distributed grid #748

Closed: aritorto closed this PR 2 months ago

aritorto commented 3 months ago

This PR adds support for fully-interior LGRs on a distributed grid.

Once merged, refinement of a distributed grid will be supported in the particular case of each LGR being fully interior to a process. Namely, every cell selected for refinement must be surrounded by other interior cells of that process.

Notice that each process can contain more than one LGR, as long as each LGR is surrounded by other interior cells of the process.

In future PRs, this restriction will be weakened/removed.
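As a rough illustration of the interiority requirement (not the PR's actual code; the helper name is made up), a check along these lines can be written with the generic DUNE grid interface:

```cpp
// Illustrative sketch only: check that a candidate LGR cell and all of its
// face neighbours are interior to this process, using the standard DUNE
// grid interface. The PR's actual implementation may differ.
#include <dune/grid/common/gridenums.hh>

template <class GridView, class Element>
bool isFullyInterior(const GridView& gridView, const Element& element)
{
    if (element.partitionType() != Dune::InteriorEntity)
        return false;
    for (const auto& intersection : intersections(gridView, element)) {
        if (intersection.neighbor() &&
            intersection.outside().partitionType() != Dune::InteriorEntity)
            return false; // a neighbour is overlap/ghost -> not fully interior
    }
    return true;
}
```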

aritorto commented 3 months ago

jenkins build this please

aritorto commented 3 months ago

jenkins build this please

aritorto commented 3 months ago

addLgrsDistGrid.log

I get different output (see the attached log). In any case, I'm setting the PR as draft for now. [Screenshot from 2024-08-08 attached.]

aritorto commented 3 months ago

jenkins build this please

aritorto commented 2 months ago

jenkins build this please

aritorto commented 2 months ago

jenkins build this please

aritorto commented 2 months ago

The last commit was inspired by a suggestion from @blattms on the draft PR.

aritorto commented 2 months ago

jenkins build this please

aritorto commented 2 months ago

I'm marking this PR ready for review, even though I still need to work a bit more on the test file, which currently does not fail when each test case is run on its own ("one test case per run"). (This explains the large commented-out pieces in the test.) It seems it could be improved by selecting to which process each cell should be sent. In the meantime, I'm running Jenkins too.

aritorto commented 2 months ago

jenkins build this please

aritorto commented 2 months ago

jenkins build this please

aritorto commented 2 months ago

Something might need to be fixed: running all the tests defined in addLgrsOnDistrubutedGrid_test.cpp together fails, but running each of them separately does not.

aritorto commented 2 months ago

jenkins build this please

akva2 commented 2 months ago

Did not look at the details, but it sounds like a fixture initializing MPI et al.

aritorto commented 2 months ago

Thanks @akva2, I'm a bit new to MPI, so I'm not 100% sure what you meant. In any case, I also noticed that, when the requirements for refinement are not fulfilled, I should "throw on all processes" (terminate), not only on one of them. I'll fix that soon. Does this happen to be related to your comment?
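For reference, one common way to make such a check collective is to allreduce an error flag before throwing, so every rank throws or none does; this is just a generic MPI sketch, not the PR's actual code:

```cpp
// Illustrative sketch (assumed pattern, not the PR's code): make the
// "requirements not fulfilled" check collective, so that a single rank
// does not throw alone and leave the others hanging in a later
// collective call.
#include <mpi.h>
#include <stdexcept>

void checkRefinementRequirements(bool locallyOk, MPI_Comm comm)
{
    int localFlag = locallyOk ? 0 : 1;
    int globalFlag = 0;
    // Every rank learns whether any rank detected a violation.
    MPI_Allreduce(&localFlag, &globalFlag, 1, MPI_INT, MPI_MAX, comm);
    if (globalFlag != 0)
        throw std::logic_error("LGR refinement requirements not fulfilled on at least one process.");
}
```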

akva2 commented 2 months ago

Not really. What I mean is: if you have a test fixture which calls MPI_Init / MPI_Finalize (directly or indirectly), you are in trouble, because those can only be called once per program invocation. So it would work fine running one test, but since the fixture is constructed/destructed once per test, you'd be in trouble running multiple tests.

aritorto commented 2 months ago

Thanks a lot for the explanation. So the last commits, where I separated the test cases (one per file), might make sense in the end, if I got it correctly. I'll think about it! Thanks again @akva2

akva2 commented 2 months ago

you can see an example of the typical way of solving this in https://github.com/OPM/opm-simulators/blob/master/tests/test_broadcast.cpp

In particular, note the use of BOOST_TEST_NO_MAIN and the implementation of a main method.
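A minimal sketch of that pattern, for orientation (the test module name here is made up; the referenced file is the authoritative example):

```cpp
// Sketch of the pattern from the referenced test, simplified: MPI is set up
// once in a hand-written main(), not in a per-test fixture.
#define BOOST_TEST_MODULE AddLgrsOnDistributedGridTests // illustrative name
#define BOOST_TEST_NO_MAIN
#include <boost/test/unit_test.hpp>
#include <dune/common/parallel/mpihelper.hh>

bool init_unit_test_func()
{
    return true;
}

int main(int argc, char** argv)
{
    // MPIHelper calls MPI_Init once and MPI_Finalize at program exit,
    // so running several BOOST_AUTO_TEST_CASEs in one binary is safe.
    Dune::MPIHelper::instance(argc, argv);
    return boost::unit_test::unit_test_main(&init_unit_test_func, argc, argv);
}
```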

blattms commented 2 months ago

@akva2 We actually found out what the problem was:

It seems like Zoltan has some kind of internal state (maybe a random number generator) and you cannot rely on the partitioning being the same the second time you call it (or if you previously computed another partitioning in addition).

I guess this might even be a feature: If you compute a partitioning and later realize that it was rubbish, you could start over and hope to get a better partitioning by chance. :sweat_smile:

Even for our small example, computing another partitioning beforehand has quite an impact:

Below I print the owned cell indices per rank.

With an additional previous partitioning call to Zoltan:

rank 0: 0 1 4 5 8 9 16 20 21
rank 1: 12 13 17 24 25 28 29 32 33
rank 2: 2 3 6 7 10 11 18 22 23
rank 3: 14 15 19 26 27 30 31 34 35

Without the previous partitioning:

rank 0: 14 15 18 19 26 27 30 31 35
rank 1: 2 3 6 7 10 11 22 23 34
rank 2: 12 13 16 17 24 25 28 29 33
rank 3: 0 1 4 5 8 9 20 21 32

Hence we need to force our partitions in this case to make this stable.
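One way to force the partition in a test is to hand the load balancer a fixed cell-to-rank vector instead of letting Zoltan choose. The sketch below is only an illustration; in particular, the loadBalance overload taking such a vector is an assumption that would need to be checked against the CpGrid interface:

```cpp
// Illustrative sketch only: build a deterministic cell-to-rank assignment
// instead of relying on Zoltan's (non-deterministic) output. Any fixed,
// reproducible rule works; the modulo rule below is just an example.
#include <vector>

std::vector<int> makeFixedPartition(int numCells, int numRanks)
{
    std::vector<int> cellToRank(numCells);
    for (int cell = 0; cell < numCells; ++cell)
        cellToRank[cell] = cell % numRanks;
    return cellToRank;
}

// Hypothetical usage in the test (assumed overload, verify against opm-grid):
//   const auto parts = makeFixedPartition(grid.size(0), mpiHelper.size());
//   grid.loadBalance(parts);
```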

akva2 commented 2 months ago

Ah, that old fun. There's a PHG_RANDOMIZE_INPUT parameter that might help. It could also be LB_APPROACH not being set to PARTITION.
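For reference, with the raw Zoltan C API those parameters would be set roughly as below; whether and how opm-grid forwards such settings to its internal Zoltan instance is not shown here and would need checking:

```cpp
// Sketch using the plain Zoltan C API; not opm-grid code.
#include <zoltan.h>

void configureDeterministicPartitioning(struct Zoltan_Struct* zz)
{
    // Disable the randomized input permutation of the PHG partitioner.
    Zoltan_Set_Param(zz, "PHG_RANDOMIZE_INPUT", "0");
    // Ask for a fresh partition rather than a repartitioning of a previous one.
    Zoltan_Set_Param(zz, "LB_APPROACH", "PARTITION");
}
```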

blattms commented 2 months ago

Cool, thanks. The result may still differ between different versions, etc.

aritorto commented 2 months ago

Thanks @akva2 and @blattms for your feedback. The PR is ready for review, after a Jenkins check.

aritorto commented 2 months ago

jenkins build this please

blattms commented 2 months ago

jenkins build this serial please

aritorto commented 2 months ago

jenkins build this serial please

aritorto commented 2 months ago

jenkins build this serial please

aritorto commented 2 months ago

jenkins build this serial please

aritorto commented 2 months ago

jenkins build this serial please

aritorto commented 2 months ago

jenkins build this serial please

blattms commented 2 months ago

Thanks a lot for all the work. I am ready to merge this. Should we maybe reduce the massive number of commits by rewriting history first?

aritorto commented 2 months ago

jenkins build this serial please