LLNL / SAMRAI

Structured Adaptive Mesh Refinement Application Infrastructure - a scalable C++ framework for block-structured AMR application development
https://computing.llnl.gov/projects/samrai

PersistentOverlapConnectors::findConnector is resorting to a global search to find overlaps #215

Open nicolasaunai opened 1 year ago

nicolasaunai commented 1 year ago

I regularly see this message as the last entry in my logs before simulations appear to crash or stall:

samrai/source/SAMRAI/hier/PersistentOverlapConnectors.cpp line :473 message: PersistentOverlapConnectors::findConnector is resorting
to a global search to find overlaps between 0x4f6a600 and 0x4f6a600.

This relies on unscalable data or triggers unscalable operations.
Number of implicit global searches: 850

I only get this message for simulations running with TileClustering, never with BergerRigoutsos.

Any hint regarding what may be the source of this?

aslangil commented 12 months ago

@nicolasaunai did you find a solution for this? I am having the same issue.

nicolasaunai commented 12 months ago

@aslangil not really. I actually got it again last week. In my case it seems to occur only for simulations that have just a few patches per process. Do you see the error disappear if you use fewer processes OR a larger domain?

aslangil commented 12 months ago

When I use a single node (fewer processors), or multiple nodes but just 2 AMR levels instead of 3 (kind of a larger domain), it disappears. So my experience leads to a conclusion similar to yours.

nicolasaunai commented 11 months ago

I didn't really investigate, but my (limited) understanding is that finding a patch decomposition and distributing it across MPI processes could be too constrained to have a solution when there are too few patches per process, and not all corner cases may be well covered... I only saw these crashes in "test runs" that are abnormally small.
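
One way to check whether a run falls into the "few patches per process" regime discussed in this thread is to compare per-level patch counts with the number of MPI ranks. The following is only a minimal Python sketch, not SAMRAI code: the function names, the `min_per_rank` threshold, and the example counts are all hypothetical, and the real patch counts would have to be taken from your own run's output.

```python
def patches_per_rank(patches_per_level, num_ranks):
    """Average number of patches per MPI rank, one value per AMR level."""
    if num_ranks <= 0:
        raise ValueError("num_ranks must be positive")
    return [count / num_ranks for count in patches_per_level]


def sparse_levels(patches_per_level, num_ranks, min_per_rank=1.0):
    """Indices of levels whose average patches per rank is below the threshold.

    min_per_rank is an arbitrary illustrative threshold, not a SAMRAI setting.
    """
    averages = patches_per_rank(patches_per_level, num_ranks)
    return [level for level, avg in enumerate(averages) if avg < min_per_rank]


if __name__ == "__main__":
    # Hypothetical patch counts for a 3-level hierarchy (coarsest first).
    counts = [64, 16, 8]
    for ranks in (4, 32):
        print(ranks, patches_per_rank(counts, ranks), sparse_levels(counts, ranks))
```

If the list returned by `sparse_levels` is non-empty for the configurations that stall and empty for the ones that run, that would be consistent with the observations above, although it would not by itself explain the global search warning.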