SCOREC / core

parallel finite element unstructured meshes

Empty parts #198

Open KennethEJansen opened 5 years ago

KennethEJansen commented 5 years ago

APF warning: 3 empty parts

This issue has always puzzled me. Is SCOREC/core designed to function properly when parts get emptied? In my experience this warning always leads to a crash somewhere: if not in SCOREC/core, then in the application, since I don't know of many applications that are designed to work properly in this pathological limit.

I guess I am wondering how hard it would be to put a check in the code, at the locations that migrate or collapse, that determines whether the operation would empty the part. If it would, don't do it. In the current case it looks like this is happening in coarsening. I understand the suggestion in this case is to raise adaptShrunken. I did this (raising it from 20000 to 40000) and it worked, but it also raised the coarsening time from 452 to 652 seconds because the adapt was happening on 2k procs instead of 4k procs. That is roughly a 44% slowdown and perhaps not the best way to get around the issue. It is somewhat surprising that 20k elements per part is not heavy enough to dodge this problem (e.g. a part comes into coarsening with 20k elements and they all get coarsened away in the first iteration of adapt??? AFAIK the algorithm is supposed to coarsen by a factor of 2 per iteration, so this seems impossible).
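To make the proposed check concrete, here is a minimal sketch of a guard that works on element counts only. It is an illustration of the idea, not SCOREC/core code: `PlannedChange` and `keepsPartsPopulated` are hypothetical names, and the sketch assumes the per-part counts are already gathered in one place (in a real parallel run each part would only know its own count and the result would need a global reduction).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one planned operation on a part.
// This is not an APF or MeshAdapt type.
struct PlannedChange {
  std::size_t part;     // part the change applies to
  std::size_t removed;  // elements leaving (migrated away or collapsed)
  std::size_t added;    // elements arriving or created
};

// Returns true if applying all `changes` leaves every part with at
// least `minElements` elements, so the caller can skip the operation
// when the answer is false.
bool keepsPartsPopulated(const std::vector<std::size_t>& elementsPerPart,
                         const std::vector<PlannedChange>& changes,
                         std::size_t minElements = 1) {
  std::vector<std::size_t> removed(elementsPerPart.size(), 0);
  std::vector<std::size_t> added(elementsPerPart.size(), 0);
  for (const PlannedChange& c : changes) {
    assert(c.part < elementsPerPart.size());
    removed[c.part] += c.removed;
    added[c.part] += c.added;
  }
  for (std::size_t p = 0; p < elementsPerPart.size(); ++p) {
    // A part cannot lose more elements than it holds after arrivals.
    assert(removed[p] <= elementsPerPart[p] + added[p]);
    const std::size_t after = elementsPerPart[p] + added[p] - removed[p];
    if (after < minElements) return false;
  }
  return true;
}
```

A caller would build the list of planned changes for one migration or collapse pass and only carry it out if `keepsPartsPopulated` returns true. Whether skipping the operation is acceptable for the adapt algorithm is a separate question that this sketch does not answer.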

I know at one time we had an algorithm called heavyPartSplit to recover from empty parts. As the name implies, it detected light parts and then migrated half of the fattest part onto each light part. Of course this carried the danger of fragmenting the parts, but a remedy would be to completely empty the light part first, by migrating its small contents to adjacent parts, before splitting the heavy one.
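The splitting step described above can be sketched on element counts alone. This is not the original heavyPartSplit implementation, which is not shown in this thread; `Transfer` and `planHeavyPartSplit` are hypothetical names, and the sketch only plans how many elements to move. It ignores part adjacency, so it does not cover the remedy of first draining a light part into its neighbors.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical record: move `count` elements from part `from` to part `to`.
struct Transfer {
  std::size_t from;
  std::size_t to;
  std::size_t count;
};

// For each part whose weight is at or below `lightThreshold`, plan to
// move half of the currently heaviest part onto it. Weights are updated
// after each planned transfer so later choices see the new balance.
std::vector<Transfer> planHeavyPartSplit(std::vector<std::size_t> weights,
                                         std::size_t lightThreshold = 0) {
  std::vector<Transfer> plan;
  for (std::size_t light = 0; light < weights.size(); ++light) {
    if (weights[light] > lightThreshold) continue;  // not a light part
    const std::size_t heavy = static_cast<std::size_t>(
        std::max_element(weights.begin(), weights.end()) - weights.begin());
    const std::size_t half = weights[heavy] / 2;
    if (heavy == light || half == 0) continue;  // nothing useful to split
    plan.push_back({heavy, light, half});
    weights[heavy] -= half;
    weights[light] += half;
  }
  return plan;
}
```

With the default threshold of 0 only completely empty parts receive elements, which sidesteps the fragmentation concern for this sketch. A real implementation would also have to choose which elements to move so that the receiving part stays connected.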

cwsmith commented 5 years ago

"This issue has always puzzled me. Is SCOREC/core designed to function properly when parts get emptied?"

No. Most functionality in core that uses APF will not handle empty parts. This warning is typically followed by a crash.

Meshadapt uses migration to form cavities around entities that are classified on the part boundary and need to be modified (except for split, IIUC). This migration, coupled with the coarsening, caused parts to be emptied.

I'd prefer to address the issue from the meshadapt side, by adding some migrations that avoid the draining, rather than modifying the general migration procedure to do something other than what was requested.