libMesh / libmesh

libMesh github repository
http://libmesh.github.io
GNU Lesser General Public License v2.1

systems_of_equations_ex9 fails on DistributedMesh on 21 processors #1728

Open roystgnr opened 6 years ago

roystgnr commented 6 years ago

I ran across this when testing #1708, but discovered it affects master too. We end up calculating circularly dependent DoF constraint equations before we even begin to think about syncing them.

I can't reproduce the failure on 20 or fewer processors (or on 22 or 23). I haven't tried it yet on ReplicatedMesh.

IIRC this is the example @dknez wrote to test out rotational periodic boundary conditions, but PBCs on DistributedMesh are insanely tricky and have regressed in the past, so it's possible that whatever's failing here could be triggered by ordinary translational periodicity too.
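
For anyone following along, here is a minimal sketch of what "circularly dependent" means for the constraint equations mentioned above: a row that constrains DoF i in terms of DoF j, while j is (directly or indirectly) constrained in terms of i. The types and function names below (`DofId`, `ConstraintRow`, `ConstraintMap`, `has_circular_constraints`) are hypothetical stand-ins for illustration only, not libMesh's actual data structures or API, and this is not how the library itself detects the problem.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-ins, NOT libMesh types. A constraint row
// "u_i = sum_j c_ij * u_j" is stored as {j -> c_ij}, keyed by i.
using DofId = std::size_t;
using ConstraintRow = std::map<DofId, double>;
using ConstraintMap = std::map<DofId, ConstraintRow>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first search over the "is constrained in terms of" relation.
// Returns true if expanding `dof` leads back to a DoF that is still
// being expanded, i.e. the constraints are circular.
bool visit(DofId dof,
           const ConstraintMap & constraints,
           std::map<DofId, Mark> & marks)
{
  auto row = constraints.find(dof);
  if (row == constraints.end())
    return false; // unconstrained DoF: nothing to expand

  Mark & mark = marks[dof]; // value-initialized to Mark::Unvisited
  if (mark == Mark::InProgress)
    return true;  // we came back to a DoF on the current path
  if (mark == Mark::Done)
    return false; // already fully expanded without finding a cycle

  mark = Mark::InProgress;
  for (const auto & term : row->second)
    if (visit(term.first, constraints, marks))
      return true;
  mark = Mark::Done;
  return false;
}

// True if any constrained DoF depends on itself, directly or indirectly.
bool has_circular_constraints(const ConstraintMap & constraints)
{
  std::map<DofId, Mark> marks;
  for (const auto & entry : constraints)
    if (visit(entry.first, constraints, marks))
      return true;
  return false;
}
```

In this toy model, a map where DoF 0 is constrained by DoF 1 and DoF 1 by DoF 0 is reported as circular, while a chain 0 -> 1 -> 2 with DoF 2 unconstrained is not. On a distributed mesh each processor would only see part of such a map, so a check like this would be local at best.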

dknez commented 6 years ago

Yes, I made that example to test rotational periodic boundary conditions. I have only tried it with ReplicatedMesh.

roystgnr commented 6 years ago

The original code didn't work at all with DistributedMesh, and I thought I'd fixed that in 1f17a1e, but that commit already exhibits a failure at 21 processors. I guess I only tested it on 1 through 16 processors? At least that means this isn't a recent regression.

The failure doesn't occur with ReplicatedMesh as far as I can tell.