lower d Lagrange variable dofs may not get set with distributed meshes

lindsayad commented 2 years ago

If the node that owns the lower-d Lagrange variable dof is owned by a process that does not have any lower-dimensional elements connected to the node, then we will not set the dof.
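A toy model of the failure, for illustration only. `ToyElem` and `nodes_with_components` are hypothetical stand-ins, not libMesh types or APIs; the sketch assumes a processor can only assign components on nodes of lower-d elements it can actually see (local or ghosted).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a mesh element; not a libMesh type.
struct ToyElem
{
  int owner;              // processor that owns the element
  bool lower_d;           // true if the element lies in the lower-d subdomain
  std::vector<int> nodes; // global node ids
};

// Nodes on which a processor would register components for a variable
// restricted to the lower-d subdomain, given the element indices it can see.
std::set<int> nodes_with_components(const std::vector<ToyElem> & elems,
                                    const std::set<std::size_t> & visible)
{
  std::set<int> result;
  for (std::size_t e : visible)
  {
    assert(e < elems.size());
    if (elems[e].lower_d)
      result.insert(elems[e].nodes.begin(), elems[e].nodes.end());
  }
  return result;
}
```

With a 2D element on processor 0 (nodes 0-3) and a lower-d edge on processor 1 (nodes 1 and 2), processor 0 sees only its own element and gets an empty set, even if it owns nodes 1 and 2. Making the edge visible to processor 0 yields {1, 2}.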

lindsayad commented 2 years ago

@roystgnr does this description make sense to you?

roystgnr commented 2 years ago

It wasn't just that the dof_number() wasn't getting set, though, right? n_comp() was returning zero on certain nodes?

And I'm still working on understanding the problem. When the problem was subdomain-restricted variables we could hit the case where a processor didn't own any of the elements in subdomain S, but did own a node on one of the elements, and so had to manually loop over nodes and set the DoF indices for any such node anyway. The issue here is that we look at n_comp() (via n_comp_group()) to do so, and so it breaks when we're not even setting n_comp() right?

When you say "lower-d Lagrange variable", do you imply that it's also subdomain-restricted to the lower-d elements there, so we don't have that variable defined on any interior parents?

And is the problem then that, not only are we not ghosting point neighbors which are lower- and higher-d elements by default, we're not even ghosting the lower-d side elements of an interior_parent by default? So the interior_parent's processor owns its nodes, but since it doesn't ghost the side element, there's nothing to set n_comp_group() on those nodes for a variable restricted to lower-d elements?

If I'm understanding that right, it sounds like the right long-term fix is for libMesh to always ghost side elements of an interior_parent. GhostPointNeighbors is ghosting interior_parents of a side element, but that's not enough unless we do it in both directions, huh.
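A rough sketch of what "both directions" could mean, under simplifying assumptions. `PairedElem` and `elements_to_ghost` are made up for this example and are not the GhostPointNeighbors implementation; each lower-d element just stores the index of its interior parent.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in; not a libMesh type.
struct PairedElem
{
  int owner;           // processor that owns the element
  int interior_parent; // index of the higher-d parent, or -1 if none
};

// Indices of the elements processor `pid` would ghost in this toy model.
std::set<std::size_t> elements_to_ghost(const std::vector<PairedElem> & elems,
                                        int pid,
                                        bool both_directions)
{
  std::set<std::size_t> ghosts;
  for (std::size_t e = 0; e < elems.size(); ++e)
  {
    const int parent = elems[e].interior_parent;
    if (parent < 0)
      continue; // not a lower-d side element
    const auto p = static_cast<std::size_t>(parent);
    assert(p < elems.size());
    // One-way behavior: a local lower-d element pulls in its interior parent.
    if (elems[e].owner == pid && elems[p].owner != pid)
      ghosts.insert(p);
    // Reverse direction: a local interior parent pulls in its side element.
    if (both_directions && elems[p].owner == pid && elems[e].owner != pid)
      ghosts.insert(e);
  }
  return ghosts;
}
```

With the parent on processor 0 and its side element on processor 1, the one-way rule ghosts nothing on processor 0. The two-way rule ghosts the side element there.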

lindsayad commented 2 years ago

> When you say "lower-d Lagrange variable", do you imply that it's also subdomain-restricted to the lower-d elements there, so we don't have that variable defined on any interior parents?

Correct

> It wasn't just that the dof_number() wasn't getting set, though, right? n_comp() was returning zero on certain nodes?

Correct

> not only are we not ghosting point neighbors which are lower- and higher-d elements by default, we're not even ghosting the lower-d side elements of an interior_parent by default?

Correct. Our current ghosting of lower-d/higher-d "pairs" is only one-way: only the lower-d element has any awareness of the higher-d element, through the interior_parent API.

However, strictly speaking, just ghosting sides may not be enough. You could imagine that a node is owned by a process and there are no local higher-dimensional elements whose sides have lower-dimensional elements. This could happen at a TRI3 mesh corner, for example. Yet a lower-dimensional element on another process may have connectivity with that node. In such a circumstance, if we have only geometrically ghosted the sides of local higher-dimensional elements, and not the lower-dimensional point neighbors, then n_comp() will remain 0 on that node for the lower-dimensional, subdomain-restricted Lagrange variable.
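To make the corner case concrete, here is a small sketch of the difference between a side neighbor and a point neighbor for a TRI3 and an EDGE2, each described only by its node ids. The helper names are invented for this example and do not correspond to libMesh functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of nodes of `a` that also appear in `b`.
std::size_t n_shared_nodes(const std::vector<int> & a, const std::vector<int> & b)
{
  return static_cast<std::size_t>(
      std::count_if(a.begin(), a.end(), [&b](int n) {
        return std::find(b.begin(), b.end(), n) != b.end();
      }));
}

// For a TRI3, any two of its nodes span a side, so an EDGE2 lies on a side
// exactly when both of its nodes belong to the triangle.
bool is_side_neighbor(const std::vector<int> & tri, const std::vector<int> & edge)
{
  assert(!edge.empty());
  return n_shared_nodes(edge, tri) == edge.size();
}

// A point neighbor only needs to share one node.
bool is_point_neighbor(const std::vector<int> & tri, const std::vector<int> & edge)
{
  return n_shared_nodes(edge, tri) >= 1;
}
```

For a triangle with nodes {0, 1, 2}, the edge {2, 3} touches it only at node 2. It is a point neighbor but not a side neighbor, so side-only ghosting would never bring it to the triangle's processor.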

roystgnr commented 2 years ago

That certainly sounds like an issue waiting to happen ... and I don't actually see anything about "lower d vs higher d" that's important to the issue at that point, just "subdomain-restricted" ... this is probably why I vaguely recall running into a few failures when I tried to drop our default ghosting from point-neighbors back to side-neighbors. We're going to have to add a push_parallel_vector_data in some place analogous to DofMap::set_nonlocal_dof_objects, aren't we? The pull_ there lets us query ghost nodes where we don't know their ids, but it doesn't work for local nodes where we don't even know we don't know all their ids. Only the processor with the subdomain-restricted variable has a chance to realize there's a problem.

That sounds expensive thanks to all the stuff that would be pushed needlessly, too. Damn. Maybe ghosting lower-<->higher-d point neighbors actually is the long-term solution here.
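A sketch of what the push side might assemble, with no MPI involved. `RestrictedElem` and `build_push_data` are hypothetical names, and the node ownership map is assumed to be known locally. The sender cannot tell which owners already know about a node, so it has to send every off-processor node it touches.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a local element in the restricted subdomain.
struct RestrictedElem
{
  std::vector<int> nodes; // global node ids
};

// Per-destination payload processor `pid` would push: every node on a local
// subdomain-restricted element that is owned by some other processor.
std::map<int, std::set<int>>
build_push_data(const std::vector<RestrictedElem> & local_elems,
                const std::map<int, int> & node_owner, // node id -> owner
                int pid)
{
  assert(pid >= 0);
  std::map<int, std::set<int>> to_push;
  for (const auto & elem : local_elems)
    for (int n : elem.nodes)
    {
      const int owner = node_owner.at(n); // throws if ownership is unknown
      if (owner != pid)
        to_push[owner].insert(n); // sent even if the owner already knows
    }
  return to_push;
}
```

In a real implementation the resulting map would be the kind of per-processor data handed to push_parallel_vector_data. Here it only shows how much gets queued.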

lindsayad commented 2 years ago

> That certainly sounds like an issue waiting to happen

Well it did already happen haha.

> That sounds expensive thanks to all the stuff that would be pushed needlessly, too. Damn. Maybe ghosting lower-<->higher-d point neighbors actually is the long-term solution here.

Sounds like maybe that is the move. I have no problem with that.

libMesh / libmesh

lower d Lagrange variable dofs may not get set with distributed meshes #3369