underworldcode / underworld3

https://underworldcode.github.io/underworld3/

Petsc labeling issue in parallel #240

Open gthyagi opened 2 months ago

gthyagi commented 2 months ago

Hi @lmoresi

Here is a simple example that demonstrates the PETSc labelling issue in parallel.

import underworld3 as uw
from underworld3.systems import Stokes
import sympy

mesh = uw.meshing.StructuredQuadBox(elementRes=(2, 2), minCoords=(0., 0.),
                                    maxCoords=(1., 1.))

mesh.view()

v_soln = uw.discretisation.MeshVariable(r"u", mesh, 2, degree=2)
p_soln = uw.discretisation.MeshVariable(r"p", mesh, 1, degree=1, continuous=True)

# Create Stokes object
stokes = Stokes(mesh, velocityField=v_soln, pressureField=p_soln, solver_name="stokes")
stokes.constitutive_model = uw.constitutive_models.ViscousFlowModel
stokes.constitutive_model.Parameters.viscosity = 1.0
stokes.saddle_preconditioner = 1.0

# bc's
stokes.add_essential_bc(sympy.Matrix([1.0, 0.0]), "Top")
stokes.add_essential_bc(sympy.Matrix([sympy.oo, 0.0]), "Bottom")
stokes.add_essential_bc(sympy.Matrix([0.0, sympy.oo]), "Left")
stokes.add_essential_bc(sympy.Matrix([0.0, sympy.oo]), "Right")

stokes.bodyforce = sympy.Matrix([0, 0])

# stokes.petsc_options["pc_type"] = "lu"
stokes.tolerance = 1e-6
stokes.solve(verbose=False)

if uw.mpi.rank==0:
    print(p_soln.stats())

This works in serial but fails in parallel when run with mpirun -np 2 python Ex_Sbox_parallel.py

bknight1 commented 2 months ago

The output in serial from mesh.view() is:

| Boundary Name            | ID    | Min Size | Max Size |
| ------------------------------------------------------ |
| Bottom                   | 11    | 3        | 3        |
| Top                      | 12    | 3        | 3        |
| Right                    | 13    | 3        | 3        |
| Left                     | 14    | 3        | 3        |
| All_Boundaries           | 1001  | 8        | 8        |
| ------------------------------------------------------ |

and from mesh.dm.view():

Labels:
  depth: 3 strata with value/size (0 (9), 1 (12), 2 (4))
  All_Boundaries: 1 strata with value/size (1001 (8))
  Bottom: 1 strata with value/size (11 (3))
  Elements: 1 strata with value/size (99999 (5))
  Left: 1 strata with value/size (14 (3))
  Right: 1 strata with value/size (13 (3))
  Top: 1 strata with value/size (12 (3))
  celltype: 3 strata with value/size (0 (9), 1 (12), 4 (4))
  UW_Boundaries: 5 strata with value/size (11 (3), 12 (3), 13 (3), 14 (3), 1001 (8))

and in parallel is:

| Boundary Name            | ID    | Min Size | Max Size |
| ------------------------------------------------------ |
| Bottom                   | 11    | 2        | 2        |
| Top                      | 12    | 2        | 2        |
| Right                    | 13    | 0        | 3        |
| Left                     | 14    | 0        | 3        |
| All_Boundaries           | 1001  | 4        | 4        |
| ------------------------------------------------------ |
Labels:
  depth: 3 strata with value/size (0 (6), 1 (7), 2 (2))
  All_Boundaries: 1 strata with value/size (1001 (4))
  Bottom: 1 strata with value/size (11 (2))
  Elements: 1 strata with value/size (99999 (3))
  Left: 0 strata with value/size ()
  Right: 1 strata with value/size (13 (3))
  Top: 1 strata with value/size (12 (2))
  celltype: 3 strata with value/size (0 (6), 1 (7), 4 (2))
  UW_Boundaries: 4 strata with value/size (11 (2), 12 (2), 13 (3), 1001 (4))

It seems like mesh.dm.view() is only reporting from the last processor, whilst mesh.view() is able to show that the Left and Right labels are missing on one of the processors (assuming the box is split down the middle, this makes sense).

I added the following check:

for boundary in mesh.boundaries:
    proc = uw.mpi.rank
    # print(f'boundary name: {boundary.name}, label: {boundary.value}, proc: {proc}')
    label = mesh.dm.getLabel(boundary.name)
    if label:
        print(f'{boundary.name} found on {proc}')
    else:
        print(f'{boundary.name} NOT found on {proc}')

Which reports back:

Bottom found on 0
Top found on 0
Right found on 0
Left found on 0
All_Boundaries found on 0
Bottom found on 1
Top found on 1
Right found on 1
Left found on 1
All_Boundaries found on 1

which suggests the boundary labels are available on both processors?
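
A stronger check (a sketch, assuming boundary.value holds the label ID shown in the mesh.view() table): a DMLabel object can exist on a rank even when its stratum is empty there, so query the stratum size directly with petsc4py's DMLabel.getStratumSize():

for boundary in mesh.boundaries:
    label = mesh.dm.getLabel(boundary.name)
    # a label object can exist on a rank while its stratum is empty there
    size = label.getStratumSize(boundary.value) if label else 0
    print(f'rank {uw.mpi.rank}: {boundary.name} has {size} local points', flush=True)

This would distinguish "the label exists on this rank" from "the label actually has points on this rank".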

The solve produces the following error:

[1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[1]PETSC ERROR: General MPI error
[1]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer
lmoresi commented 2 months ago

@gthyagi / @bknight1

It fails because the p_soln.stats() call is collective, so you can't execute it only on process 0.

p_stats = p_soln.stats()

if uw.mpi.rank==0:
    print(p_stats)

... is the required syntax: the collective call runs on every rank, and only the print is guarded by the rank check.

gthyagi commented 2 months ago

Thanks @lmoresi for pointing that out.

Here is the above example with a natural bc; it does not work in parallel with elementRes=(5, 5). However, elementRes=(2, 2), (3, 3), and (4, 4) all work.

import underworld3 as uw
from underworld3.systems import Stokes
import sympy

mesh = uw.meshing.StructuredQuadBox(elementRes=(5, 5), minCoords=(0., 0.),
                                    maxCoords=(1., 1.))

x,y = mesh.X

mesh.view()

v_soln = uw.discretisation.MeshVariable(r"u", mesh, 2, degree=2)
p_soln = uw.discretisation.MeshVariable(r"p", mesh, 1, degree=1, continuous=True)

# Create Stokes object
stokes = Stokes(mesh, velocityField=v_soln, pressureField=p_soln, solver_name="stokes")
stokes.constitutive_model = uw.constitutive_models.ViscousFlowModel
stokes.constitutive_model.Parameters.viscosity = 1.0
stokes.saddle_preconditioner = 1.0

# +
# bc's
t_init = sympy.cos(3*x*sympy.pi) * sympy.exp(-1000.0 * ((y - 0.99) ** 2)) 

# stokes.add_essential_bc(sympy.Matrix([1.0, 0.0]), "Top")
stokes.add_natural_bc(sympy.Matrix([0.0, -t_init]), "Top")
stokes.add_essential_bc(sympy.Matrix([sympy.oo, 0.0]), "Bottom")
stokes.add_essential_bc(sympy.Matrix([0.0, sympy.oo]), "Left")
stokes.add_essential_bc(sympy.Matrix([0.0, sympy.oo]), "Right")
# -

stokes.bodyforce = sympy.Matrix([0, 0])

print(f'rank: {uw.mpi.rank}, min coord: {mesh.data[:,0].min(), mesh.data[:,1].min()}', flush=True)
print(f'rank: {uw.mpi.rank}, max coord: {mesh.data[:,0].max(), mesh.data[:,1].max()}', flush=True)

# stokes.petsc_options["pc_type"] = "lu"
stokes.tolerance = 1e-6
stokes.solve(verbose=False)

p_stats = p_soln.stats()
if uw.mpi.rank==0:
    print(p_stats)

Run with: mpirun -np 2 python Ex_Sbox_parallel.py

lmoresi commented 2 months ago

@gthyagi - I've proposed a workaround on branch #241 and suggest that you review / try it. Also @bknight1, since JG is not available to review this week.

knepley commented 2 months ago

@bknight1 Yes, -dm_view only reports label output for proc 0. I did not want to make that collective, since all procs might not have all labels, but we could revisit this. There might be a clean way to do it now.

gthyagi commented 2 months ago

@knepley In -dm_view, it would be useful to print all the processor numbers that belong to each label.
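
Something along these lines (a sketch using petsc4py's synchronized printing, not the -dm_view change itself) would report per-rank label sizes in rank order:

from petsc4py import PETSc

# print each rank's stratum size for every boundary label,
# with output gathered on rank 0 in rank order
for boundary in mesh.boundaries:
    label = mesh.dm.getLabel(boundary.name)
    size = label.getStratumSize(boundary.value) if label else 0
    PETSc.Sys.syncPrint(f'[{uw.mpi.rank}] {boundary.name}: {size} points')
PETSc.Sys.syncFlush()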

bknight1 commented 2 months ago

@gthyagi we could do this in UW by modifying mesh.view() to return the number of label values on each node

lmoresi commented 2 months ago

Make it optional because that could be a lot of information ...

bknight1 commented 2 months ago

I agree. I've made it a separate function for now; we can discuss tomorrow how best to present the information.

gthyagi commented 2 months ago

@bknight1 The current print statement outputs a lot of information, and the label names are repetitive. It might be better to create a separate table with Label and List of Procs.
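
Something like this sketch could build that table (using mpi4py directly; the helper name and table layout are just illustrative):

from mpi4py import MPI

def label_proc_table(mesh):
    # gather per-rank stratum sizes; rank 0 prints which procs hold each label
    comm = MPI.COMM_WORLD
    for boundary in mesh.boundaries:
        label = mesh.dm.getLabel(boundary.name)
        size = label.getStratumSize(boundary.value) if label else 0
        sizes = comm.gather(size, root=0)
        if comm.rank == 0:
            procs = [r for r, s in enumerate(sizes) if s > 0]
            print(f'| {boundary.name:<24} | {procs} |')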

bknight1 commented 2 months ago

Yeah that probably is the best approach, feel free to make those changes