firedrakeproject / firedrake

Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM).
https://firedrakeproject.org

TypeError in _map_cache when running in parallel on ARCHER #371

Closed ctjacobs closed 10 years ago

ctjacobs commented 10 years ago

The following code fails when run in parallel over 24 processes on ARCHER:

from pyop2 import *
from firedrake import *

mesh = Mesh("test.msh")

# Define the mixed function space
U = VectorFunctionSpace(mesh, "CG", 2)
H = FunctionSpace(mesh, "CG", 1)
W = MixedFunctionSpace([U, H])

# The solution field defined on the mixed function space
solution = Function(W)
u, h = split(solution)
w, v = TestFunctions(W)

n = FacetNormal(mesh)

# Define the compulsory shallow water fields
solution_old = Function(W).interpolate(Expression(("1e-16", "1e-16", "cos(pi/3e3*x[0])")))
u_old, h_old = split(solution_old)

# The solution should first hold the initial condition.
solution.assign(solution_old)

# Mean free surface height
h_mean = Function(W.sub(1)).interpolate(Expression("1"))

# Time-stepping parameters and constants
T = 100.0
t = 0
dt = 5.0

# The total height of the free surface.
h_total = h_mean + h

# The collection of all the individual terms in their weak form.
F = 0

# Mass term
M_momentum = (1.0/dt)*(inner(w, u) - inner(w, u_old))*dx
F += M_momentum

# Advection term
A_momentum = inner(dot(grad(u), u), w)*dx
F += A_momentum

# Stress tensor
viscosity = Constant(1)
K_momentum = -viscosity*inner(grad(u) + grad(u).T, grad(w))*dx
K_momentum += viscosity*(2.0/3.0)*inner(div(u)*Identity(2), grad(w))*dx
F -= K_momentum

# The gradient of the height of the free surface, h
C_momentum = -9.8*inner(w, grad(h))*dx
F -= C_momentum

# The mass term in the shallow water continuity equation 
M_continuity = (1.0/dt)*(inner(v, h) - inner(v, h_old))*dx
F += M_continuity

# Divergence term in the shallow water continuity equation
Ct_continuity = - h_total*inner(u, grad(v))*dx

# Known exterior values of velocity and the free surface perturbation for the Flather BC.
u_ext = Expression(("2.0", "0.0"), t=t)
h_ext = Expression("-cos((sqrt(9.81*50)*(t/3e3))*pi)", t=t)

weak_bc_expressions = []
weak_bc_expressions.append(u_ext)
weak_bc_expressions.append(h_ext)

# Apply Flather BC to boundary 4
Ct_continuity += h_total*inner(Function(W.sub(0)).interpolate(u_ext), n)*v*ds(4)
Ct_continuity += h_total*sqrt(Constant(9.8)/h_total)*(h - Function(W.sub(1)).interpolate(h_ext))*v*ds(4)

# Don't do anything with the ds term for boundary 3 - the DirichletBC for the free surface comes later.
Ct_continuity += h_total * inner(u, n) * v * ds(3)

F += Ct_continuity

# Strong Dirichlet BC for the free surface perturbation, h.
bc = DirichletBC(W.sub(1), Expression("cos((sqrt(9.81*50)*(t/3e3))*pi)", t=t), (3))

# Construct the solver objects
problem = NonlinearVariationalProblem(F, solution, bcs=[bc])
solver = NonlinearVariationalSolver(problem, solver_parameters={'ksp_monitor': True, 
                                                            'ksp_view': False, 
                                                            'pc_view': False, 
                                                            'pc_type': 'fieldsplit',
                                                            'pc_fieldsplit_type': 'schur',
                                                            'ksp_type': 'gmres',
                                                            'pc_fieldsplit_schur_fact_type': 'FULL',
                                                            'fieldsplit_0_ksp_type': 'preonly',
                                                            'fieldsplit_1_ksp_type': 'preonly',
                                                            'ksp_rtol': 1.0e-7,
                                                            'snes_type':'ksponly'}) # Just doing one non-linear iteration here.

t += dt

# The time-stepping loop
while t <= T:
   print "\nt = %g" % t

   for expr in weak_bc_expressions:
      expr.t = t

   # Solve the system of equations!
   solver.solve()

   # Move to next time step    
   solution_old.assign(solution)    
   t += dt      

Error file:

numpy/1.8.0(8):ERROR:150: Module 'numpy/1.8.0' conflicts with the currently loaded module(s) 'anaconda/1.9.2'
numpy/1.8.0(8):ERROR:102: Tcl command execution failed: conflict anaconda

Tue Sep 23 09:17:06 2014: [PE_11]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000100000000000
Tue Sep 23 09:17:06 2014: [PE_8]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000100000000
Tue Sep 23 09:17:06 2014: [PE_10]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000010000000000
Tue Sep 23 09:17:06 2014: [PE_19]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000010000000000000000000
Tue Sep 23 09:17:06 2014: [PE_1]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000000000010
Tue Sep 23 09:17:06 2014: [PE_13]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000010000000000000
Tue Sep 23 09:17:06 2014: [PE_15]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000001000000000000000
Tue Sep 23 09:17:06 2014: [PE_21]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000001000000000000000000000
Tue Sep 23 09:17:06 2014: [PE_6]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000001000000
Tue Sep 23 09:17:06 2014: [PE_7]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000010000000
Tue Sep 23 09:17:06 2014: [PE_22]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000010000000000000000000000
Tue Sep 23 09:17:06 2014: [PE_9]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000001000000000
Tue Sep 23 09:17:06 2014: [PE_23]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000100000000000000000000000
Tue Sep 23 09:17:06 2014: [PE_14]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000100000000000000
Tue Sep 23 09:17:06 2014: [PE_20]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000100000000000000000000
Tue Sep 23 09:17:06 2014: [PE_17]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000100000000000000000
Tue Sep 23 09:17:06 2014: [PE_18]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000001000000000000000000
Tue Sep 23 09:17:06 2014: [PE_12]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000001000000000000
Tue Sep 23 09:17:06 2014: [PE_3]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000000001000
Tue Sep 23 09:17:06 2014: [PE_16]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000010000000000000000
Tue Sep 23 09:17:06 2014: [PE_5]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000000100000
Tue Sep 23 09:17:06 2014: [PE_4]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000000010000
Tue Sep 23 09:17:06 2014: [PE_0]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000000000001
Tue Sep 23 09:17:06 2014: [PE_2]: cpumask set to 1 cpu on nid02880, cpumask = 000000000000000000000000000000000000000000000100
Traceback (most recent call last):
  File "/work/n01/n01/ctjacobs//firedrake-fluids/tests/test/test_archer.py", line 108, in <module>
    solver.solve()
  File "<string>", line 2, in solve
  File "/work/y07/y07/fdrake/PyOP2/lib/python2.7/site-packages/PyOP2-0.11.0_70_gf4d182d_dirty-py2.7-linux-x86_64.egg/pyop2/profiling.py", line 197, in wrapper
    return f(*args, **kwargs)
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/solving.py", line 306, in solve
    self.snes.solve(None, v)
  File "SNES.pyx", line 418, in petsc4py.PETSc.SNES.solve (src/petsc4py.PETSc.c:149863)
  File "petscsnes.pxi", line 265, in petsc4py.PETSc.SNES_Jacobian (src/petsc4py.PETSc.c:30367)
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/solving.py", line 244, in form_jacobian
    self._jac_tensor.M._force_evaluation()
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/matrix.py", line 143, in M
    self.assemble()
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/matrix.py", line 72, in assemble
    self._assembly_callback(self.bcs)
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/assembly_cache.py", line 346, in inner
    r = thunk(bcs)
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/solving.py", line 599, in thunk
    i, j)
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/solving.py", line 513, in mat
    (testmap(test.function_space()[i])[op2.i[0]],
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/solving.py", line 597, in <lambda>
    tensor_arg = mat(lambda s: s.exterior_facet_node_map(tsbc),
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/functionspace.py", line 363, in exterior_facet_node_map
    offset=offset)
  File "/work/n01/n01/ctjacobs/firedrake/firedrake/functionspace.py", line 423, in _map_cache
    new_entity_node_list = node_list_bc.take(entity_node_list)
TypeError: long() argument must be a string or a number, not 'NoneType'
Rank 21 [Tue Sep 23 09:17:37 2014] [c7-1c0s0n0] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 21
Rank 15 [Tue Sep 23 09:17:37 2014] [c7-1c0s0n0] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 15
Rank 8 [Tue Sep 23 09:17:37 2014] [c7-1c0s0n0] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 8
_pmiu_daemon(SIGCHLD): [NID 02880] [c7-1c0s0n0] [Tue Sep 23 09:17:37 2014] PE RANK 15 exit signal Aborted
[NID 02880] 2014-09-23 09:17:37 Apid 10147572: initiated application termination

It gets past this stage and onto the solver iterations if the DirichletBC is not applied. I've placed the test.msh file in /scratch/ctj10, but the same error occurs with e.g. mesh = UnitSquareMesh(100, 100).
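
For reference, the strong condition in question prescribes h = cos(sqrt(9.81*50) * (t/3e3) * pi) on boundary 3. The sketch below only tabulates that formula at the time levels visited by the loop in the script, using the constants from the script (T = 100, dt = 5). It is plain Python rather than Firedrake, and the helper names `h_boundary` and `time_levels` are made up for illustration.

```python
import math


def h_boundary(t):
    """Dirichlet value for the free surface perturbation on boundary 3 at time t."""
    return math.cos(math.sqrt(9.81 * 50) * (t / 3e3) * math.pi)


def time_levels(T=100.0, dt=5.0):
    """Time levels visited by the loop in the script: dt, 2*dt, ... while t <= T."""
    levels = []
    t = dt
    while t <= T:
        levels.append(t)
        t += dt
    return levels


levels = time_levels()
values = [h_boundary(t) for t in levels]

for t, h in zip(levels, values):
    print("t = %g, h = %.6f" % (t, h))
```

This says nothing about where the TypeError comes from; it only makes the imposed boundary values easy to inspect.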

kynan commented 10 years ago

I can reproduce the failure on a 32x32 unit square with 12 and 24 cores. I'm not sure what's going on, though; it seems the exterior_facet_node_list is None.

@ctjacobs FYI ARCHER has shared folders too
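
The TypeError in the traceback is the kind of error an index lookup produces when it is handed None instead of a list of indices. The sketch below illustrates that failure mode with a pure-Python stand-in for `take` and made-up data. The names `take` and `checked_take` and the node values are hypothetical; this is not Firedrake's code and not necessarily how the problem was fixed.

```python
def take(values, indices):
    """Pure-Python stand-in for an array take: coerce each index to int and look it up."""
    if not isinstance(indices, (list, tuple)):
        indices = [indices]  # treat a lone object as a single index
    return [values[int(i)] for i in indices]


def checked_take(values, indices):
    """Same lookup, but fail early with a descriptive message when no index list exists."""
    if indices is None:
        raise ValueError("index list is None: no facet node list was built")
    return take(values, indices)


# Hypothetical data: a node numbering in which one node has been masked out as -1.
node_list_bc = [0, 1, -1, 3]

remapped = take(node_list_bc, [0, 2, 3])  # ordinary case

try:
    take(node_list_bc, None)  # what a missing index list leads to
    unguarded_error = None
except TypeError as exc:
    unguarded_error = type(exc).__name__

try:
    checked_take(node_list_bc, None)
    guarded_error = None
except ValueError as exc:
    guarded_error = str(exc)

print(remapped, unguarded_error, guarded_error)
```

An explicit check of this kind turns an opaque conversion error deep in the lookup into a message that points at the missing list.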

wence- commented 10 years ago

I think I know the problem, I'll have a look.

wence- commented 10 years ago

#373 should fix this. Can you confirm, please?

ctjacobs commented 10 years ago

This has indeed fixed the issue. Thanks.