idaholab / moose

Multiphysics Object Oriented Simulation Environment
https://www.mooseframework.org
GNU Lesser General Public License v2.1

kernels/scalar_constraint not thread safe #6511

Open friedmud opened 8 years ago

friedmud commented 8 years ago

Description of the enhancement or error report

As discussed in #6436, the kernels/scalar_constraint MOOSE test will segfault when run with many threads. Here is the output from running in dbg mode:

Assertion `col_dofs.size() == matrix.n()' failed.
col_dofs.size() = 1
matrix.n() = 9

Stack frames: 26
0: libMesh::print_trace(std::ostream&)
1: libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*)
2: libMesh::DofMap::constrain_element_matrix(libMesh::DenseMatrix<double>&, std::__debug::vector<unsigned int, std::allocator<unsigned int> >&, std::__debug::vector<unsigned int, std::allocator<unsigned int> >&, bool) const
3: Assembly::addJacobianBlock(libMesh::SparseMatrix<double>&, libMesh::DenseMatrix<double>&, std::__debug::vector<unsigned int, std::allocator<unsigned int> > const&, std::__debug::vector<unsigned int, std::allocator<unsigned int> > const&, double)
4: Assembly::addJacobianOffDiagScalar(libMesh::SparseMatrix<double>&, unsigned int)
5: FEProblem::addJacobianOffDiagScalar(libMesh::SparseMatrix<double>&, unsigned int, unsigned int)
6: NonlinearSystem::computeScalarKernelsJacobians(libMesh::SparseMatrix<double>&)
7: NonlinearSystem::computeJacobianInternal(libMesh::SparseMatrix<double>&)
8: NonlinearSystem::computeJacobian(libMesh::SparseMatrix<double>&)
9: FEProblem::computeJacobian(libMesh::NonlinearImplicitSystem&, libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&)
10: Moose::compute_jacobian(libMesh::NumericVector<double> const&, libMesh::SparseMatrix<double>&, libMesh::NonlinearImplicitSystem&)
11: __libmesh_petsc_snes_jacobian
12: SNESComputeJacobian
13: SNESSolve_NEWTONLS
14: SNESSolve
15: libMesh::PetscNonlinearSolver<double>::solve(libMesh::SparseMatrix<double>&, libMesh::NumericVector<double>&, libMesh::NumericVector<double>&, double, unsigned int)
16: libMesh::NonlinearImplicitSystem::solve()
17: TimeIntegrator::solve()
18: NonlinearSystem::solve()
19: FEProblem::solve()
20: Steady::execute()
21: MooseApp::executeExecutioner()
22: MooseApp::run()
23: ../../../moose_test-dbg() [0x4139c0]
24: __libc_start_main
25: ../../../moose_test-dbg() [0x413609]
[0] ../src/base/dof_map_constraints.C, line 1734, compiled Mar  1 2016 at 19:34:54
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
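
For context, the libMesh assertion is a shape-consistency guard: DofMap::constrain_element_matrix expects the dense element matrix it receives to already be sized row_dofs.size() by col_dofs.size(), and here a 9-column block arrived alongside a single scalar column DOF. A minimal sketch of that invariant (the type and function below are illustrative stand-ins, not the libMesh source):

```cpp
// Illustrative only: the invariant the libMesh assertion guards.
#include <cassert>
#include <vector>

struct DenseBlock
{
  unsigned int rows = 0, cols = 0;
  unsigned int m() const { return rows; }
  unsigned int n() const { return cols; }
};

void check_constrainable(const DenseBlock & matrix,
                         const std::vector<unsigned int> & row_dofs,
                         const std::vector<unsigned int> & col_dofs)
{
  // The block must be shaped (row_dofs.size() x col_dofs.size()).
  assert(row_dofs.size() == matrix.m());
  assert(col_dofs.size() == matrix.n());
}

int main()
{
  // Shapes taken from the trace: one scalar column DOF, but a 9-column block.
  DenseBlock stale;
  stale.rows = 9; // row count is incidental for this sketch
  stale.cols = 9; // stale column count left over from somewhere else
  std::vector<unsigned int> row_dofs(9), col_dofs(1);
  check_constrainable(stale, row_dofs, col_dofs); // the column assert fires
  return 0;
}
```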

Rationale for the enhancement or information for reproducing the error

I can reproduce it easily on hpcbuild with TBB using 6 threads.

andrsd commented 8 years ago

Interesting. But scalar kernels are not evaluated in a threaded loop.

friedmud commented 8 years ago

But it could be a side effect... like the state/size of the DenseMatrix getting changed in the scalar kernel constraint code, and then tripping us up further down the line.
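
A minimal sketch of the kind of side effect described here, assuming a dense block cached in shared state that one code path resizes and another later reuses without resizing. The names are hypothetical and this is not MOOSE or libMesh code; it only illustrates how stale shape state in a shared buffer produces exactly the col_dofs.size() != matrix.n() mismatch seen in the trace:

```cpp
// Hypothetical illustration of stale shared state; not MOOSE/libMesh code.
#include <cassert>
#include <thread>
#include <vector>

// A toy stand-in for a cached dense Jacobian block.
struct Block
{
  unsigned int rows = 0, cols = 0;
  void resize(unsigned int m, unsigned int n) { rows = m; cols = n; }
  unsigned int n() const { return cols; }
};

Block cached_block; // shared between assembly code paths: the hazard

int main()
{
  // A threaded element loop sizes the cached block for a 9x9 FE-FE coupling
  // and finishes before the (unthreaded) scalar path runs.
  std::thread element_loop([] { cached_block.resize(9, 9); });
  element_loop.join();

  // The scalar-constraint path couples FE rows to a single scalar column,
  // but reuses the cached block without resizing it first.
  std::vector<unsigned int> scalar_col_dofs(1);

  // Mirrors the libMesh guard that fired above: 1 != 9, so this aborts.
  assert(scalar_col_dofs.size() == cached_block.n());
  return 0;
}
```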

andrsd commented 8 years ago

But it could be a side effect...

It most likely is. One place that touches these DenseMatrices for the off-diagonal blocks of scalar variables is where we handle integrated boundary conditions. Kernels, maybe, too...
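
If stale shape state is indeed the culprit, one generic way to harden this pattern is to have every consumer of a cached block resize it for its own coupling immediately before filling it (or to keep per-thread copies), rather than relying on whatever shape an earlier code path left behind. A sketch of the resize-before-use idea, with hypothetical names and no claim that this is how MOOSE resolved the issue:

```cpp
// Hypothetical sketch of resize-before-use; not MOOSE/libMesh code.
#include <cassert>
#include <vector>

struct Block
{
  unsigned int rows = 0, cols = 0;
  void resize(unsigned int m, unsigned int n) { rows = m; cols = n; }
  unsigned int m() const { return rows; }
  unsigned int n() const { return cols; }
};

// Each consumer sizes the block for its own coupling before touching it,
// so whatever another code path (or thread) did earlier cannot leak in.
void add_off_diag_scalar(Block & block,
                         const std::vector<unsigned int> & row_dofs,
                         const std::vector<unsigned int> & col_dofs)
{
  block.resize(row_dofs.size(), col_dofs.size());
  assert(row_dofs.size() == block.m() && col_dofs.size() == block.n());
  // ... fill and constrain the block here ...
}

int main()
{
  Block block;
  block.resize(9, 9); // stale shape left by some earlier path
  std::vector<unsigned int> rows(9), cols(1);
  add_off_diag_scalar(block, rows, cols); // shapes now agree: no abort
  return 0;
}
```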