geodynamics / aspect

A parallel, extensible finite element code to simulate convection in both 2D and 3D models.
https://aspect.geodynamics.org/
Other
224 stars 235 forks source link

Continental extension cookbook fails with gmg solver #6002

Open MFraters opened 1 month ago

MFraters commented 1 month ago

The continental extension cookbook currently fails for me with the output as shown below.

When I change the solver to AMG, the model runs without a problem.

I would suggest to switch this cookbook to AMG for now, untill we have found a fix. I will make a pull request for that.

Likely related #5736

-----------------------------------------------------------------------------
--                             This is ASPECT                              --
-- The Advanced Solver for Planetary Evolution, Convection, and Tectonics. --
-----------------------------------------------------------------------------
--     . version 2.6.0-pre (main, e8f3a9345)
--     . using deal.II 9.5.2
--     .       with 32 bit indices
--     .       with vectorization level 2 (AVX, 4 doubles, 256 bits)
--     . using Trilinos 14.4.0
--     . using p4est 2.3.2
--     . using Geodynamic World Builder 0.6.0
--     . running in OPTIMIZED mode
--     . running with 24 MPI processes
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
The output directory <iofiles/diking/continental_extension/> provided in the input file appears not to exist.
ASPECT will create it for you.
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
-- For information on how to cite ASPECT, see:
--   https://aspect.geodynamics.org/citing.html?ver=2.6.0-pre&NewtonSolver=1&mf=1&sha=e8f3a9345&src=code
-----------------------------------------------------------------------------
Number of active cells: 3,200 (on 4 levels)
Number of degrees of freedom: 107,649 (26,082+3,321+13,041+13,041+13,041+13,041+13,041+13,041)

Number of mesh deformation degrees of freedom: 6,642
   Solving mesh displacement system... 0 iterations.
*** Timestep 0:  t=0 years, dt=0 years
   Solving mesh displacement system... 0 iterations.
   Solving temperature system... 0 iterations.
   Skipping noninitial_plastic_strain composition solve because RHS is zero.
   Solving plastic_strain system ... 0 iterations.
   Solving crust_upper system ... 0 iterations.
   Solving crust_lower system ... 0 iterations.
   Solving mantle_lithosphere system ... 0 iterations.
   Initial Newton Stokes residual = 2.0674e+15, v = 2.06583e+15, p = 8.0446e+13

   Solving Stokes system... 0+33 iterations.
      Newton system information: Norm of the rhs: 2.0674e+15
      Relative nonlinear residual (Stokes system) after nonlinear iteration 1: 1

   Solving Stokes system... 0+65 iterations.
      Newton system information: Norm of the rhs: 9.60326e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 2: 0.0046451

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+13 iterations.
      Newton system information: Norm of the rhs: 3.81227e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 3: 0.00184399

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+10 iterations.
      Newton system information: Norm of the rhs: 2.06045e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 4: 0.000996638

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+12 iterations.
      Newton system information: Norm of the rhs: 1.33996e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 5: 0.000648137

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+12 iterations.
      Newton system information: Norm of the rhs: 1.40592e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 6: 0.000680042

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+8 iterations.
      Newton system information: Norm of the rhs: 1.72034e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 7: 0.000832126

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+7 iterations.
      Newton system information: Norm of the rhs: 1.77733e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 8: 0.000859696

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+12 iterations.
      Newton system information: Norm of the rhs: 2.17295e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 9: 0.00105106

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+7 iterations.
      Newton system information: Norm of the rhs: 2.12767e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 10: 0.00102916

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+11 iterations.
      Newton system information: Norm of the rhs: 1.93925e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 11: 0.000938015

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+12 iterations.
      Newton system information: Norm of the rhs: 3.28513e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 12: 0.00158902

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+15 iterations.
      Newton system information: Norm of the rhs: 1.94764e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 13: 0.000942071

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+28 iterations.
      Newton system information: Norm of the rhs: 2.25112e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 14: 0.00108887

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+37 iterations.
      Newton system information: Norm of the rhs: 6.86882e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 15: 0.00332245

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+28 iterations.
      Newton system information: Norm of the rhs: 6.95077e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 16: 0.00336209

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+43 iterations.
      Newton system information: Norm of the rhs: 8.23471e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 17: 0.00398313

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+39 iterations.
      Newton system information: Norm of the rhs: 8.78331e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 18: 0.00424849

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+42 iterations.
      Newton system information: Norm of the rhs: 8.36929e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 19: 0.00404822

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+59 iterations.
      Newton system information: Norm of the rhs: 8.9281e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 20: 0.00431852

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+56 iterations.
      Newton system information: Norm of the rhs: 8.17872e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 21: 0.00395604

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+73 iterations.
      Newton system information: Norm of the rhs: 1.56355e+13, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 22: 0.0075629

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+147 iterations.
      Newton system information: Norm of the rhs: 1.14504e+13, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 23: 0.00553856

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+86 iterations.
      Newton system information: Norm of the rhs: 7.19221e+12, Derivative scaling factor: 0
      Relative nonlinear residual (Stokes system) after nonlinear iteration 24: 0.00347887

   The linear solver tolerance is set to 0.01. 
   Solving Stokes system... 0+---------------------------------------------------------
TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped
to avoid a possible deadlock.
---------------------------------------------------------

----------------------------------------------------
Exception 'ExcMessage (exception_message.str())' on rank 0 on processing: 

--------------------------------------------------------
An error occurred in line <3035> of file </home/.../source/utilities.cc> in function
    void aspect::Utilities::throw_linear_solver_failure_exception(const std::string&, const std::string&, const std::vector<dealii::SolverControl>&, const std::exception&, MPI_Comm, const std::string&)
The violated condition was: 
    false
Additional information: 
    The iterative Stokes solver in
    StokesMatrixFreeHandlerImplementation::solve did not converge.

    The initial residual was: 7.192212e+12
    The final residual is: 7.192212e+12
    The required residual for convergence is: 7.192212e+10
    See iofiles/diking/continental_extension/solver_history.txt for the
    full convergence history.

    The solver reported the following error:

    --------------------------------------------------------
    An error occurred in line <3035> of file
    </home.../source/utilities.cc> in
    function
    void aspect::Utilities::throw_linear_solver_failure_exception(const
    std::string&, const std::string&, const
    std::vector<dealii::SolverControl>&, const std::exception&, MPI_Comm,
    const std::string&)
    The violated condition was:
    false
    Additional information:
    The iterative (bottom right) solver in
    BlockSchurGMGPreconditioner::vmult did not converge.

    The initial residual was: 2.038551e-02
    The final residual is: 8.778290e-07
    The required residual for convergence is: 2.038551e-08

    The solver reported the following error:

    --------------------------------------------------------
    An error occurred in line <1337> of file
    </home/.../dealii-9.5.2/deal.II-v9.5.2/include/deal.II/lac/solver_cg.h>

    in function
    void dealii::SolverCG<VectorType>::solve(const MatrixType&,
    VectorType&, const VectorType&, const PreconditionerType&) [with
    MatrixType = aspect::MatrixFreeStokesOperators::MassMatrixOperator<2,
    1, double>; PreconditionerType = dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double>,
    dealii::MGTransferMatrixFree<2, double> >; VectorType =
    dealii::LinearAlgebra::distributed::Vector<double>]
    The violated condition was:
    solver_state == SolverControl::success
    Additional information:
    Iterative method reported convergence failure in step 100. The
    residual in the last step was 8.77829e-07.

    This error message can indicate that you have simply not allowed a
    sufficiently large number of iterations for your iterative solver to
    converge. This often happens when you increase the size of your
    problem. In such cases, the last residual will likely still be very
    small, and you can make the error go away by increasing the allowed
    number of iterations when setting up the SolverControl object that
    determines the maximal number of iterations you allow.

    The other situation where this error may occur is when your matrix is
    not invertible (e.g., your matrix has a null-space), or if you try to
    apply the wrong solver to a matrix (e.g., using CG for a matrix that
    is not symmetric or not positive definite). In these cases, the
    residual in the last iteration is likely going to be large.

    Stacktrace:
    -----------
    #0  ./aspect-release: void
    dealii::SolverCG<dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>
    >::solve<aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    >(aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1, double>
    const&, dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>&,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host> const&, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    const&)
    #1  ./aspect-release: ) [0x1ca6a63]
    #2  ./aspect-release: void
    dealii::SolverFGMRES<dealii::LinearAlgebra::distributed::BlockVector<double>

    >::solve<aspect::MatrixFreeStokesOperators::StokesOperator<2, 2,
    double>,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,

    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > >(aspect::MatrixFreeStokesOperators::StokesOperator<2, 2, double>
    const&, dealii::LinearAlgebra::distributed::BlockVector<double>&,
    dealii::LinearAlgebra::distributed::BlockVector<double> const&,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,

    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > const&)
    #3  ./aspect-release: aspect::StokesMatrixFreeHandlerImplementation<2,
    2>::solve()
    #4  ./aspect-release: aspect::Simulator<2>::solve_stokes()
    #5  ./aspect-release:
    aspect::Simulator<2>::do_one_defect_correction_Stokes_step(aspect::DefectCorrectionResiduals&,

    bool)
    #6  ./aspect-release:
    aspect::Simulator<2>::solve_single_advection_and_iterated_newton_stokes(bool)
#7

    ./aspect-release: aspect::Simulator<2>::solve_timestep()
    #8  ./aspect-release: aspect::Simulator<2>::run()
    #9  ./aspect-release: void
    run_simulator<2>(std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&,
    std::__cxx11::basic_string<char, std::char_traits<char>,
    std::allocator<char> > const&, bool, bool, bool, bool)
    #10  ./aspect-release: main
    --------------------------------------------------------

    Stacktrace:
    -----------
    #0  ./aspect-release:
    aspect::Utilities::throw_linear_solver_failure_exception(std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&,
    std::__cxx11::basic_string<char, std::char_traits<char>,
    std::allocator<char> > const&, std::vector<dealii::SolverControl,
    std::allocator<dealii::SolverControl> > const&, std::exception const&,
    ompi_communicator_t*, std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&)
    #1  ./aspect-release: ) [0x1cb07da]
    #2  ./aspect-release: void
    dealii::SolverFGMRES<dealii::LinearAlgebra::distributed::BlockVector<double>
    >::solve<aspect::MatrixFreeStokesOperators::StokesOperator<2, 2,
    double>,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > >(aspect::MatrixFreeStokesOperators::StokesOperator<2, 2, double>
    const&, dealii::LinearAlgebra::distributed::BlockVector<double>&,
    dealii::LinearAlgebra::distributed::BlockVector<double> const&,
    aspect::internal::BlockSchurGMGPreconditioner<aspect::MatrixFreeStokesOperators::StokesOperator<2,
    2, double>, aspect::MatrixFreeStokesOperators::ABlockOperator<2, 2,
    double>, aspect::MatrixFreeStokesOperators::MassMatrixOperator<2, 1,
    double>, dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >,
    dealii::PreconditionMG<2,
    dealii::LinearAlgebra::distributed::Vector<double,
    dealii::MemorySpace::Host>, dealii::MGTransferMatrixFree<2, double> >
    > const&)
    #3  ./aspect-release: aspect::StokesMatrixFreeHandlerImplementation<2,
    2>::solve()
    #4  ./aspect-release: aspect::Simulator<2>::solve_stokes()
    #5  ./aspect-release:
    aspect::Simulator<2>::do_one_defect_correction_Stokes_step(aspect::DefectCorrectionResiduals&,
    bool)
    #6  ./aspect-release:
    aspect::Simulator<2>::solve_single_advection_and_iterated_newton_stokes(bool)
#7
    ./aspect-release: aspect::Simulator<2>::solve_timestep()
    #8  ./aspect-release: aspect::Simulator<2>::run()
    #9  ./aspect-release: void
    run_simulator<2>(std::__cxx11::basic_string<char,
    std::char_traits<char>, std::allocator<char> > const&,
    std::__cxx11::basic_string<char, std::char_traits<char>,
    std::allocator<char> > const&, bool, bool, bool, bool)
    #10  ./aspect-release: main
    --------------------------------------------------------

Stacktrace:
-----------
#0  ./aspect-release: aspect::Utilities::throw_linear_solver_failure_exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<dealii::SolverControl, std::allocator<dealii::SolverControl> > const&, std::exception const&, ompi_communicator_t*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#1  ./aspect-release: aspect::StokesMatrixFreeHandlerImplementation<2, 2>::solve()
#2  ./aspect-release: aspect::Simulator<2>::solve_stokes()
#3  ./aspect-release: aspect::Simulator<2>::do_one_defect_correction_Stokes_step(aspect::DefectCorrectionResiduals&, bool)
#4  ./aspect-release: aspect::Simulator<2>::solve_single_advection_and_iterated_newton_stokes(bool)
#5  ./aspect-release: aspect::Simulator<2>::solve_timestep()
#6  ./aspect-release: aspect::Simulator<2>::run()
#7  ./aspect-release: void run_simulator<2>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool, bool, bool)
#8  ./aspect-release: main
--------------------------------------------------------

Aborting!
----------------------------------------------------
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
  Proc: [[9510,1],0]
  Errorcode: 1

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
YiminJin commented 1 month ago

The continental extension experiment fails for me too.

I also tested the Newton solver and it fails for both AMG solver and GMG solver :-( It seems that there is something wrong in the Newton assemblers when elasticity is not taken into account.

YiminJin commented 1 month ago

Supplement to the previous comment: the Newton solver converges for both AMG and GMG methods when Use Newton residual scaling method is set to true. So maybe at the first time step the initial guess (0) is too far away from the yield surface, and we have to scale the Newton step to make the convergence stabler (the following time steps don't have the problem, because the initial guess current_linearization_point is not too far from the yield surface)?

On my machine, the Newton solver with AMG or GMG method is slower than the defect correction solver. So defect correction solver with AMG seems to be the best choice for this example.

anne-glerum commented 3 weeks ago

I've seen the same behaviour, in that Picard is much faster than Newton for this example.

naliboff commented 3 weeks ago

@MFraters - Thanks for posting this and yesterday a colleague and I found highly different behavior between AMG and GMG for a 3D continental extension problem with no elasticity.

@YiminJin @anne-glerum - Likewise, thanks for doing all of that testing.

Issue #4563 is still open and I wonder if we should just disable use of the GMG + Newton/defect correction + Visco Plastic material model for now? Are we confident from the existing test suite that all the other combination of solvers (AMG + Newton/defect correction, GMG + regular picard) are working?