fadhli-atarita opened this issue 2 years ago
Your ASPECT version seems to be quite old. Can you retry with the current development version, please? Is this the output without any changes to the .prm file?
There was a segmentation fault issue on the clusters with v2.3.0, so the admin suggested that I use v2.2.0 for now. This output was produced with changes to the 3 parameters that I mentioned earlier. Without any changes it still shows similar issues, just at a different timestep.
@zjiaqi2018 Can you try cookbooks/initial-condition-S20RTS/S20RTS.prm with the current ASPECT master and run it for a while to check if it crashes for you as well?
Seems to be fine:
*** Timestep 121: t=2.36743e+07 years, dt=155599 years
Solving temperature system... 23 iterations.
Rebuilding Stokes preconditioner...
Solving Stokes system... 92+0 iterations.
Postprocessing:
RMS, max velocity: 0.261 m/year, 1.15 m/year
Writing heat flux map: output-S20RTS/heat_flux.00121
Heat fluxes through boundary parts: -1.219e+14 W, 7.079e+14 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 8.67e-07 Pa, 9.576e+10 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.00121
Writing graphical output: output-S20RTS/solution/solution-00121
Update: it does fail if run for long enough.
*** Timestep 1217: t=1.19037e+08 years, dt=52.2907 years
Solving temperature system... retrying linear solve with different preconditioner...
---------------------------------------------------------
TimerOutput objects finalize timed values printed to the
screen by communicating over MPI in their destructors.
Since an exception is currently uncaught, this
synchronization (and subsequent output) will be skipped
to avoid a possible deadlock.
---------------------------------------------------------
----------------------------------------------------
Exception 'ExcMessage (exception_message.str())' on rank 0 on processing:
--------------------------------------------------------
An error occurred in line <2778> of file </home/jiaqi2/aspect/aspect/source/utilities.cc> in function
void aspect::Utilities::linear_solver_failed(const string&, const string&, const std::vector<dealii::SolverControl>&, const std::exception&, const MPI_Comm&, const string&)
The violated condition was:
false
Additional information:
The iterative advection solver in Simulator::solve_advection did not
converge.
The initial residual was: 3.385886e+30
The final residual is: 3.385886e+30
The required residual for convergence is: 3.153455e+20
See output-S20RTS/solver_history.txt for the full convergence history.
The solver reported the following error:
--------------------------------------------------------
An error occurred in line <1075> of file
</home/jiaqi2/dealii/installed/include/deal.II/lac/solver_gmres.h> in
function
void dealii::SolverGMRES<VectorType>::solve(const MatrixType&,
VectorType&, const VectorType&, const PreconditionerType&) [with
MatrixType = dealii::TrilinosWrappers::SparseMatrix;
PreconditionerType = dealii::TrilinosWrappers::PreconditionILU;
VectorType = dealii::TrilinosWrappers::MPI::Vector]
The violated condition was:
iteration_state == SolverControl::success
Additional information:
Iterative method reported convergence failure in step 1000. The
residual in the last step was 3.38589e+30.
This error message can indicate that you have simply not allowed a
sufficiently large number of iterations for your iterative solver to
converge. This often happens when you increase the size of your
problem. In such cases, the last residual will likely still be very
small, and you can make the error go away by increasing the allowed
number of iterations when setting up the SolverControl object that
determines the maximal number of iterations you allow.
The other situation where this error may occur is when your matrix is
not invertible (e.g., your matrix has a null-space), or if you try to
apply the wrong solver to a matrix (e.g., using CG for a matrix that
is not symmetric or not positive definite). In these cases, the
residual in the last iteration is likely going to be large.
Stacktrace:
-----------
#0 ./../../build/aspect: void
dealii::SolverGMRES<dealii::TrilinosWrappers::MPI::Vector>::solve<dealii::TrilinosWrappers::SparseMatrix,
dealii::TrilinosWrappers::PreconditionILU>(dealii::TrilinosWrappers::SparseMatrix
const&, dealii::TrilinosWrappers::MPI::Vector&,
dealii::TrilinosWrappers::MPI::Vector const&,
dealii::TrilinosWrappers::PreconditionILU const&)
#1 ./../../build/aspect:
aspect::Simulator<3>::solve_advection(aspect::Simulator<3>::AdvectionField
const&)
#2 ./../../build/aspect:
aspect::Simulator<3>::assemble_and_solve_temperature(bool, double*)
#3 ./../../build/aspect:
aspect::Simulator<3>::solve_single_advection_single_stokes()
#4 ./../../build/aspect: aspect::Simulator<3>::solve_timestep()
#5 ./../../build/aspect: aspect::Simulator<3>::run()
#6 ./../../build/aspect: void
run_simulator<3>(std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, bool, bool, bool)
#7 ./../../build/aspect: main
--------------------------------------------------------
Stacktrace:
-----------
#0 ./../../build/aspect: aspect::Utilities::linear_solver_failed(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<dealii::SolverControl, std::allocator<dealii::SolverControl> > const&, std::exception const&, int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#1 ./../../build/aspect: aspect::Simulator<3>::solve_advection(aspect::Simulator<3>::AdvectionField const&)
#2 ./../../build/aspect: aspect::Simulator<3>::assemble_and_solve_temperature(bool, double*)
#3 ./../../build/aspect: aspect::Simulator<3>::solve_single_advection_single_stokes()
#4 ./../../build/aspect: aspect::Simulator<3>::solve_timestep()
#5 ./../../build/aspect: aspect::Simulator<3>::run()
#6 ./../../build/aspect: void run_simulator<3>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool, bool)
#7 ./../../build/aspect: main
--------------------------------------------------------
Aborting!
----------------------------------------------------
Can you share the log from the last time step? We might need to look at the last graphical output as well.
*** Timestep 1216: t=1.19037e+08 years, dt=282.348 years
Solving temperature system... 48 iterations.
Solving Stokes system... 53+0 iterations.
Postprocessing:
RMS, max velocity: 36.3 m/year, 2.71e+03 m/year
Writing heat flux map: output-S20RTS/heat_flux.01216
Heat fluxes through boundary parts: -1.342e+22 W, 1.859e+17 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 2.138e-07 Pa, 1.029e+11 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01216
Writing graphical output: output-S20RTS/solution/solution-01216
Here are a few more steps before the crash; note the change in iteration counts for both the temperature and Stokes solves:
*** Timestep 1211: t=1.19007e+08 years, dt=31307.7 years
Solving temperature system... 25 iterations.
Rebuilding Stokes preconditioner...
Solving Stokes system... 200+15 iterations.
Postprocessing:
RMS, max velocity: 0.763 m/year, 9.01 m/year
Writing heat flux map: output-S20RTS/heat_flux.01211
Heat fluxes through boundary parts: -2.601e+17 W, 2.448e+14 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 1.986e-06 Pa, 1.042e+11 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01211
Writing graphical output: output-S20RTS/solution/solution-01211
*** Timestep 1212: t=1.19023e+08 years, dt=16513.3 years
Solving temperature system... 25 iterations.
Solving Stokes system... 200+15 iterations.
Postprocessing:
RMS, max velocity: 0.729 m/year, 13.7 m/year
Writing heat flux map: output-S20RTS/heat_flux.01212
Heat fluxes through boundary parts: -2.784e+17 W, -1.715e+14 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: -1.501e-06 Pa, 1.042e+11 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01212
Writing graphical output: output-S20RTS/solution/solution-01212
*** Timestep 1213: t=1.19032e+08 years, dt=8431.86 years
Solving temperature system... 24 iterations.
Solving Stokes system... 200+14 iterations.
Postprocessing:
RMS, max velocity: 0.759 m/year, 29.9 m/year
Writing heat flux map: output-S20RTS/heat_flux.01213
Heat fluxes through boundary parts: -2.961e+17 W, -3.893e+13 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 3.231e-06 Pa, 1.042e+11 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01213
Writing graphical output: output-S20RTS/solution/solution-01213
*** Timestep 1214: t=1.19036e+08 years, dt=4260.65 years
Solving temperature system... 27 iterations.
Solving Stokes system... 150+0 iterations.
Postprocessing:
RMS, max velocity: 4.92 m/year, 441 m/year
Writing heat flux map: output-S20RTS/heat_flux.01214
Heat fluxes through boundary parts: 5.12e+19 W, -1.92e+13 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 3.711e-07 Pa, 1.038e+11 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01214
Writing graphical output: output-S20RTS/solution/solution-01214
*** Timestep 1215: t=1.19036e+08 years, dt=355.013 years
Solving temperature system... 33 iterations.
Solving Stokes system... 146+0 iterations.
Postprocessing:
RMS, max velocity: 7.29 m/year, 462 m/year
Writing heat flux map: output-S20RTS/heat_flux.01215
Heat fluxes through boundary parts: -8.097e+18 W, 3.922e+16 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 3.694e-06 Pa, 1.038e+11 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01215
Writing graphical output: output-S20RTS/solution/solution-01215
Thank you. Yes, it looks like the solution is blowing up.
@zjiaqi2018 Can you try to add "Time steps between mesh refinement = 0" in the mesh refinement section? It seems that the mesh gets adapted/coarsened over time (is that true?). If that helps, please open a PR to add it to the prm.
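For reference, the suggested change would look like this in the .prm file (the section name follows ASPECT's usual parameter file layout; check the cookbook's .prm for where the `Mesh refinement` subsection already appears):

```
subsection Mesh refinement
  # A value of 0 disables adaptive refinement/coarsening
  # after the initial refinement steps.
  set Time steps between mesh refinement = 0
end
```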
It seems that the mesh gets adapted/coarsened over time (is that true?).
True. A few steps before diverging, the number of DoFs is only
Number of degrees of freedom: 80,741 (58,485+2,761+19,495)
compared to the initial dofs:
Number of degrees of freedom: 215,962 (156,774+6,930+52,258)
It still diverges, but at a later time:
*** Timestep 1585: t=1.97014e+08 years, dt=126.669 years
Solving temperature system... 96 iterations.
Solving Stokes system... 28+0 iterations.
Postprocessing:
RMS, max velocity: 355 m/year, 1.47e+04 m/year
Writing heat flux map: output-S20RTS/heat_flux.01585
Heat fluxes through boundary parts: -3.484e+17 W, 6.25e+21 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: -7.894e-07 Pa, 9.271e+10 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.01585
*** Timestep 1586: t=1.97014e+08 years, dt=7.30192 years
Solving temperature system... retrying linear solve with different preconditioner...
@jdannberg Should we try one more refinement next?
One more refinement level seems to resolve the issue. Now dt is much larger, which means the solution is not blowing up, and the iteration counts look good. Is t=2e+08 years long enough?
*** Timestep 887: t=2e+08 years, dt=172157 years
Solving temperature system... 18 iterations.
Solving Stokes system... 13+0 iterations.
Postprocessing:
RMS, max velocity: 0.0672 m/year, 0.34 m/year
Writing heat flux map: output-S20RTS/heat_flux.00887
Heat fluxes through boundary parts: -3.919e+13 W, 6.326e+13 W
Density at top/bottom of domain: 3359 kg/m^3, 3260 kg/m^3
Pressure at top/bottom of domain: 1.068e-09 Pa, 9.614e+10 Pa
Computing dynamic topography
Writing geoid anomaly: output-S20RTS/geoid_anomaly.00887
Writing graphical output: output-S20RTS/solution/solution-00881
Increasing the resolution seems like a good fix for @fadhli-atarita's original issue, and it's great to see that it works! Since the cookbook itself is not intended to be time-dependent, I would only change the resolution there if the cookbook can still be run on a laptop in a reasonable time at the higher resolution.
We would be at 1.6 million DoFs, right? That might be too much for this cookbook.
I would suggest we only make the former change.
We would be at 1.6 million DoFs, right?
Right. I will open a PR.
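As a rough sanity check on the "1.6 million DoFs" figure: one global refinement in 3D splits every cell into 8 children, so DoF counts grow by roughly that factor per level. This is only a back-of-the-envelope estimate (actual counts depend on constraints, element degrees, and boundary cells), but it lines up with the number quoted above:

```python
def estimate_refined_dofs(current_dofs: int, extra_levels: int, dim: int = 3) -> int:
    """Rough estimate: each global refinement splits a cell into 2**dim
    children, so DoF counts grow by about that factor per level."""
    return current_dofs * (2 ** dim) ** extra_levels

# Initial DoF count from the log above: 215,962
print(estimate_refined_dofs(215_962, 1))  # -> 1727696, i.e. ~1.7 million,
# consistent with the ~1.6 million DoFs mentioned in the discussion
```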
Thank you all. I'll try this and let you know if I still encounter similar problems.
Also, just to be sure: there is no version issue for this particular problem, right? v2.4.0 is currently being installed on the clusters that I use and we don't know how that will go, so I still have to use v2.2.0 for now.
Hello,
I've been trying to run this cookbook for a while, but I always encounter convergence issues. I have been using the exact same parameters as in S40RTS.prm, and I have tried increasing the refinement and the maximum number of Stokes solver steps, and decreasing the linear solver tolerance. However, these changes also produced convergence issues.
Here are the output, error, and solver history files for your reference
solver_history.txt error_5214159.log output_5214159.log
Thank you,
Fadhli
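For context, the kinds of changes described above are typically made in the .prm along the following lines. The parameter names are the ones ASPECT uses for its Stokes solver settings, but the values here are placeholders, not the ones actually tried in this report:

```
subsection Mesh refinement
  set Initial global refinement = 3          # placeholder value
end

subsection Solver parameters
  subsection Stokes solver parameters
    set Linear solver tolerance = 1e-7       # placeholder value
    set Maximum number of expensive Stokes solver steps = 1000  # placeholder
  end
end
```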