Belos: Scaling with small initial residual norm

trilinos / Trilinos

Primary repository for the Trilinos Project

https://trilinos.org/


Belos: Scaling with small initial residual norm #7669

Closed cgcgcg closed 2 years ago

cgcgcg commented 4 years ago

Question

@trilinos/belos

Situation: We're using a Krylov solver to solve a time-dependent problem. The tolerance gets scaled by the initial residual norm in each time step. Since the time-dependent problem has a ramp-up phase, for some time steps the initial residual is really small, but non-zero. Effectively, this means that the tolerance after scaling can end up far below machine epsilon, and bad things happen.

I'm wondering if there should be a check to see whether tolerance*scaling is below machine epsilon, and an option to adjust the tolerance in that case.
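
A minimal sketch of such a check, in plain C++ with no Belos dependency. The helper `safeguardedTarget` and the struct `SafeguardedTarget` are hypothetical names, not Belos API. The floor `eps * refNorm` is also an assumption: `refNorm` stands for whatever scale the application considers meaningful (for example the right-hand-side norm), since the thread does not say what the attainable residual should be measured against.

```cpp
#include <limits>

// Hypothetical helper, not part of Belos.
// Result of the check: the absolute residual target to use, and whether
// the requested target had to be raised.
struct SafeguardedTarget {
  double absTol;
  bool clamped;
};

// relTol:         user-requested relative tolerance
// initialResNorm: ||r_0|| of the current solve (the scaling)
// refNorm:        assumed reference scale for the floor, e.g. ||b||
inline SafeguardedTarget safeguardedTarget(double relTol,
                                           double initialResNorm,
                                           double refNorm) {
  const double eps = std::numeric_limits<double>::epsilon();
  const double requested = relTol * initialResNorm;  // tolerance * scaling
  const double attainable = eps * refNorm;           // assumed floor
  if (requested < attainable) {
    return {attainable, true};   // requested target is below the floor
  }
  return {requested, false};     // requested target is used unchanged
}
```

With `relTol = 1e-10`, `initialResNorm = 1e-12` and `refNorm = 1.0`, the requested target of about 1e-22 is below the floor, so the helper returns the floor and sets `clamped`. The `clamped` flag lets the caller decide whether to adjust the tolerance or only warn.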

vbrunini commented 4 years ago

We're also interested in this for Sierra.

csiefer2 commented 4 years ago

Can we do this with the existing composite convergence criterion in Belos?

cgcgcg commented 4 years ago

I'd think so. So instead of maxIterTest or convTest(tolerance), we'd use maxIterTest or convTest(tolerance) or convTest(machineEps)?
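
Roughly, the three-way OR could look like the sketch below. It is plain C++ and does not use Belos: `IterState`, `StopReason` and `checkStop` are hypothetical stand-ins, and scaling the machine-epsilon test by the right-hand-side norm is my assumption. In Belos itself this would presumably be assembled from `Belos::StatusTestCombo` with `Belos::StatusTestMaxIters` and two `Belos::StatusTestGenResNorm` instances, but I have not written that out here.

```cpp
#include <limits>

// Plain stand-in for the solver state; not a Belos type.
struct IterState {
  int iteration;          // current iteration count
  double resNorm;         // current residual norm ||r_k||
  double initialResNorm;  // initial residual norm ||r_0||
  double rhsNorm;         // ||b||, assumed scale for the epsilon test
};

enum class StopReason { None, Converged, HitPrecisionFloor, MaxIters };

// Hypothetical composite test:
//   convTest(tolerance) OR convTest(machineEps) OR maxIterTest.
// The convergence tests are checked first so that a solve which
// converges on its last allowed iteration is still reported as converged.
inline StopReason checkStop(const IterState& s, int maxIters, double relTol) {
  const double eps = std::numeric_limits<double>::epsilon();
  if (s.resNorm <= relTol * s.initialResNorm) {
    return StopReason::Converged;          // user tolerance, scaled by ||r_0||
  }
  if (s.resNorm <= eps * s.rhsNorm) {
    return StopReason::HitPrecisionFloor;  // machine-epsilon safeguard
  }
  if (s.iteration >= maxIters) {
    return StopReason::MaxIters;
  }
  return StopReason::None;                 // keep iterating
}
```

Returning a distinct `HitPrecisionFloor` reason means the application can tell the safeguard apart from ordinary convergence.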

cgcgcg commented 4 years ago

I'm just worried about users that like to set convergence criteria of 1e-100, in the hope that this somehow gives them a better solution. This would definitely change their runs.

hkthorn commented 4 years ago

I would not do this by default. This would have to be an option for applications to enable.

dridzal commented 4 years ago

@cgcgcg, what's wrong with having an absolute convergence tolerance of 1e-100 if the effective relative tolerance is 1e-10? Are you referring to relative tolerances of 1e-100 (which, I agree, are not wise, unless we are working with arbitrary-precision arithmetic)? In any case, the linear solvers should be scale invariant. Otherwise, our nonlinear solver and optimization stack is in big trouble. On a separate note, we have also observed inconsistent iteration numbers during the ramp-up phase for a transient code that's a derivative of miniEM, but we can trace this behavior back to using a mixture of relative and absolute convergence criteria in subsolvers.

cgcgcg commented 4 years ago

@dridzal Yes, relative tolerance of 1e-100.

dridzal commented 4 years ago

Agreed, there should probably be some default safeguard, tied to the epsilon given by working precision (to support multiprecision applications).
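
To illustrate what "tied to the epsilon given by working precision" could mean, here is a small sketch templated on the real type. `precisionFloor` and its `safetyFactor` parameter are hypothetical, and the sketch uses `std::numeric_limits` so that it compiles on its own; inside Trilinos the natural source for the epsilon would presumably be `Teuchos::ScalarTraits<T>::eps()`.

```cpp
#include <limits>

// Hypothetical helper, not part of Belos or Teuchos.
// Returns the smallest residual target considered meaningful for the
// working precision Real, relative to a reference scale refNorm.
// safetyFactor is an assumed knob for applications that want a looser floor.
template <typename Real>
Real precisionFloor(Real refNorm, Real safetyFactor = Real(1)) {
  return safetyFactor * std::numeric_limits<Real>::epsilon() * refNorm;
}
```

Because the floor follows the scalar type, `precisionFloor<float>(1.0f)` is larger than `precisionFloor<double>(1.0)`, and a multiprecision type with its own `std::numeric_limits` specialization would get a correspondingly smaller floor.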


github-actions[bot] commented 2 years ago

This issue was closed due to inactivity for 395 days.