Open hugary1995 opened 3 months ago
@pbehne Here's the issue we discussed on slack.
@lindsayad, I'll work on this.
I have more information on this now.
The above MOOSE input file describes a transient diffusion problem with no source term and homogeneous Dirichlet boundary conditions. This problem has a constant solution: zero. MOOSE uses zero - the solution - as the initial guess. The particular petsc nonlinear solver used for this problem is SNESSolve_NEWTONLS
, defined in petsc/src/snes/impls/ls/ls.c. The linear iteration loop of this solver starts on line 189 of the above link. However, note that prior to this, on lines 180-183, a convergence test is performed on the initial guess and if already converged, the function returns early instead of entering the linear iteration loop. This loop has to be entered for the ksp_converged_reason
option to be used. Otherwise, petsc complains that it is unused in PetscFinalize
like we see on the console when running the problem. If we add a nonzero Dirichlet BC on one of the boundaries, the unused option warning goes away since the linear iteration loop in SNESSolve_NEWTONLS
is entered.
A quick fix would be to change the default value of the print_linear_converged_reason
parameter, defined in CommonOutputAction.C, to false. I have a draft PR here that shows that doing this does not break anything in the test suite.
When the above parameter is false, Moose::PetscSupport::disableLinearConvergedReason
is called to remove the -ksp_converged_reason
from the petsc options. If there is some reason we don't want to change the default value of print_linear_converged_reason
, it shouldn't be too hard to accumulate the sum of the number of linear iterations over all nonlinear iterations. If this sum is 0, we can call Moose::PetscSupport::disableLinearConvergedReason
directly to remove the option from petsc.
HOWEVER, I am of the opinion that this is a bug in petsc and we shouldn't have to change MOOSE or libMesh to address it. Whether this warning is emitted depends on the completely unrelated boundary condition. Also, it is possible that ksp_converged_reason
is not the only petsc option affected by the early return, thus the above 2 solutions would not be thorough because they only address the ksp_converged_reason
option.
I'd recommend creating a petsc ticket to report this bug with a minimum working example petsc-only example. I would rather defer this to a more experienced petsc user on the MOOSE team, but can work on this if needed. Just let me know.
@lindsayad @roystgnr @oanaoana
This all makes sense to me. It then sounded to me like there is also no easy fix on the petsc side, is there?
As you said
it is possible that ksp_converged_reason is not the only petsc option affected by the early return
and indeed I think what @tophmatthews saw was a different one.
But I agree, this seems more like a petsc issue.
While I don't know petsc's source in much detail, conceptually, it should be as simple as adding and tracking an early convergence flag and for every ksp-related option, setting the corresponding entry in the PetscOptions' used
vector to true so they don't trigger a warning. My hope is that petsc developers agree that this is a bug, fix it on their end, and MOOSE sees the benefits when we start using the next version of petsc.
Agreed. I'll let you find find a more experienced petsc user in the moose team :)
I'm not convinced it's a PETSc issue. PETSc doesn't complain about -pc_type lu
in @hugary1995's example
Additionally
lindad@pop-os:~/projects/moose-no-cuda/test(TimedSubdomainModifier2)$ ./moose_test-devel -i options-left.i -ksp_monitor -ksp_rtol 1e-5
:
:
:
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There is one unused database option. It is:
Option left: name:-ksp_converged_reason value: ::failed source: code
I'm more convinced it's a PETSc issue. Interesting that -ksp_converged_reason
seems to be somewhat unique.
./moose_test-devel -i options-left.i Outputs/print_linear_converged_reason=false -ksp_converged_reason -ksp_monitor_singular_value -ksp_monitor_true_residual -ksp_monitor -ksp_rtol 1e-5 -ksp_atol 1e-5 -ksp_view -ksp_view_pmat -ksp_monitor_solution
only leads to warnings about -ksp_converged_reason
Indeed, most options are handled straightaway in KSPSetFromOptions
which always gets called from SNESetFromOptions
pre-solve. But -ksp_converged_reason
is not touched until KSPConvergedReasonViewFromOptions
which is only called at the end of a KSPSolve
. I still hesitate to call this a bug...
I'll ask on the PETSc discord about the best way to eliminate the warning
Yeah, "bug" is probably not the best label. Maybe "minor annoyance that does not affect solution accuracy and confuses users" is more appropriate.
So this will stop being an issue when MOOSE updates the version of PETSc is uses. How do we close these kinds of issues? Should it be closed now, or when we update the petsc submodule? If the later, how do we remember to close this?
We should wait to close it until the submodule has been updated such that users are no longer experiencing the issue. How to remember is a good question :smile:
Makes sense. I'll put a recurring task on my todo list to check back on this so we don't forget to close the issue.
It looks like @cticenhour usually performs the petsc submodule updates. Casey, any idea when we are due for another petsc update?
@milljm, once the petsc submodule is updated, how long does it usually take to propagate into the conda packages?
First thing is to see MR7640 actually merged into release...
We have just updated the petsc version used in MOOSE to petsc 3.21.4, but this version does not seem to contain the fix. I'll keep monitoring our future petsc updates.
@pbehne In that case, do you mind making an announcement on the moose discussion page explaining this warning message?
@lindsayad, do you think this warrants an announcement in discussions? If so, what category? None of them fit very well. I'm inclined to say that the open issue is already a sufficient means to communicate the warning message while we wait for the fix to be incorporated. This issue is one of the top results when searching for "-ksp_converged_reason" in the MOOSE repository.
Yea @hugary1995 I don't really see a reason to make an issue a discussion. Users can monitor issues just as well as discussions
k, it's up to you apparently
You can make one if you wish 😄
Bug Description
Running some input files give me warnings like below
Steps to Reproduce
Run the following MRE with moose_test-opt
Impact
This is mostly just annoyance, but could potentially mislead users to believe something went wrong.
[Optional] Diagnostics
No response