OPM / opm-simulators

OPM Flow and experimental simulators, including components such as well models etc.
http://www.opm-project.org
GNU General Public License v3.0

Convergence monitors #5590

Closed jakobtorben closed 1 month ago

jakobtorben commented 2 months ago

Implement Convergence Monitoring

This PR introduces a new convergence monitoring feature to improve the robustness and efficiency of the simulator. It is based on the following publication:

Lie, K., Moyner, O., Klemetsdal, Ø., Skaflestad, B., Moncorgé, A., & Kippe, V. (2024). Enhancing Performance of Complex Reservoir Models via Convergence Monitors. ECMOR, 2024(1), 1-9. (https://doi.org/10.3997/2214-4609.202437057)

The convergence monitoring system tracks the convergence behaviour across iterations, applying penalties for non-convergence. If the total penalty count exceeds the specified cut-off limit, the simulator will cut the timestep.

This feature allows early-exiting for steps that are not converging, saving wasted iterations and assembly.
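The penalty mechanism described above can be illustrated with a minimal sketch. This is not the code in this PR: the class name, the "distance to convergence" input, and the rule of issuing one penalty whenever the distance fails to shrink by a given decay factor are all assumptions made for illustration. Only the idea of accumulating penalties and cutting the timestep once the total exceeds a cut-off comes from the description.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch; names and the penalty rule are illustrative
// assumptions, not the opm-simulators implementation.
class ConvergenceMonitorSketch
{
public:
    ConvergenceMonitorSketch(double cutOff, double decayFactor)
        : cutOff_(cutOff), decayFactor_(decayFactor)
    {}

    // Record the distance to convergence of the current Newton iteration
    // (assumed here to be something like max residual / tolerance).
    // Returns true if the accumulated penalty exceeds the cut-off,
    // i.e. the timestep should be cut.
    bool update(double distance)
    {
        // Assumed rule: one penalty if the distance did not decay enough
        // relative to the previous iteration.
        if (hasPrevious_ && distance > decayFactor_ * previousDistance_) {
            penalty_ += 1.0;
        }
        previousDistance_ = distance;
        hasPrevious_ = true;
        return penalty_ > cutOff_;
    }

    double penalty() const { return penalty_; }

private:
    double cutOff_;
    double decayFactor_;
    double penalty_ = 0.0;
    double previousDistance_ = 0.0;
    bool hasPrevious_ = false;
};

// Returns the index of the first iteration at which the monitor would cut
// the timestep, or distances.size() if it never triggers.
std::size_t firstCutIteration(const std::vector<double>& distances,
                              double cutOff,
                              double decayFactor)
{
    ConvergenceMonitorSketch monitor(cutOff, decayFactor);
    for (std::size_t it = 0; it < distances.size(); ++it) {
        if (monitor.update(distances[it])) {
            return it;
        }
    }
    return distances.size();
}
```

With a stagnating sequence of distances the sketch triggers after a few iterations, while a sequence that shrinks faster than the decay factor never accumulates a penalty, so the step runs to completion.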

This is the first version, which will be iterated on before it is considered for merging.

jakobtorben commented 2 months ago

This first version implements the following convergence monitors:

If the total number of penalty cards is above a given cut-off limit (default 30), the timestep is cut.

I tested this on Norne, where we observe a slight decrease in nonlinear and linear iterations. Other cases are probably better suited, since Norne does not fail a lot to begin with. (Ignore the zero wasted; I am not exiting the timestep cut gracefully yet.)

image

jakobtorben commented 2 months ago

In the second version, the implementation should match the paper, provided that the tolerance for adding a card for a too-large well residual is the same as the OPM default at which the well is considered unconverged. The reporting has also been fixed, so that wasted iterations coming from convergence monitoring are counted. (This fix also involved making sure that failed iterations from NaN and too-large-residual errors are counted.)

When tested on Norne, the results are currently similar to those without convergence monitoring.

image

jakobtorben commented 2 months ago

Fixed some bugs and added the penalty counts to the INFOITER file for analysis.

The INFOITER file was used to analyse the convergence behaviour and cut-off values, using a tool similar to the one in the paper.

Norne_step_275

Using this tool, we can also estimate the number of iterations saved by convergence monitoring, which can be used to find the optimal parameters for a specific case:

NORNE_fraction_of_iterations_remaining_after_early_exit

And the number of incorrectly aborted timesteps:

NORNE_number_of_incorrectly_aborted_timesteps
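The two quantities plotted above can be estimated offline from per-iteration penalty counts. The following is a rough sketch of one way to do that, not the analysis tool used here. The `StepRecord` layout, the function name, and the choice to count a converged step as incorrectly aborted whenever its cumulative penalty exceeds the cut-off at any iteration are all assumptions. It also ignores the cost of re-running an incorrectly aborted step with a smaller timestep.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one timestep attempt, e.g. parsed from a log.
struct StepRecord
{
    std::vector<double> cumulativePenalty; // one entry per Newton iteration
    bool converged;                        // did the attempt converge originally?
};

struct EarlyExitEstimate
{
    double fractionRemaining;       // iterations with early exit / iterations without
    std::size_t incorrectlyAborted; // converged attempts that would have been cut
};

// Replay the recorded attempts with a given cut-off and estimate how many
// Newton iterations remain if each attempt is abandoned as soon as its
// cumulative penalty exceeds the cut-off.
EarlyExitEstimate estimateEarlyExit(const std::vector<StepRecord>& steps,
                                    double cutOff)
{
    std::size_t total = 0;
    std::size_t remaining = 0;
    std::size_t incorrectlyAborted = 0;

    for (const auto& step : steps) {
        const std::size_t n = step.cumulativePenalty.size();
        total += n;
        std::size_t used = n; // default: the attempt runs in full
        for (std::size_t it = 0; it < n; ++it) {
            if (step.cumulativePenalty[it] > cutOff) {
                used = it + 1; // the triggering iteration is still paid for
                if (step.converged) {
                    ++incorrectlyAborted;
                }
                break;
            }
        }
        remaining += used;
    }

    const double fraction = (total == 0)
        ? 1.0
        : static_cast<double>(remaining) / static_cast<double>(total);
    return {fraction, incorrectlyAborted};
}
```

Evaluating this function over a range of cut-off values (and, if the penalties are recomputed, decay factors) gives curves of the kind shown in the figures, from which a trade-off between saved iterations and incorrectly aborted steps can be chosen.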

The optimal parameters were found at cut-off 14 and distance decay factor 0.60, which should reduce the estimated number of Newton iterations to a factor of 0.989 of the original. The small reduction is likely due to Norne not failing a lot to begin with for OPM.

Using these optimal parameters, we can run OPM with convergence monitoring to see if we get any improvements. Here, the run uses relaxed CNV and MB tolerances equal to the original tolerances, to match the format used in the analysis tool.

image

From the results, we see a small reduction in Newton iterations and runtime, as expected from our analysis. Better results are likely achieved on cases with more failed timesteps.

jakobtorben commented 1 month ago

Depends on https://github.com/OPM/opm-common/pull/4244.

jakobtorben commented 1 month ago

Note that the well convergence metric is not used at the moment. But the plan is to also include well convergence metrics in the convergence monitoring. However, this requires two things:

jakobtorben commented 1 month ago

jenkins build this opm-common=4244 please

jakobtorben commented 1 month ago

jenkins build this please

atgeirr commented 1 month ago

All good, all green! Merging.