Cantera / cantera

Chemical kinetics, thermodynamics, and transport tool suite
https://cantera.org
Other
624 stars 350 forks source link

more detailed debugging outputs, solver documentation, and high loglevel print improvements #1759

Closed wandadars closed 3 months ago

wandadars commented 3 months ago

Current work for a potential improvement for the damped newton step function as well as an improvement in the debugging output that the 1D solver generates when loglevel is set to 8. For the debugging outputs, a more comprehensive history of the solution steps that result in a failure is kept in the debugging file, which can then be examined to find where a solution started to diverge

Checklist

wandadars commented 3 months ago

I realized that I hadn't been doing the proper thing with regards to the line search, and so for now I just reverted back to the original method. I added some documentation about the method as it was clearly explained in the Kee book.

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 37.86408% with 64 lines in your changes missing coverage. Please review.

Project coverage is 73.19%. Comparing base (06261ed) to head (216ffd8). Report is 7 commits behind head on main.

Files Patch % Lines
src/oneD/MultiNewton.cpp 36.95% 25 Missing and 4 partials :warning:
src/oneD/OneDim.cpp 22.58% 16 Missing and 8 partials :warning:
src/oneD/Sim1D.cpp 57.69% 7 Missing and 4 partials :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1759 +/- ## ========================================== - Coverage 73.23% 73.19% -0.04% ========================================== Files 381 381 Lines 54343 54371 +28 Branches 9246 9248 +2 ========================================== + Hits 39796 39799 +3 - Misses 11573 11593 +20 - Partials 2974 2979 +5 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

wandadars commented 3 months ago

Based on the description of the Newton solver in the Kee book, there was a thing that didn't quite make sense to me in the damped newton solver where fbound was multiplied by the damping factor. I think that fbound just serves as the starting point for the correction vector length, but in the implementation it was always multiplying the damping factor by the fbound, which I think might have artificially reduced the damping factor lower than necessary.

wandadars commented 3 months ago

I may have gone overboard on the solver print statements. The logging outputs when loglevel was set to a high value were...quite perplexing in their current form.

wandadars commented 3 months ago

For anyone interested, I created a simple 2 equation, 1 point control model problem that I was using just to examine the behavior of the algorithm on a small-scale problem.

import numpy as np
import matplotlib.pyplot as plt

control_point_index = 3

# The equation to solve is Laplace's equation in 1D: d^2T/dx^2 = 0
# The dummy equation for A is dA/dx = 0 at all points, except the control point,
# where A = T - UserVal
def residuals(X, UserVal, num_nodes):
    T = X[:num_nodes]
    A = X[num_nodes:]
    R = np.zeros(len(X))

    # Main equation residuals (discretized Laplace's equation)
    for i in range(1, len(T) - 1):
        R[i] = T[i - 1] - 2 * T[i] + T[i + 1]

    # A equation residuals
    for i in range(len(A)):
        if i == control_point_index:  # Control point at the specified node
            R[num_nodes + i] = T[i] - UserVal
        elif i < control_point_index:  # upwind towards the control point
            R[num_nodes + i] = A[i] - A[i + 1]
        else:  # downwind away from the control point
            R[num_nodes + i] = A[i] - A[i - 1]

    # Boundary condition residuals
    R[0] = T[0] - A[0]  # Left boundary at Node 0
    R[num_nodes-1] = T[-1] - 15  # Right boundary at the last node

    # Printing the residuals in a detailed way
    print("Residuals:")
    for i, r in enumerate(R):
        print(f"R[{i}] = {r:.4e}")

    return R

# Here we need to numerically evaluate the Jacobian matrix
# The Jacobian matrix is the matrix of partial derivatives of the residuals with respect to X (T and A combined).
# We approximate it numerically by finite differences.
def jacobian(X, UserVal, num_nodes):
    n = len(X)
    J = np.zeros((n, n))

    delta = 1e-6
    for i in range(n):
        X_plus = X.copy()
        X_minus = X.copy()

        X_plus[i] += delta
        X_minus[i] -= delta

        R_plus = residuals(X_plus, UserVal, num_nodes)
        R_minus = residuals(X_minus, UserVal, num_nodes)

        J[:, i] = (R_plus - R_minus) / (2.0 * delta)

    # Printing the Jacobian matrix in a detailed way
    print("Jacobian Matrix:")
    for row in J:
        print(" ".join(f"{val: .4e}" for val in row))

    return J

def newtons_method_multi(residuals, jacobian, X0, UserVal, num_nodes, tolerance=1e-6, max_iterations=25):
    X_n = X0.copy()

    epsilon = 1e-8
    for i in range(max_iterations):
        F_n = residuals(X_n, UserVal, num_nodes)
        J_n = jacobian(X_n, UserVal, num_nodes)

        # Regularize the Jacobian if necessary
        try:
            delta_x = np.linalg.solve(J_n + epsilon * np.eye(J_n.shape[0]), -F_n)
        except np.linalg.LinAlgError:
            print("Jacobian matrix is singular, regularization failed to resolve.")
            return X_n

        # Print out solution vector to screen
        print(f"\nSolution Vector before iteration {i+1}:")
        for j, x in enumerate(X_n):
            print(f"X[{j}] = {x:.6f}")

        # Plot the solution vector for Temperature and A
        plt.plot(X_n[:num_nodes], label='Temperature')
        plt.plot(X_n[num_nodes:], label='A')
        plt.legend()
        plt.show()

        # Update X
        X_n += delta_x

        # Check for convergence
        if np.linalg.norm(delta_x) < tolerance:
            print(f'Converged to solution after {i+1} iterations.')
            return X_n

    print('Did not converge.')
    return X_n

num_nodes = 6
UserVal = 220.0 # Desired value of T at the control point

# Initialize T to be between 5 and 15, linearly increasing from left to right
T0 = np.linspace(200, 400, num_nodes)
A0 = np.ones(num_nodes)
X0 = np.concatenate([T0, A0])

# Solve using Newton's method in matrix form
solution_X = newtons_method_multi(residuals, jacobian, X0, UserVal, num_nodes)

# Separate the solution vector into T and A
solution_T = solution_X[:num_nodes]
solution_A = solution_X[num_nodes:]

# Print the solution vectors in a clean and interpretable way
print("Solution T:")
for i, t in enumerate(solution_T):
    print(f"T[{i}] = {t:.6f}")

print("\nSolution A:")
for i, a in enumerate(solution_A):
    print(f"A[{i}] = {a:.6f}")
wandadars commented 3 months ago

One strange thing that I notice is that if we add the temporal term of the continuity equation to the continuity residual, the standard non-two-point control solve() fails. This seems odd to me. The time component from the Kee book is just d(rho)/dt

            // For "axisymmetric-flow", the continuity equation  propagates the
            // mass flow rate information to the left (j+1 -> j) from the value
            // specified at the right boundary. The lambda information propagates
            // in the opposite direction.
            //rsd[index(c_offset_U,j)] = -(rho_u(x,j+1) - rho_u(x,j))/m_dz[j]
            //                           -(density(j+1)*V(x,j+1) + density(j)*V(x,j));
            rsd[index(c_offset_U,j)] = -(rho_u(x,j+1) - rho_u(x,j))/m_dz[j]
                                       -(density(j+1)*V(x,j+1) + density(j)*V(x,j));
            rsd[index(c_offset_U,j)] -= rdt * (m_rho[j] - m_rho_prev[j]);

            diag[index(c_offset_U, j)] = 1; // Algebraic constraint
ischoegl commented 3 months ago

For anyone interested, I created a simple 2 equation, 1 point control model problem that I was using just to examine the behavior of the algorithm on a small-scale problem. [...]

Examples like this could be added under samples/cxx. We have a Blasius example there that illustrates Cantera's BoundaryValueProblem class.

speth commented 3 months ago

One strange thing that I notice is that if we add the temporal term of the continuity equation to the continuity residual, the standard non-two-point control solve() fails. This seems odd to me. The time component from the Kee book is just d(rho)/dt

The time-dependent solver only supports terms that end up on the diagonal of the Jacobian (hence the name of the variable diag, so there's no way to add a $d\rho/dt$ term to the solver.

wandadars commented 3 months ago

One strange thing that I notice is that if we add the temporal term of the continuity equation to the continuity residual, the standard non-two-point control solve() fails. This seems odd to me. The time component from the Kee book is just d(rho)/dt

The time-dependent solver only supports terms that end up on the diagonal of the Jacobian (hence the name of the variable diag, so there's no way to add a d ρ / d t term to the solver.

Ah yep. That's definitely it. 🤦‍♂️

wandadars commented 3 months ago

If I take the standard solver, and call solve() with successively higher boundary mdot values, going from a starting value of 0.5 to around 7, and then I activate the two-point control, it has no problems with lowering the temperature control points down. Even an increment of 10 K converges without issues ( used to crash at an increment of 3K). This is all at high pressure (5.4e6 Pa). So maybe there's a possibility that the newton solver is seeing some sort of solution at a negative left boundary velocity for low boundary velocities. I'm not sure if there's anything that can be done about that.

wandadars commented 3 months ago

I may be doing something wrong with the Python interface. To set the value of the m_mdot variable at the left and right boundaries for the 1D counterflow diffusion flame, we use the f.fuel_inlet.mdot and f.oxidizer_inlet.mdot properties correct?

In the script below, I have an initial solution at the 5.4e6 Pa, solved using the auto=True option. I then increment the values of the mdot variables at the boundaries by a factor of 1.25 and attempt to re-solve. I get printout results that seem to match what I am doing. The strange thing is that if I examine the backup_2.yaml file after I kill the code inside one of the loop iterations when mdot is being increased, the entry for the boundary looks like:


  fuel_inlet:
    type: inlet
    size: 1
    points: 1
    mass-flux: 5.895811775787321e+07
    temperature: 800.0
    pressure: 5.4e+06
    mass-fractions:
      H2: 1.0
from pathlib import Path
import cantera as ct
import numpy as np
import matplotlib.pyplot as plt

p = 5.4e6 # pressure [Pascals]
#p = 101325 # pressure [Pascals]
#p = 7*101325 # pressure [Pascals]

fuel_inlet_temp= 800.0  # fuel inlet temperature [Kelvin]
oxidizer_inlet_temp= 711.0  # oxidizer inlet temperature [Kelvin]
fuel_mdot= 0.5 # kg/m^2/s
width= 50.0e-3 # Distance between inlets [m]

oxidizer_composition= 'O2:1.0'  # oxidizer composition (right)
fuel_composition= 'H2:1.0'  # fuel composition (left)

# Reaction mechanism
mechanism= 'h2o2.yaml'

# Define output locations
output_path = Path() / "testing_output"
output_path.mkdir(parents=True, exist_ok=True)

gas = ct.Solution(mechanism)

gas.TPY = fuel_inlet_temp, p, oxidizer_composition
density_f = gas.density

gas.TPY = oxidizer_inlet_temp, p, oxidizer_composition
density_o = gas.density

#Unity Lewis number testing
gas.transport_model = 'unity-Lewis-number'

f = ct.CounterflowDiffusionFlame(gas, width=width)

f.set_refine_criteria(ratio=15, slope= 0.15, curve= 0.08, prune= 0.04)
#f.flame.set_steady_tolerances(default=(1e-8,1e-15))
#Set the state of the two inlets
f.fuel_inlet.mdot = fuel_mdot
f.fuel_inlet.X = fuel_composition
f.fuel_inlet.T = fuel_inlet_temp

#Create a guestimate for what the oxidizer mdot would be
f.oxidizer_inlet.mdot = (fuel_mdot / density_f) * density_o*4
f.oxidizer_inlet.X = oxidizer_composition
f.oxidizer_inlet.T = oxidizer_inlet_temp

#'''
# Generate initial condition
f.solve(auto=True, loglevel=0)
f.save('backup.yaml', name="solution", overwrite=True)

print('mdot info:')
print('Fuel mdot: ' + str(f.fuel_inlet.mdot))
print('Oxidizer mdot: ' + str(f.oxidizer_inlet.mdot))
print('BC State:')
print('Left U: ' + str(f.velocity[0]))
print('Right U: ' + str(f.velocity[-1]))
#'''

f.restore('backup.yaml')

#'''
# Generate a suite of strained flames until extinction
while(True):

    fuel_mdot = f.fuel_inlet.mdot
    oxidizer_mdot = f.oxidizer_inlet.mdot

    fuel_mdot *= 1.25
    oxidizer_mdot *= 1.25

    print('New fuel mdot: ')
    print(fuel_mdot)

    f.fuel_inlet.mdot = fuel_mdot
    f.oxidizer_inlet.mdot = oxidizer_mdot
    try:
        f.solve(loglevel=0)
    except ct.CanteraError as e:
        print('Solver did not converge. Stopping.')
        print('Error:')
        print(e)
        break
    f.save('backup_2.yaml', name="solution", overwrite=True)
#'''

f.restore('backup_2.yaml')

print('Starting two-point control')
f.two_point_control_enabled = True

spacing = 0.95
temperature_increment = 0 # Kelvin
maximum_temperature = []
a_max = []
fuel_mdot_history = []
oxidizer_mdot_history = []
control_temperature = np.min(f.T) + spacing*(np.max(f.T) - np.min(f.T))
print(f'Control temperature = {control_temperature} K')
f.set_left_control_point(control_temperature)
f.set_right_control_point(control_temperature)

f.left_control_point_temperature -= temperature_increment
f.right_control_point_temperature -= temperature_increment
f.solve(loglevel=8)
f.save('backup_twoPoint.yaml', overwrite=True)

#f.restore('backup_twoPoint.yaml')
temperature_increment = 10 # Kelvin
f.left_control_point_temperature -= temperature_increment
f.right_control_point_temperature -= temperature_increment
f.solve(loglevel=8,refine_grid=False)
ischoegl commented 3 months ago

In the script below, I have an initial solution at the 5.4e6 Pa, solved using the auto=True option. I then increment the values of the mdot variables at the boundaries by a factor of 1.25 and attempt to re-solve. I get printout results that seem to match what I am doing. The strange thing is that if I examine the backup_2.yaml file after I kill the code inside one of the loop iterations when mdot is being increased, the entry for the boundary looks like:

How recently have you rebased? This may be an artifact of something that @speth fixed recently in #1740 (specifically #1629). Essentially, some updated data were overwritten by older data during serialization.

wandadars commented 3 months ago

In the script below, I have an initial solution at the 5.4e6 Pa, solved using the auto=True option. I then increment the values of the mdot variables at the boundaries by a factor of 1.25 and attempt to re-solve. I get printout results that seem to match what I am doing. The strange thing is that if I examine the backup_2.yaml file after I kill the code inside one of the loop iterations when mdot is being increased, the entry for the boundary looks like:

How recently have you rebased? This may be an artifact of something that @speth fixed recently in #1740 (specifically #1629). Essentially, some updated data were overwritten by older data during serialization.

I will try that. The issue seems to be that, even though I have specified that overwrite=True option, the backup_2.yaml file is not being overwritten. The large values were from a previous run that ran up past the extinction mdots and went to very high values of mdot during the loop.

wandadars commented 3 months ago

If I take the standard solver, and call solve() with successively higher boundary mdot values, going from a starting value of 0.5 to around 7, and then I activate the two-point control, it has no problems with lowering the temperature control points down. Even an increment of 10 K converges without issues ( used to crash at an increment of 3K). This is all at high pressure (5.4e6 Pa). So maybe there's a possibility that the newton solver is seeing some sort of solution at a negative left boundary velocity for low boundary velocities. I'm not sure if there's anything that can be done about that.

I spoke too soon on this. Examining the solution, the left boundary is still going negative, it just is somehow converging to a solution that has a negative u at the left boundary. Somehow that dF(u)/dLambda is getting flipped around.

wandadars commented 3 months ago

Ok, I think I got the line lengths down a bit. I made a new compressions, such as just saying "log(ss)", because we are really just looking at the change in that value, and not the magnitude for the most part. We compare to other values, and so the exact type of logarithm didn't seem critical.

I have some long lines, but I'm not sure what the best way to split them would be so as to not make reading the code overly complicated. I might be missing some fmt-fu that would help to better handle the output positioning information. I've reviewed the different levels from 0 to 6 and I think they behave more clearly compared to the old outputs.

wandadars commented 3 months ago

Thank you for the review comments @speth . One thing that was in my mind while adjusting these print statements was the question: "What is the guiding principle of the log levels here?"

One thing that I tried to do was if a function gets a log level of 1, then it can be succinct in its summary because the statements don't have to worry about lower-level function calls injecting their own print statements into the flow of things. If the log level is higher than that, then the outputs at that level need to be expressed in such a way that they can still be understood even with a bunch of extra statements in-between.

I adjusted where the loglevel was decremented ( I had it essentially decrementing during each nested call down into a function from the highest solve() level before). We have the steady newton summary and the timestep at logleve 2 now. I moved the full state output to one level higher.

speth commented 3 months ago

One thing that I tried to do was if a function gets a log level of 1, then it can be succinct in its summary because the statements don't have to worry about lower-level function calls injecting their own print statements into the flow of things. If the log level is higher than that, then the outputs at that level need to be expressed in such a way that they can still be understood even with a bunch of extra statements in-between.

Yes, I understood that to be one of your main goals with the modifications to the logging, and I think what you have here is an improvement in that regard, by making some of the individual log records more self-contained and by the adjustments to the formatting of the "tabular" output.

wandadars commented 3 months ago

@speth I have that script that parses and plots the output from the debug_sim1d.yaml file. Is that something worth including somewhere in Cantera? It's probably more of a tool for developers versus users.

speth commented 3 months ago

Do you have a link to the script? I think you posted some version of it in one of these threads, but I can't find it now. I think it would be worth preserving in some form. Perhaps on a page within the "Develop" section of the docs (https://cantera.org/dev/develop/index.html) along with some other suggestions on how to dig into unexpected issues with using the 1D code.

wandadars commented 3 months ago

Do you have a link to the script? I think you posted some version of it in one of these threads, but I can't find it now. I think it would be worth preserving in some form. Perhaps on a page within the "Develop" section of the docs (https://cantera.org/dev/develop/index.html) along with some other suggestions on how to dig into unexpected issues with using the 1D code.

For reference (when I get around to adding a page on the script), this is the script.

import matplotlib.pyplot as plt
import yaml
import sys
import os
import re

'''
This script reads the debug output from Cantera when the loglevel is set to 7 or 8.
The data that is output is the history of the previous attempts of the solve()
method's residual and solution vectors.
'''

# Load YAML data
def load_yaml_data(file_path):
    """Load and return the YAML data from the given file path."""
    try:
        with open(file_path, 'r') as file:
            return yaml.safe_load(file)
    except FileNotFoundError:
        print(f"Error: The file '{file_path}' does not exist.")
        sys.exit(1)
    except yaml.YAMLError as exc:
        print(f"Error: Failed to parse YAML file '{file_path}'.\n{exc}")
        sys.exit(1)

# Plot component data against grid points and save the plot
def plot_component(grid, component_data, component_name, data_type, plot_dir):
    """Plot a component's data against grid points and save the plot."""
    plt.figure()

    # Plot line and scatter points
    plt.plot(grid, component_data, label=component_name)
    plt.scatter(grid, component_data, color='red', marker='o', label=f'{component_name} (data points)')

    plt.xlabel('Grid Points')
    plt.ylabel(component_name)
    plt.title(f'{component_name} {data_type} vs Grid Points')  # Include data type
    plt.legend()

    plot_filename = os.path.join(plot_dir, f'{component_name}_{data_type}_vs_grid.png')
    plt.savefig(plot_filename)
    plt.close()

def main():
    if len(sys.argv) < 2:
        print("Usage: python script.py <path_to_yaml_file>")
        sys.exit(1)

    file_path = sys.argv[1]
    data = load_yaml_data(file_path)

    all_data = {}
    for key, value in data.items():
        match = re.match(r'(solution|residual)_(\d+)_(.+)', key)
        if match:
            data_type, solution_num, stage = match.groups()
            solution_num = int(solution_num)
            if solution_num not in all_data:
                all_data[solution_num] = {}
            if stage not in all_data[solution_num]:
                all_data[solution_num][stage] = {}
            all_data[solution_num][stage][data_type] = value['flame']

    for solution_num, stages in all_data.items():
        print(f'Plotting data for solve attempt: {solution_num}')

        for stage, data_types in stages.items():
            for data_type, oneD_data in data_types.items():
                components = oneD_data.get('components')
                grid = oneD_data.get('grid')

                if not components or not grid:
                    print(f"Warning: Missing 'components' or 'grid' data for solution {solution_num}, stage {stage}, type {data_type}. Skipping.")
                    continue

                # Separate directories for "solution" and "residual"
                plot_dir = os.path.join('debug_plots', f'solution_{solution_num}', stage, data_type)
                os.makedirs(plot_dir, exist_ok=True)

                for component in components:
                    if component in oneD_data and component != 'grid':
                        plot_component(grid, oneD_data[component], component, data_type, plot_dir)

                print(f"Plots for {data_type} {solution_num}, stage {stage}, type {data_type} saved in directory: {plot_dir}")

if __name__ == "__main__":
    main()