Open wigging opened 3 years ago
The comments below are reposted from an Issue on the SciPy repository about the TR-BDF2 algorithm. The comments are from @laurent90git and are related to code in this repo.
I've quickly tested it. Adding printouts of the time and time step in solve_ivp shows that the solution is actually advancing. I ran out of patience after a few minutes, having reached t ~ 1 ms. As your system is likely very stiff, at least during the initial transient, explicit methods perform very poorly and will explode/overflow unless you use tight integration tolerances. Even then, they will be very inefficient.
By the way, I see that you just call solve_ivp without any integration-related parameters (rtol, atol). I think you should take a closer look at these parameters and their impact on the solution quality.
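For illustration, here is a minimal solve_ivp sketch showing where rtol and atol are set; the one-equation stiff RHS is a hypothetical stand-in for the gasifier system, not code from the repo:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical stiff test problem standing in for the gasifier ODEs;
# exact solution: y(t) = 3 - 0.998*exp(-1000 t) - 2.002*exp(-t) (approx.).
def dydt(t, y):
    return np.array([-1000.0 * y[0] + 3000.0 - 2000.0 * np.exp(-t)])

# solve_ivp defaults are rtol=1e-3, atol=1e-6; tightening them changes
# both the solution quality and the number of steps taken.
loose = solve_ivp(dydt, (0.0, 1.0), [0.0], method="Radau")
tight = solve_ivp(dydt, (0.0, 1.0), [0.0], method="Radau",
                  rtol=1e-6, atol=1e-6)
print(loose.nfev, tight.nfev)
```

Comparing nfev between the two runs gives a quick feel for what the tighter tolerances cost.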
With implicit methods, every integration step is expensive because, as your system is quite nonlinear, the Jacobian of your ODE system must be recomputed very often. As your system is large (1500 ODEs) and SciPy has no information about its sparsity pattern, the solver calls your ODE function dydt at least 1500 times for each Jacobian update... This results in very slow performance.
One thing I noticed is that, in your vector of unknowns, you have sorted the variables by type (all discrete rho_xx values are contiguous, for example). As the different fields interact with one another, this results in a Jacobian sparsity pattern that has components far off the diagonal. For one-dimensional models, it is usually best to sort all the variables according to the index of the mesh cell (or spatial point) they are related to. Within each cell, the variables may be ordered as you wish, as long as the ordering is the same from one cell to the next. This way, the Jacobian will only have nonzero components very close to the diagonal.

Consequently, much more efficient approaches can be used to compute this Jacobian without performing as many function calls. You have something like 15 fields if I remember correctly, all discretised on the same mesh. In the ideal case, with the previous reordering of the unknowns, you could compute the Jacobian in 3*15 = 45 calls only, which is far better (I assume your spatial schemes only use a 3-point stencil, for example as in the second-order centered finite difference for diffusive terms). This assumes the Jacobian is formed by finite differences rather than analytically.

A banded Jacobian may also enable the use of optimized linear solvers for the Newton loop, even though solve_ivp does not offer that level of customisation yet, as far as I am aware. I don't have time this weekend to edit your code to show you what I mean, but I'll try to do that next week!
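To make the banded-Jacobian idea concrete, here is a hedged sketch (hypothetical sizes and a toy diffusion-like RHS, not the gasifier equations) of passing a block-tridiagonal sparsity pattern to solve_ivp via its jac_sparsity option, which lets SciPy's finite-difference Jacobian group columns and use far fewer dydt calls:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.sparse import diags, kron

ncell, nfields = 100, 15                 # hypothetical: 1500 unknowns
n = ncell * nfields

# Cell-major ordering with a 3-point stencil gives a block-tridiagonal
# Jacobian: each cell's fields couple only to themselves and neighbours.
tri = diags([np.ones(ncell - 1), np.ones(ncell), np.ones(ncell - 1)],
            offsets=[-1, 0, 1])
block = np.ones((nfields, nfields))      # all fields within a cell may interact
sparsity = kron(tri, block)              # (1500, 1500) pattern, narrow band

# Toy diffusion-like RHS on the cell-major state vector (illustrative only).
def dydt(t, y):
    u = y.reshape(ncell, nfields)
    out = np.zeros_like(u)
    out[1:-1] = u[2:] - 2.0 * u[1:-1] + u[:-2]
    return out.ravel()

y0 = np.random.default_rng(0).random(n)
sol = solve_ivp(dydt, (0.0, 0.1), y0, method="Radau", jac_sparsity=sparsity)
```

With cell-major ordering and a 3-point stencil, the number of dydt calls per Jacobian drops from roughly n to roughly 3*nfields, which is the 45-call figure quoted above.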
Overall, as far as I've investigated, the issue is not linked to the lack of a performant integration method for stiff equations. I think it's just a combination of stiffness (requiring implicit integration), non-linearity (requiring repeated Jacobian updates) and large system size (inefficient Jacobian evaluation), all of which, added to Python's overall lack of performance (compared to other codes like Fortran or others) result in long computation times.
In the article you cited, I don't see any comment on the bad performance of other solvers (compared to ode23tb). Have you read that somewhere else?
Hi, I've had some more time to look into this. Regarding the base code, it actually works. Setting atol=rtol=1e-6, Radau obtained a solution in ~220 min:
----------------------- Solver Info ------------------------
message: The solver successfully reached the end of the integration interval.
success: True
nfev: 64002
njev: 251
nlu: 834
----------------------- Results Info -----------------------
t0 0.0
tf 10.0
N 100
len t 8806
y shape (1500, 8806)
Execution time = 216m 44s
FYI, nlu represents the number of LU factorisations of the Radau iteration matrix, which involves the Jacobian of dydt and the time step. Hence a new factorisation is performed, roughly, whenever the Jacobian is updated (because of poor convergence) or the time step is changed. So here nlu > njev is expected.
What's important here: 64002 evaluations of your ODE function dydt, WITHOUT counting the calls to this function that were done to compute the Jacobian (which was updated 251 times). We have seen (see the other issue I opened regarding Jacobian update speed) that each Jacobian evaluation needs 1501 calls to dydt.
Hence your dydt was called 251*1501 + 64002 = 440,753 times!
This most likely represents the major part of the computation time! Factorising the iteration matrix and solving the nonlinear systems arising from the Radau method is likely much faster (the problem size is still relatively small at 1500 unknowns).
Here are the output graphs. By the way, I think you should implement a way to automatically back up the solution results, as we currently lose everything once we exit the main() function... It would have been interesting to see what the time step evolution looked like...
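A minimal way to do that in Python (the file name is arbitrary, and the RHS here is a placeholder) is to dump t and y to disk right after the solver returns:

```python
import numpy as np
from scipy.integrate import solve_ivp

def dydt(t, y):
    return -y  # placeholder for the gasifier RHS

sol = solve_ivp(dydt, (0.0, 10.0), np.ones(3), method="Radau")

# Persist the raw results so a multi-hour run is never lost with the process.
np.savez("results.npz", t=sol.t, y=sol.y)

data = np.load("results.npz")  # reload later, e.g. to plot diff(t)
```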
The thing to do now should be to improve the Jacobian update speed, i.e. speed up dydt itself and exploit the near-banded structure of the Jacobian J, so as to drastically decrease the number of function calls per Jacobian update.
Some output graphs:
By the way, I ran a profiler on a single Jacobian update of dydt, which took 46 s. Of that, 36 s were spent in your gas phase _calc_mix_props, of which 20 s were spent doing sums. Regular memory reallocation does not seem to be a problem, which is good news!
Well, it's good to know that a solution can be obtained, but more than 3 hours to get that solution is ridiculous. When I run the Matlab version of the model, it provides a solution in about 8 minutes. For now, I will look at the _calc_mix_props function and see if I can get rid of the for-loops. After that I'll work on your other suggestions. Also, what did you use to profile the Python code?
Are you sure your Matlab model is exactly the same?
It would be interesting to access the number of function evaluations and Jacobian evaluations performed by ode23tb. I am not quite sure you can access them via the ODE solver output as in Python. Maybe you could try using a global variable ncalls which you increment in your Matlab version of dydt. Also, make sure you also use atol=rtol=1e-6 so that the results are somewhat comparable (they won't be exactly, as it is not the same integrator). Do you provide an analytical Jacobian or something similar to Matlab?
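The Python-side equivalent of that global counter, sketched here with a placeholder RHS, is a module-level variable incremented inside dydt. Comparing it with sol.nfev then shows how many calls went into things nfev does not count (per the discussion above, the finite-difference Jacobian):

```python
import numpy as np
from scipy.integrate import solve_ivp

ncalls = 0  # module-level counter, the Python analogue of a Matlab global

def dydt(t, y):
    global ncalls
    ncalls += 1
    return -y  # placeholder RHS, not the gasifier equations

sol = solve_ivp(dydt, (0.0, 1.0), np.ones(2), method="Radau",
                rtol=1e-6, atol=1e-6)

# ncalls counts every call the wrapper sees; the gap between ncalls and
# sol.nfev hints at the Jacobian-related work.
print(ncalls, sol.nfev)
```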
Regarding your _calc_mix_props function, I think the for loops are fine (even though you could maybe avoid them). The main issue is the repeated call to np.sum.
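As a hedged sketch of the kind of change meant here (the variable names are made up, not taken from _calc_mix_props): replace a per-cell loop of np.sum calls with one vectorized operation over all cells:

```python
import numpy as np

# Hypothetical mixture-property calculation: a mass-fraction-weighted
# average over `nsp` species in each of `n` grid cells.
rng = np.random.default_rng(0)
n, nsp = 100, 8
w = rng.random((n, nsp))        # species mass fractions per cell
cp = rng.random(nsp)            # per-species heat capacities

# Loop version: one np.sum call per cell, slow in pure Python.
cp_mix_loop = np.empty(n)
for i in range(n):
    cp_mix_loop[i] = np.sum(w[i] * cp)

# Vectorized version: a single matrix-vector product over all cells.
cp_mix_vec = w @ cp
```

Both versions compute the same values; the vectorized one makes a single call into NumPy instead of n of them.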
By the way, the profiler I use is the one included in the "Spyder" IDE, which is similar to the Matlab IDE, but for Python.
Actually this gives me an idea: can you profile your Matlab code? It will then show the number of function calls, and also the inner workings of the ode23tb solver, which would be quite interesting!
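For completeness, Python's standard-library profiler cProfile can do the same job without Spyder; a minimal sketch with a stand-in function (not code from the repo):

```python
import cProfile
import io
import pstats

def _calc_something(n=200):
    # Stand-in for an expensive model function such as a Jacobian update.
    total = 0.0
    for _ in range(n):
        total += sum(j * j for j in range(n))
    return total

profiler = cProfile.Profile()
profiler.enable()
_calc_something()
profiler.disable()

# Report the most expensive calls, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```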
The Matlab model was used as the basis to develop the Python model. Other than the ODE solver, they use the same system of equations and perform the same calculations. Below is the Matlab code which calls the ode23tb solver function. I did not provide the solver with a Jacobian matrix. The only inputs to the solver are the function that represents the ODEs, the time span labeled xspan, and the initial conditions X0.
tfin = 1000; % total time [s]
xspan = [0, tfin]; % time span [s]
rhob_b0 = 1e-12; % initial biomass concentration [kg/m³]
rhob_c0 = 0; % initial char concentration [kg/m³]
rhob_g0 = 0.15; % initial gas concentration [kg/m³]
% Initial conditions
X0(1:N) = 300; % solid fuel temperature, Ts [K]
X0(N+1:2*N) = 1100; % gas temperature, Tg [K]
X0(2*N+1:3*N) = rhob_b0; % biomass concentration, rhob_b [kg/m³]
X0(3*N+1:4*N) = ugin; % solid fuel velocity, v [m/s]
X0(4*N+1:5*N) = 0.2; % gas mass flux, mfg [kg/s-m^2]
X0(5*N+1:6*N) = rhob_g0; % bulk gas concentration, rhob_g [kg/m³]
X0(6*N+1:7*N) = rhob_g0; % steam concentration, rhob_h2o [kg/m³]
X0(7*N+1:7*N+Ni) = 1100; % inert particle (bed) temperature, Tp [K]
X0(7*N+Ni+1:8*N) = 0; % inert particle (bed) temperature in freeboard, Tp [K]
X0(8*N+1:9*N) = rhob_c0; % char concentration, rhob_c [kg/m³]
X0(9*N+1:10*N) = 0; % H2 concentration, rhob_h2 [kg/m³]
X0(10*N+1:11*N) = 0.0; % CH4 concentration, rhob_ch4 [kg/m³]
X0(11*N+1:12*N) = 0.0; % CO concentration, rhob_co [kg/m³]
X0(12*N+1:13*N) = 0.0; % CO2 concentration, rhob_co2 [kg/m³]
X0(13*N+1:14*N) = 0.0; % Tar concentration, rhob_t [kg/m³]
X0(14*N+1:15*N) = 0.0; % total amount of release char, rhob_ca [kg/m^3]
X0(15*N+1:16*N) = 1100; % wall temperature, Tw [K]
[t, X] = ode23tb(@bfb_gasifier, xspan, X0);
I have compared results from both the Matlab and Python versions after the first couple of time steps and everything looked similar. The trickiest part was making sure the indexing of the arrays was the same in Python as in Matlab. Matlab uses 1 for the first index and Python uses 0 for the first index.
I'll try to profile the Matlab code and get more information about what the solver is doing.
I ran the Matlab code with the ode23tb options shown below. It converged to a solution in 1,525 seconds (about 25 minutes). Notice that this solution is for a time span of 1,000 seconds, whereas the Python version of the model in this repo is set to a time span of 10 seconds for the solver.
opts = odeset('RelTol', 1e-6, 'AbsTol', 1e-6, 'Stats', 'on');
[t, X] = ode23tb(@bfb_gasifier, xspan, X0, opts);
The Matlab ode23tb solver statistics after running the model are shown below.
74347 successful steps
136 failed attempts
272378 function evaluations
56 partial derivatives
1044 LU decompositions
256510 solutions of linear systems
Ok, it's funny: ode23tb takes about 10 times more steps, but performs about five times fewer Jacobian evaluations... Maybe it has a different strategy where the time step is varied less often, but I haven't heard of such a strategy for this method... The fact that your solution goes up to 1000 s instead of 10 s is not so important: by then the system has reached steady state, so the time step quickly increases and it won't cost much more to go from 10 s up to 1000 s.
Could you plot the time step evolution, i.e. log(diff(t)) as a function of physical time? Does your solution look like the Python one?
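A hedged sketch of that plot in Python, with a synthetic time vector standing in for sol.t (or the t returned by ode23tb):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt

# Synthetic t: tiny steps during the stiff transient, larger steps
# once the system approaches steady state.
t = np.concatenate([np.geomspace(1e-8, 1e-2, 50),
                    np.linspace(0.02, 10.0, 50)])

dt = np.diff(t)               # step sizes actually taken
plt.semilogy(t[:-1], dt)      # log scale reveals the stiff transient
plt.xlabel("time [s]")
plt.ylabel("time step [s]")
plt.savefig("timesteps.png")
```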
Also, could you profile the whole Matlab resolution and export the profile result as HTML ? https://fr.mathworks.com/help/matlab/ref/profsave.html
Below is a plot of the time steps where the y-axis is log scale. I also attached the profile results from Matlab which are available in the zip file. Open the file named "file0.html" to view the home page of the results. From there you should be able to dig into the functions and see the relevant lines of code.
Matlab profile results: profile_results.zip
@laurent90git Were you able to view the profile results?
I made several changes to my code since this issue was first posted. See the dynamic model content in the README for the latest information about the code.
I removed the for-loops for calculating the gas mixture properties which seems to speed up the overall solution process. Thanks @laurent90git for pointing out this issue.
I also removed the class objects in favor of using modules that contain just functions. This made it easier for me to code up the differential equations and other calculations.
The dynamic model actually reaches a solution now in about 16 minutes on my laptop. However, I get some overflow warnings while the solver is running (see below). Are there any suggestions on how to prevent these warnings from occurring?
dyn-bfbgasf/solid_phase.py:133: RuntimeWarning: overflow encountered in power
+ hps * (Tp - Ts)
dyn-bfbgasf/solid_phase.py:309: RuntimeWarning: overflow encountered in power
- hps * (Tp - Ts)
dyn-bfbgasf/solid_phase.py:393: RuntimeWarning: overflow encountered in power
qwr = np.pi * Dwi * epb / ((1 - ep) / (ep * epb) + (1 - ew) / ew + 1) * sc * (Tw**4 - Tp**4)
dyn-bfbgasf/gas_phase.py:230: RuntimeWarning: overflow encountered in multiply
SmgV = SmgG + Smgs * (ug + v) - (Smgp - Smgg) * ug - SmgF
----------------------- Solver Info ------------------------
message The solver successfully reached the end of the integration interval.
success True
nfev 63758
njev 291
nlu 1188
----------------------- Results Info -----------------------
t0 0.0
tf 1000.0
len t 8707
y shape (1600, 8707)
elapsed time 16m 22s
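One way to pinpoint where such overflows first occur (a debugging sketch, not a fix; the values are deliberately exaggerated) is to promote NumPy's warnings to exceptions with np.errstate around a suspect expression:

```python
import numpy as np

# np.errstate turns silent overflow warnings into exceptions, which makes
# it easy to locate the first offending expression in a long RHS function.
Tw = np.array([1100.0])
Tp = np.array([1e80])          # exaggerated value so Tp**4 overflows float64

try:
    with np.errstate(over="raise"):
        qwr = Tw**4 - Tp**4    # mimics the radiation term in the warning
except FloatingPointError as err:
    qwr = None
    print("overflow located:", err)
```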
Hi @wigging, sorry for the late reply. Regarding the profiling results, as far as I could tell, Matlab was actually much quicker at evaluating the ODE function, and also used fewer Jacobian updates, hence the better performance.
I am not quite sure there's much more to do about your case. I guess you could try a JIT compiler for Python (Numba, for example); it might improve the overall performance, although in a similar case I personally had no performance improvement... Anyway, it seems it is now on par with Matlab, right?
Regarding the overflow warnings, it is likely that the integrator sometimes takes steps that are too large, which locally results in an incorrect Newton step that sort of diverges, causing overflow in your function... The Newton loop then fails to converge, so Radau retries the step with a smaller time step and the problem disappears, meaning your solution is not affected (just check that you don't have NaNs in your solution history, otherwise your solution is garbage). You could also reduce the likelihood of the problem appearing by tightening the integration tolerances rtol and atol, so that smaller time steps are used. This will also increase the overall quality of the solution, but at the cost of increased computational time. You can play around with the tolerances; 1e-6 is usually a good all-around value, but you can try lower.
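The NaN check mentioned above can be sketched like this (placeholder RHS, not the gasifier model):

```python
import numpy as np
from scipy.integrate import solve_ivp

def dydt(t, y):
    return -y  # placeholder RHS

sol = solve_ivp(dydt, (0.0, 10.0), np.ones(4), method="Radau",
                rtol=1e-6, atol=1e-6)

# Overflow warnings during rejected steps are harmless as long as no
# NaN/inf made it into the accepted solution history.
assert np.all(np.isfinite(sol.y)), "solution history contains NaN/inf"
```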
Hey @laurent90git. Thank you for replying again. I was worried that you gave up on this issue. I really appreciate your help. And don't worry, I'm almost ready to close out this issue so I won't have to bother you again. Anyway, regarding your last comment...
Maybe I misunderstood your comment but it sounds like you tried to use Numba for this code and it didn't give you a performance improvement. Is that correct?
I updated the README for running the latest dynamic model if you want to try it yourself. Code for the dynamic model is in the dyn-bfbgasf folder. I have compared the results to the Matlab version and everything looks similar, although the Matlab version does not give the overflow warnings; that's probably because it's using a different ODE solver.
And regarding your comment about the overflow warnings, I'm using the Radau solver with rtol=1e-6. With this setting I get the overflow warnings but the final results seem to be fine. I also tried rtol=1e-8, atol=1e-8, but this caused even more overflow warnings. It looks like it also caused some problems related to the Jacobian (see below). I eventually quit the program because it was taking too long to find a solution.
dyn-bfbgasf/solid_phase.py:133: RuntimeWarning: overflow encountered in power
+ hps * (Tp - Ts)
dyn-bfbgasf/solid_phase.py:313: RuntimeWarning: overflow encountered in power
- hps * (Tp - Ts)
dyn-bfbgasf/solid_phase.py:397: RuntimeWarning: overflow encountered in power
qwr = np.pi * Dwi * epb / ((1 - ep) / (ep * epb) + (1 - ew) / ew + 1) * sc * (Tw**4 - Tp**4)
dyn-bfbgasf/gas_phase.py:230: RuntimeWarning: overflow encountered in multiply
SmgV = SmgG + Smgs * (ug + v) - (Smgp - Smgg) * ug - SmgF
dyn-bfbgasf/gas_phase.py:197: RuntimeWarning: overflow encountered in multiply
+ Re_dc * 73.69 / (Re_dc + 5.378 * np.exp(6.2122 * sfc)) * np.exp(-5.0748 * sfc)
dyn-bfbgasf/solid_phase.py:245: RuntimeWarning: overflow encountered in multiply
+ Re_dc * 73.69 / (Re_dc + 5.378 * np.exp(6.2122 * sfc)) * np.exp(-5.0748 * sfc)
dyn-bfbgasf/solid_phase.py:126: RuntimeWarning: overflow encountered in true_divide
Re_dc = abs(rhog) * abs(-ug - v) * ds / mu
dyn-bfbgasf/gas_phase.py:193: RuntimeWarning: overflow encountered in true_divide
Re_dc = rhog * np.abs(-ug - v) * ds / mu
dyn-bfbgasf/gas_phase.py:196: RuntimeWarning: invalid value encountered in multiply
24 / Re_dc * (1 + 8.1716 * Re_dc**(0.0964 + 0.5565 * sfc) * np.exp(-4.0655 * sfc))
dyn-bfbgasf/gas_phase.py:197: RuntimeWarning: invalid value encountered in true_divide
+ Re_dc * 73.69 / (Re_dc + 5.378 * np.exp(6.2122 * sfc)) * np.exp(-5.0748 * sfc)
dyn-bfbgasf/gas_phase.py:294: RuntimeWarning: overflow encountered in true_divide
Re_dc = abs(rhog) * abs(-ug - v) * ds / mu
dyn-bfbgasf/solid_phase.py:241: RuntimeWarning: overflow encountered in true_divide
Re_dc = abs(rhog) * abs(-ug - v) * ds / mu
dyn-bfbgasf/solid_phase.py:244: RuntimeWarning: invalid value encountered in multiply
24 / Re_dc * (1 + 8.1716 * Re_dc**(0.0964 + 0.5565 * sfc) * np.exp(-4.0655 * sfc))
dyn-bfbgasf/solid_phase.py:245: RuntimeWarning: invalid value encountered in true_divide
+ Re_dc * 73.69 / (Re_dc + 5.378 * np.exp(6.2122 * sfc)) * np.exp(-5.0748 * sfc)
/Users/gavinw/miniconda3/lib/python3.7/site-packages/scipy/integrate/_ivp/common.py:336: RuntimeWarning: overflow encountered in multiply
new_factor = NUM_JAC_FACTOR_INCREASE * factor[ind]
dyn-bfbgasf/solid_phase.py:133: RuntimeWarning: invalid value encountered in add
+ hps * (Tp - Ts)
/Users/gavinw/miniconda3/lib/python3.7/site-packages/scipy/integrate/_ivp/common.py:358: RuntimeWarning: overflow encountered in multiply
factor[max_diff < NUM_JAC_DIFF_SMALL * scale] *= NUM_JAC_FACTOR_INCREASE
The current implementation of the model using the Radau method with default tolerances does not converge to a solution. The model runs for several minutes and eventually outputs the warnings shown above. I terminated the program after 26 minutes because the solver had not obtained a solution. I'm not sure what's going on, but suggestions on how to use SciPy's solve_ivp function for this gasifier model would be helpful.