@sbugosen Can you work in a different branch? I intend to come back to this to test the refactored code, and would like the branch to be as I left it on Friday. Don't worry about the commits you've pushed so far, but please don't push any further commits to this branch.
With these updates, we can run the ALAMO parameter sweep with:
python run_alamo_sweep.py --fname="alamo-sweep-n5.csv" --n1=5 --n2=5
This runs the parameter sweep with five sampled parameter values for each of conversion and pressure, and stores the results in DATA_DIR/alamo-sweep-n5.csv, where DATA_DIR is, by default, ./data. n1 and n2 both default to 8, matching the parameter values we used for the paper results.
We can then validate the results with:
python validate_sweep_results.py data/alamo-sweep-n5.csv
This will write the validation results (feasibility and objective value of each set of inputs) to data/alamo-sweep-n5-validation.csv. If we have a baseline file (say, from a full-space parameter sweep), we can generate the fractional objective differences (in the same output file) with:
python validate_sweep_results.py data/alamo-sweep-n5.csv --baseline-fpath=data/fullspace-sweep.csv
Here, the full-space parameter sweep needs to contain the same input parameters as the parameter sweep we are validating.
To run the validation for a single set of parameters (e.g. to inspect/debug a failure), we can run:
python validate_inputs.py data/alamo-sweep-n5.csv --row=10
to run the full space simulation with the inputs specified by row 10 of the parameter sweep.
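As an aside, a validation file can be inspected quickly with something like the sketch below. The column names ("feasible", "objective") are assumptions for illustration; the actual schema is whatever validate_sweep_results.py writes.
import pandas as pd

# Hypothetical post-processing of a validation CSV; column names are assumed.
df = pd.read_csv("data/alamo-sweep-n5-validation.csv")
print(df["feasible"].mean())                  # fraction of feasible sweep points
print(df.loc[~df["feasible"].astype(bool)])   # rows that failed validation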
When I run the ALAMO solve and validation parameter sweep scripts, I get the results shown in the attached files: sweep_results_alamo-validation.csv and sweep_results_alamo.csv.
Note that there are two instances that fail the validation: input indices 53 and 55. @sbugosen I believe you mentioned that you have used code in this branch to reproduce the ALAMO sweep results from the paper. Can you confirm, and send me the results you obtained if so? Also, when you have time, can you run the default ALAMO sweep workflow on this branch and let me know if you also see these two validation failures? The default workflow is:
python run_alamo_sweep.py --fname=alamo-sweep.csv
python validate_sweep_results.py data/alamo-sweep.csv
You can include a baseline file if you have one ready, but I am primarily interested in seeing whether these two simulation failures occur.
Inspecting one of the failures, row 55, I see a solve log that looks like this:
This is Ipopt version 3.13.2, running with linear solver ma27.
Number of nonzeros in equality constraint Jacobian...: 13
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 5
Total number of variables............................: 6
variables with only lower bounds: 0
variables with lower and upper bounds: 2
variables with only upper bounds: 0
Total number of equality constraints.................: 6
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 0.0000000e+00 6.27e+07 0.00e+00 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 0.0000000e+00 1.30e+08 3.63e+00 -1.0 4.56e+03 - 2.75e-01 5.00e-01h 2
2 0.0000000e+00 4.52e+07 8.34e+00 -1.0 4.61e+07 - 1.51e-01 1.00e+00h 1
3 0.0000000e+00 2.27e+07 6.63e+00 -1.0 3.41e+07 - 5.54e-01 2.50e-01h 3
4 0.0000000e+00 7.71e+05 5.04e-01 -1.0 1.36e+07 - 9.25e-01 1.00e+00h 1
5 0.0000000e+00 3.85e+03 3.73e-03 -1.0 1.64e+05 - 9.95e-01 1.00e+00h 1
6 0.0000000e+00 5.76e-02 1.13e-07 -1.0 6.57e+02 - 1.00e+00 1.00e+00h 1
7 0.0000000e+00 4.47e-08 7.09e-14 -2.5 9.65e-03 - 1.00e+00 1.00e+00h 1
8 0.0000000e+00 2.98e-08 1.54e-21 -3.8 3.82e-08 - 1.00e+00 1.00e+00 0
9 0.0000000e+00 1.49e-08 5.38e-23 -5.7 1.51e-08 - 1.00e+00 1.00e+00T 0
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 0.0000000e+00 1.49e-08 7.28e-25 -8.6 2.74e-09 - 1.00e+00 1.00e+00T 0
11 0.0000000e+00 1.49e-08 8.08e-28 -9.0 2.74e-09 - 1.00e+00 1.00e+00T 0
Number of Iterations....: 11
(scaled) (unscaled)
Objective...............: 0.0000000000000000e+00 0.0000000000000000e+00
Dual infeasibility......: 8.0779356694631609e-28 8.0779356694631609e-28
Constraint violation....: 1.4901161193847656e-08 1.4901161193847656e-08
Complementarity.........: 9.0909090909090910e-10 9.0909090909090910e-10
Overall NLP error.......: 1.4901161193847656e-08 1.4901161193847656e-08
Number of objective function evaluations = 21
Number of objective gradient evaluations = 12
Number of equality constraint evaluations = 21
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 12
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 12
Total CPU secs in IPOPT (w/o function evaluations) = 0.001
Total CPU secs in NLP function evaluations = 0.001
EXIT: Search Direction is becoming Too Small.
This seems like a bug in either Ipopt or our Ipopt interface, but seems to go away when I use MA57 as the linear solver. I'd also expect this error to go away if we use a different Ipopt version. I'm currently re-running the validation with MA57 to make sure the results look okay.
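For reference, switching the linear solver on a Pyomo Ipopt solve looks roughly like the sketch below. This is not taken from the validation scripts; it assumes a model m has already been built and that the Ipopt build has the HSL solvers available.
import pyomo.environ as pyo

# Minimal sketch: solve a Pyomo model `m` with Ipopt using MA57 instead of MA27
solver = pyo.SolverFactory("ipopt")
solver.options["linear_solver"] = "ma57"
results = solver.solve(m, tee=True)
print(results.solver.termination_condition)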
Hi Robert, I just finished cleaning up what I have so far for the relaxation.
I'm looking at the validation failures, and the main difference I see from what I did some time ago is the pressure range. I used a step change in pressure of 70000, while you are using a step of 71428. This small difference might be causing the discrepancy at indices 53 and 55. I will re-validate my previous results (with a step of 70000) to see whether I get problems at indices 53 and 55.
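For concreteness, here is a small sketch of how the two spacings compare. The lower bound and range below are placeholders; only the resulting step sizes (70000 vs. roughly 71428) match the numbers discussed here.
import numpy as np

p_lo = 1.0e6     # placeholder lower bound [Pa]
p_range = 5.0e5  # placeholder range [Pa]

# Eight evenly spaced points over the range -> step = range / 7 ~ 71428.6
grid_even = np.linspace(p_lo, p_lo + p_range, 8)

# Eight points with a fixed step of 70000 -> covers a slightly smaller range
grid_fixed = p_lo + 70000 * np.arange(8)

print(np.diff(grid_even)[0])   # ~71428.57
print(np.diff(grid_fixed)[0])  # 70000.0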
Find attached the results you were inquiring about. The main difference is our step size in pressure.
I use a step change in pressure of 70000, while you are using a step of 71428. This small difference might be causing that discrepancy in indices 53 and 55
That makes sense! Thanks for pointing this out. I just chose an even spacing between the upper and lower bounds I saw in your parameter sweep files. I don't have a strong preference on exactly which pressure values we use, as long as the results we are getting are consistent and reproducible.
I'm currently re-running the validation with MA57 to make sure the results look okay.
Three validations fail when I use MA57. I'm currently re-running with Ipopt 3.14 (via CyIpopt) and MA27. Then I will try running with the original pressure values.
Developing these analyses is a bit cumbersome because solve_strongly_connected_components takes so long. Here's a breakdown of where it's spending the most time:
Identifier ncalls cumtime percall %
-------------------------------------------------
root 1 44.159 44.159 100.0
--------------------------------------------
block 547 12.353 0.023 28.0
calc-var 530 23.566 0.044 53.4
igraph 1 7.675 7.675 17.4
solve 16 0.558 0.035 1.3
other n/a 0.006 n/a 0.0
============================================
=================================================
"block" is time spent constructing temporary blocks for the solver, "calc-var" is time spent in calculate_variable_from_constraint
, "igraph" is time spent constructing the IncidenceGraphInterface
, and "solve" is time spent solving with Ipopt (including the nl file write).
I'm pretty sure this can all be done faster, as solving the full problem after solve_strongly_connected_components only takes 0.36 s.
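For reference, a breakdown like the one above can be produced with Pyomo's HierarchicalTimer. The sketch below is illustrative only: the category names are examples, and the actual instrumentation lives inside the decomposition code rather than in user code like this.
from pyomo.common.timing import HierarchicalTimer

timer = HierarchicalTimer()
timer.start("root")

timer.start("calc-var")
# ... the work being timed, e.g. a single 1x1 solve ...
timer.stop("calc-var")

timer.stop("root")
print(timer)  # prints the Identifier / ncalls / cumtime / percall / % table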
With some very simple improvements (switching igraph to use IncidenceMethod.ampl_repn and not re-constructing the graph when block triangularizing), we can make a dent:
Identifier ncalls cumtime percall %
-------------------------------------------------------------------
root 1 32.212 32.212 100.0
--------------------------------------------------------------
calc-var 530 23.002 0.043 71.4
generate-scc 547 8.333 0.015 25.9
--------------------------------------------------
block-triang 1 0.034 0.034 0.4
generate-block 547 8.297 0.015 99.6
other n/a 0.002 n/a 0.0
==================================================
igraph 1 0.343 0.343 1.1
solve 16 0.528 0.033 1.6
other n/a 0.006 n/a 0.0
==============================================================
===================================================================
Here, generate-scc corresponds to block above. We can probably reduce the calc-var overhead by switching to an ASL-based solver even for the 1x1 blocks.
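The igraph change mentioned above looks roughly like the following sketch (the graph-reuse change is internal to the block triangularization code and isn't shown). Here m stands in for the flowsheet model, and ampl_repn requires a recent Pyomo version.
from pyomo.contrib.incidence_analysis import IncidenceGraphInterface
from pyomo.contrib.incidence_analysis.config import IncidenceMethod

# Build the incidence graph using the AMPL-repn-based method
igraph = IncidenceGraphInterface(m, method=IncidenceMethod.ampl_repn)

# In recent Pyomo versions this returns lists of variable/constraint blocks
var_blocks, con_blocks = igraph.block_triangularize()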
By not creating blocks for the 1x1 solves, we can reduce a lot of the block-generation overhead:
Identifier ncalls cumtime percall %
----------------------------------------------------------------------
root 1 26.318 26.318 100.0
-----------------------------------------------------------------
block-triang 1 0.033 0.033 0.1
calc-var 530 22.861 0.043 86.9
igraph 1 0.344 0.344 1.3
solve 16 0.511 0.032 1.9
subsystem-blocks 1 2.565 2.565 9.7
-------------------------------------------------
block 16 0.000 0.000 0.0
external-fcns 16 1.928 0.120 75.2
identify-vars 16 0.635 0.040 24.8
reference 32 0.001 0.000 0.1
other n/a 0.000 n/a 0.0
=================================================
other n/a 0.005 n/a 0.0
=================================================================
======================================================================
Here, subsystem-blocks is roughly equivalent to generate-block above.
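For context, each calc-var call is essentially a 1x1 solve like the toy example below (one constraint, one unknown), performed in pure Python; this is not from the flowsheet.
import pyomo.environ as pyo
from pyomo.util.calc_var_value import calculate_variable_from_constraint

# Toy 1x1 system
m = pyo.ConcreteModel()
m.x = pyo.Var(initialize=1.0)
m.con = pyo.Constraint(expr=m.x**3 + 2 * m.x == 10)

# Solve the single nonlinear equation for m.x in place
calculate_variable_from_constraint(m.x, m.con)
print(pyo.value(m.x))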
With a quick refactor of the external function identification visitor to not repeat work for multiple instances of the same named expression, we can improve performance a bit:
Identifier ncalls cumtime percall %
----------------------------------------------------------------------
root 1 24.738 24.738 100.0
-----------------------------------------------------------------
block-triang 1 0.033 0.033 0.1
calc-var 530 23.144 0.044 93.6
igraph 1 0.342 0.342 1.4
solve 16 0.516 0.032 2.1
subsystem-blocks 1 0.700 0.700 2.8
-------------------------------------------------
block 16 0.000 0.000 0.1
external-fcns 16 0.068 0.004 9.8
identify-vars 16 0.629 0.039 89.9
reference 32 0.002 0.000 0.2
other n/a 0.000 n/a 0.0
=================================================
other n/a 0.005 n/a 0.0
=================================================================
======================================================================
Now the bottleneck is just calculate_variable_from_constraint. I think that, to avoid it, we need to implement a BT solver that uses only a single PyomoNLP (a single nl file). My experience is that this is also not performant, as it leads to quadratic scaling when evaluating constraints (and Jacobians) in a loop. I will try to implement and test this when I have some time.
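A rough sketch of the single-PyomoNLP idea, reusing var_blocks/con_blocks from a block triangularization as above. The per-block submatrix extraction inside the loop is exactly the step I suspect of quadratic scaling; the Newton update itself is omitted.
from pyomo.contrib.pynumero.interfaces.pyomo_nlp import PyomoNLP

# One NLP (one nl file) for the whole system; the model needs an objective,
# even a dummy one, for the nl write.
nlp = PyomoNLP(model)

for var_block, con_block in zip(var_blocks, con_blocks):
    # Residuals and Jacobian restricted to this diagonal block
    resid = nlp.extract_subvector_constraints(con_block)
    jac = nlp.extract_submatrix_jacobian(
        pyomo_variables=var_block, pyomo_constraints=con_block
    )
    # ... perform a Newton update for this block's variables ...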
These performance improvements are in my scc-performance branch of Pyomo.
The next approach replaces calculate_variable_from_constraint with SciPy Newton solvers, which use ASL for function/derivative evaluation. It uses SccImplicitFunctionSolver, which constructs an NLP for each diagonal block of the block triangularization:
Identifier ncalls cumtime percall %
-------------------------------------------------------------------------
root 1 8.710 8.710 100.0
--------------------------------------------------------------------
__init__ 1 8.262 8.262 94.8
------------------------------------------------------
NlpSolver 1 0.016 0.016 0.2
PyomoNLP 1 1.072 1.072 13.0
partition 1 0.389 0.389 4.7
other n/a 6.784 n/a 82.1
======================================================
igraph 1 0.339 0.339 3.9
set_parameters 1 0.088 0.088 1.0
------------------------------------------------------
solve 546 0.085 0.000 96.4
---------------------------------------------
solve_nlp 546 0.084 0.000 99.0
other n/a 0.001 n/a 1.0
=============================================
other n/a 0.003 n/a 3.6
======================================================
other n/a 0.021 n/a 0.2
====================================================================
=========================================================================
This uses a completely different implementation of the BT solver, so the timing categories are not really comparable to those above. However, this performance is pretty good. We're bottlenecked by initialization of data structures, which could be eliminated if we could just use a single NLP. I think this would introduce a new bottleneck with our current ASL interface, but it is worth testing.
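For completeness, a rough usage sketch of the workflow implied by this profile. Only __init__ and set_parameters appear in the timing output above, so the constructor arguments and the way solved values are recovered are assumptions about the interface rather than confirmed details.
from pyomo.contrib.pynumero.algorithms.solvers.implicit_functions import (
    SccImplicitFunctionSolver,
)

# variables, constraints, parameters: lists of Pyomo components defining the
# implicit function (assumed constructor signature)
solver = SccImplicitFunctionSolver(variables, constraints, parameters)

for values in parameter_samples:   # placeholder: one entry per sweep point
    solver.set_parameters(values)  # solves the decomposed system for these parameters
    # ... read the solved values back from the Pyomo variables ...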
I ran the ALAMO sweep with Ipopt 3.14.14 and ma27, and the validation with Ipopt 3.13.2 and ma27. I got three failed instances (indices 5, 22, and 55), although the objective reported by these three failed instances is very close to the real objective. Let me know if you want me to try specific solvers. Attached: alamo-sweep-validation.csv, alamo-sweep.csv