Ensure idempotence of model evaluations

jonmaddock commented 1 month ago

To try to ensure idempotency in the result of the model evaluations whilst being efficient, the idea is to:

When optimising, ensure idempotence for objective function and constraints ONLY
When not optimising or outputting final result, ensure idempotence for all variables in mfile

This will at least double model evaluations (and hence code runtime), as a comparison of previous and current results for the same optimisation parameter vector is required in each case.

jonmaddock commented 1 month ago

I tested this using an optimising (large tokamak regression test) and non-optimising (once-through at LT solution vector) case.

When non-optimising, debug logs give:

process.caller - DEBUG - New mfile created: evaluating models again to check idempotence
process.caller - DEBUG - Mfiles not idempotent, evaluating models again
process.caller - DEBUG - Mfiles not idempotent, evaluating models again
process.caller - DEBUG - Mfiles not idempotent, evaluating models again
process.caller - DEBUG - Mfiles idempotent, returning

After the 4th model evaluation the mfiles converge, 5 evaluations are required to check this.

In the optimising case:

process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations idempotent, returning objective function and constraints
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations idempotent, returning objective function and constraints
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
*** ... Solver continues ... ***
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations idempotent, returning objective function and constraints
*** Solution found ***
process.caller - DEBUG - New mfile created: evaluating models again to check idempotence
process.caller - DEBUG - Mfiles idempotent, returning

Varying numbers of re-evaluations are required to gain idempotency when solving: this only requires the objective function and constraints to converge. Once a solution has been found however, it is immediately found to be idempotent, possibly because the solver steps have become so small.

Summary

I believe this is now working, and should fix the problem. The main contentious implementation detail is the re-directing of OUT.DAT and MFILE.DAT output to temporary IDEM_OUT.DAT and IDEM_MFILE.DAT files for the purposes of checking idempotence. This was done because the OUT.DAT and MFILE.DAT writing is currently intertwined in Fortran, and difficult to easily separate due to being written all over the place. The scan functionality also needs to be preserved. But mainly the imminent conversion to Python trumps a thorough Fortran refactor here; indeed this could then be done without file IO at all.

LT regression differences

This has caused some differences to the large tokamak solution. Of particular note: -19% change in pcurrentdrivemw (Power injected for current drive (MW)) +2.76% t_structural_radial (central solenoid structural radial thickness (m))

mkovari commented 1 month ago

In large_tokamak, pcurrentdrivemw has changed from 5.93 to 4.75 MW. This is bigger than I would have expected but not a big deal. (The auxiliary current drive in Amps auxiliary_cd and the auxiliary current drive fraction faccd have changed by about the same percentage for the same reason.)

n_cycle ( Allowable number of load cycles till CS fracture) was initially zero for some reason and has changed to 135. eq_con030 (not sure which constraint this refers to) has increased from -4.35e-09 to 1.87e-4, which is a bit big for comfort.

The CS fatigue constraint does not seem to be active, so I don't think the change in n_cycle is the cause of the other changes. Is there an easy way for me to find the complete output files, or if not, can you post them here?

mkovari commented 1 month ago

Can you list which variables are changing most between successive evaluations?

jonmaddock commented 1 month ago

@mkovari I've sent you the non-idempotent and idempotent OUT.DATs and MFILE.DATs for the large tokamak case. I've also shared the presentation that goes into detail about the individual variable changes between iterations.

mkovari commented 1 month ago

n_cycle ( Allowable number of load cycles till CS fracture) was initially zero for some reason and has changed to 135.

After investigation this change is completely legitimate. When the initial "radial" crack size is more than half the radial thickness of the conduit, the fatigue life is zero. By chance it is just under half in the "idempotent" version, and just over in the "non-idempotent" version.

For discussion of how to ensure sufficient fatigue life, see #3191.

mkovari commented 1 month ago

Many of the variables that change the most when the models are re-evaluated are related to the CS, the burn time and the flux swing. We already know that this calculation seems rather circular and very hard to understand. If anyone is feeling clever they should attempt to rewrite this - my brain is too small at the moment.

This problem should not stop this pull request from going ahead.

jonmaddock commented 1 month ago

Couple of minor quality comments. I think if we are going to make such a change it needs to be very clear where files are being written.

We have discussed the possibility of adding a user-defined iteration limit (rather than hard coded 10). Is this something we still want to do?

Thanks @timothy-nunn. Hopefully I've made it clear now where the output files are being written, and whether they're intermediary or final outputs. Regarding the idea of the user-defined model evaluation limit, I think this was mainly a concern in the case of huge slowdowns or Process; by splitting up the the re-evaluations into "solver-required idempotence only" vs. "full final mfile idempotence", the number of total evaluations is purely necessary to get correct results; any fewer and there's a real risk of the solver taking the wrong path or the output being incorrect.

I'm not sure there's much need currently to raise it beyond 10 evaluations; all runs so far indicate idempotence is achieved in 5 evaluations at most. It could be required in future however (for a particular input file/future model modification), but I'm inclined to include it when/if it's required.

I'd appreciate a second review!

jonmaddock commented 4 weeks ago

Thanks @timothy-nunn, I think I've responded to your comments. I've rebased to allow you to check the regression tests.

timothy-nunn commented 3 weeks ago

Thanks @jonmaddock.

The large_tokamak_once_through file appears to have some large changes.

WARNING  test_process_input_files.py:110 Variable                     Ref             New        % Change
WARNING  test_process_input_files.py:113 ------------------------------------------------------------
WARNING  test_process_input_files.py:115 vsbrn                      0.382             279        72913.33
WARNING  test_process_input_files.py:115 pfwpmw                    0.0683            1.24         1709.38
WARNING  test_process_input_files.py:115 tcycle                  2.63e+03        9.92e+03          277.05
WARNING  test_process_input_files.py:115 coe                          161             505          214.61
WARNING  test_process_input_files.py:115 ineq_con016               -0.134          0.0132          109.85
WARNING  test_process_input_files.py:115 vsstt                        279             558           99.65
WARNING  test_process_input_files.py:115 pnetelmw                     346             405           16.99
WARNING  test_process_input_files.py:115 pnetelmw.                    346             405           16.99
WARNING  test_process_input_files.py:115 c2253                       14.6            17.1           16.99
WARNING  test_process_input_files.py:115 pnetelmw/powfmw             21.2            24.9           16.99
WARNING  test_process_input_files.py:115 pnetelmw/(powfmw+emultmw            17.9            20.9           16.99
WARNING  test_process_input_files.py:115 c242                        8.64            7.95           -8.01
WARNING  test_process_input_files.py:115 c243                        7.08            6.48           -8.52
WARNING  test_process_input_files.py:115 pacpmw                       607             547           -9.90
WARNING  test_process_input_files.py:115 cirpowfr                   0.603           0.536          -11.17
WARNING  test_process_input_files.py:115 cppa                        34.6            28.8          -16.64
WARNING  test_process_input_files.py:115 c2262                       34.6            28.8          -16.64
WARNING  test_process_input_files.py:115 psechtmw                     285             226          -20.64
WARNING  test_process_input_files.py:115 c226                         485             330          -32.11
WARNING  test_process_input_files.py:115 c2174                       11.6            7.53          -35.03
WARNING  test_process_input_files.py:115 cryv                    2.52e+04        1.64e+04          -35.03
WARNING  test_process_input_files.py:115 c2263                        342             192          -43.89
WARNING  test_process_input_files.py:115 crymw                        103            43.7          -57.79
WARNING  test_process_input_files.py:115 crypmw                       103            43.7          -57.79
WARNING  test_process_input_files.py:115 helpow_+_helpow_cryal/1.0d6         0.21          0.0885          -57.79
WARNING  test_process_input_files.py:115 qmisc/1.0d6               0.0651          0.0275          -57.79
WARNING  test_process_input_files.py:115 bktcycles               5.87e+04        1.56e+04          -73.48
WARNING  test_process_input_files.py:115 qac/1.0d6                 0.0874         0.00381          -95.65
WARNING  test_process_input_files.py:115 peakpoloidalpower        9.9e+03             180          -98.18

@mkovari @jmorris-uk any thoughts?

mkovari commented 3 weeks ago

Can you send or post the two OUT.DAT files?

All the best, Michael

From: Timothy @.> Sent: Monday, June 3, 2024 09:46 To: ukaea/PROCESS @.> Cc: Kovari, Michael D @.>; Mention @.> Subject: Re: [ukaea/PROCESS] Ensure idempotence of model evaluations (PR #3186)

Thanks @jonmaddockhttps://github.com/jonmaddock.

The large_tokamak_once_through file appears to have some large changes.

WARNING test_process_input_files:test_process_input_files.py:110 Variable WARNING test_process_input_files:test_process_input_files.py:113 WARNING test_process_input_files:test_process_input_files.py:115 vsbrn WARNING test_process_input_files:test_process_input_files.py:115 pfwpmw WARNING test_process_input_files:test_process_input_files.py:115 tcycle WARNING test_process_input_files:test_process_input_files.py:115 coe WARNING test_process_input_files:test_process_input_files.py:115 ineq_con016 WARNING test_process_input_files:test_process_input_files.py:115 vsstt WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw. WARNING test_process_input_files:test_process_input_files.py:115 c2253 WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw/powfmw WARNING test_process_input_files:test_process_input_files.py:115 WARNING test_process_input_files:test_process_input_files.py:115 c242 WARNING test_process_input_files:test_process_input_files.py:115 c243 WARNING test_process_input_files:test_process_input_files.py:115 pacpmw WARNING test_process_input_files:test_process_input_files.py:115 cirpowfr WARNING test_process_input_files:test_process_input_files.py:115 cppa WARNING test_process_input_files:test_process_input_files.py:115 c2262 WARNING test_process_input_files:test_process_input_files.py:115 psechtmw WARNING test_process_input_files:test_process_input_files.py:115 c226 WARNING test_process_input_files:test_process_input_files.py:115 c2174 WARNING test_process_input_files:test_process_input_files.py:115 cryv WARNING test_process_input_files:test_process_input_files.py:115 c2263 WARNING test_process_input_files:test_process_input_files.py:115 crymw WARNING test_process_input_files:test_process_input_files.py:115 crypmw WARNING test_process_input_files:test_process_inputfiles.py:115 WARNING test_process_input_files:test_process_input_files.py:115 qmisc/1.0d6 WARNING test_process_input_files:test_process_input_files.py:115 bktcycles WARNING test_process_input_files:test_process_input_files.py:115 qac/1.0d6 WARNING test_process_input_files:test_process_input_files.py:115 peakpoloidalpower Ref New % Change ------------------------------------------------------------ 0.382 279 72913.33 0.0683 1.24 1709.38 2.63e+03 9.92e+03 277.05 161 505 214.61 -0.134 0.0132 109.85 279 558 99.65 346 405 16.99 346 405 16.99 14.6 17.1 16.99 21.2 24.9 16.99 pnetelmw/(powfmw+emultmw 17.9 20.9 16.99 8.64 7.95 -8.01 7.08 6.48 -8.52 607 547 -9.90 0.603 0.536 -11.17 34.6 28.8 -16.64 34.6 28.8 -16.64 285 226 -20.64 485 330 -32.11 11.6 7.53 -35.03 2.52e+04 1.64e+04 -35.03 342 192 -43.89 103 43.7 -57.79 103 43.7 -57.79 helpow+_helpow_cryal/1.0d6 0.21 0.0885 -57.79 0.0651 0.0275 -57.79 5.87e+04 1.56e+04 -73.48 0.0874 0.00381 -95.65 9.9e+03 180 -98.18

@mkovarihttps://github.com/mkovari @jmorris-ukhttps://github.com/jmorris-uk any thoughts?

— Reply to this email directly, view it on GitHubhttps://github.com/ukaea/PROCESS/pull/3186#issuecomment-2144634352, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACKEVEGPVRUDR7L7WVY4CBTZFQUQHAVCNFSM6AAAAABH37BT5GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBUGYZTIMZVGI. You are receiving this because you were mentioned.

mkovari commented 3 weeks ago

Well this one is weird. The old version doesn't even add up correctly:

 Initial charge time for CS from zero current (s)                         (tramp)                     500.000     
 Plasma current ramp-up time (s)                                          (tohs)                      160.911     
 Heating time (s)                                                         (t_fusion_ramp)              10.000     
 Burn time (s)                                                            (tburn)                   7.289E+03  OP 
 Reset time to zero current for CS (s)                                    (tqnch)                     160.911     
 Time between pulses (s)                                                  (tdwell)                   1800.000     

 Total plant cycle time (s)                                               (tcycle)                  2.632E+03  OP

Part of the problem is that some of the times are calculated in physics.py, even though there is a separate routine for pulsed reactors: pulse.run. This is probably a mistake and makes it hard to get the calculations in the right order.

https://github.com/ukaea/PROCESS/blob/dfcda5924ae8403d1ecb22bad39f3b68bb9a5dc2/process/physics.py#L905 etc.

Anyway, all this just confirms that for the time being we need this revision.

jonmaddock commented 1 week ago

@timothy-nunn I think this is ready for a final review now I've resolved the conflicts.

ukaea / PROCESS

Ensure idempotence of model evaluations #3186

Summary

LT regression differences