Closed jonmaddock closed 1 week ago
I tested this using an optimising (large tokamak regression test) and non-optimising (once-through at LT solution vector) case.
When non-optimising, debug logs give:
process.caller - DEBUG - New mfile created: evaluating models again to check idempotence
process.caller - DEBUG - Mfiles not idempotent, evaluating models again
process.caller - DEBUG - Mfiles not idempotent, evaluating models again
process.caller - DEBUG - Mfiles not idempotent, evaluating models again
process.caller - DEBUG - Mfiles idempotent, returning
After the 4th model evaluation the mfiles converge, 5 evaluations are required to check this.
In the optimising case:
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations idempotent, returning objective function and constraints
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
process.caller - DEBUG - Model evaluations idempotent, returning objective function and constraints
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations not idempotent: evaluating again
*** ... Solver continues ... ***
process.caller - DEBUG - New optimisation parameter vector being evaluated
process.caller - DEBUG - Model evaluations idempotent, returning objective function and constraints
*** Solution found ***
process.caller - DEBUG - New mfile created: evaluating models again to check idempotence
process.caller - DEBUG - Mfiles idempotent, returning
Varying numbers of re-evaluations are required to gain idempotency when solving: this only requires the objective function and constraints to converge. Once a solution has been found however, it is immediately found to be idempotent, possibly because the solver steps have become so small.
I believe this is now working, and should fix the problem. The main contentious implementation detail is the re-directing of OUT.DAT and MFILE.DAT output to temporary IDEM_OUT.DAT and IDEM_MFILE.DAT files for the purposes of checking idempotence. This was done because the OUT.DAT and MFILE.DAT writing is currently intertwined in Fortran, and difficult to easily separate due to being written all over the place. The scan functionality also needs to be preserved. But mainly the imminent conversion to Python trumps a thorough Fortran refactor here; indeed this could then be done without file IO at all.
This has caused some differences to the large tokamak solution. Of particular note: -19% change in pcurrentdrivemw (Power injected for current drive (MW)) +2.76% t_structural_radial (central solenoid structural radial thickness (m))
In large_tokamak,
pcurrentdrivemw
has changed from 5.93 to 4.75 MW. This is bigger than I would have expected but not a big deal.
(The auxiliary current drive in Amps auxiliary_cd
and the auxiliary current drive fraction faccd
have changed by about the same percentage for the same reason.)
n_cycle
( Allowable number of load cycles till CS fracture) was initially zero for some reason and has changed to 135.
eq_con030
(not sure which constraint this refers to) has increased from -4.35e-09 to 1.87e-4, which is a bit big for comfort.
The CS fatigue constraint does not seem to be active, so I don't think the change in n_cycle
is the cause of the other changes.
Is there an easy way for me to find the complete output files, or if not, can you post them here?
Can you list which variables are changing most between successive evaluations?
@mkovari I've sent you the non-idempotent and idempotent OUT.DATs and MFILE.DATs for the large tokamak case. I've also shared the presentation that goes into detail about the individual variable changes between iterations.
n_cycle
( Allowable number of load cycles till CS fracture) was initially zero for some reason and has changed to 135.
After investigation this change is completely legitimate. When the initial "radial" crack size is more than half the radial thickness of the conduit, the fatigue life is zero. By chance it is just under half in the "idempotent" version, and just over in the "non-idempotent" version.
For discussion of how to ensure sufficient fatigue life, see #3191.
Many of the variables that change the most when the models are re-evaluated are related to the CS, the burn time and the flux swing. We already know that this calculation seems rather circular and very hard to understand. If anyone is feeling clever they should attempt to rewrite this - my brain is too small at the moment.
This problem should not stop this pull request from going ahead.
Couple of minor quality comments. I think if we are going to make such a change it needs to be very clear where files are being written.
We have discussed the possibility of adding a user-defined iteration limit (rather than hard coded 10). Is this something we still want to do?
Thanks @timothy-nunn. Hopefully I've made it clear now where the output files are being written, and whether they're intermediary or final outputs. Regarding the idea of the user-defined model evaluation limit, I think this was mainly a concern in the case of huge slowdowns or Process; by splitting up the the re-evaluations into "solver-required idempotence only" vs. "full final mfile idempotence", the number of total evaluations is purely necessary to get correct results; any fewer and there's a real risk of the solver taking the wrong path or the output being incorrect.
I'm not sure there's much need currently to raise it beyond 10 evaluations; all runs so far indicate idempotence is achieved in 5 evaluations at most. It could be required in future however (for a particular input file/future model modification), but I'm inclined to include it when/if it's required.
I'd appreciate a second review!
Thanks @timothy-nunn, I think I've responded to your comments. I've rebased to allow you to check the regression tests.
Thanks @jonmaddock.
The large_tokamak_once_through
file appears to have some large changes.
WARNING test_process_input_files.py:110 Variable Ref New % Change
WARNING test_process_input_files.py:113 ------------------------------------------------------------
WARNING test_process_input_files.py:115 vsbrn 0.382 279 72913.33
WARNING test_process_input_files.py:115 pfwpmw 0.0683 1.24 1709.38
WARNING test_process_input_files.py:115 tcycle 2.63e+03 9.92e+03 277.05
WARNING test_process_input_files.py:115 coe 161 505 214.61
WARNING test_process_input_files.py:115 ineq_con016 -0.134 0.0132 109.85
WARNING test_process_input_files.py:115 vsstt 279 558 99.65
WARNING test_process_input_files.py:115 pnetelmw 346 405 16.99
WARNING test_process_input_files.py:115 pnetelmw. 346 405 16.99
WARNING test_process_input_files.py:115 c2253 14.6 17.1 16.99
WARNING test_process_input_files.py:115 pnetelmw/powfmw 21.2 24.9 16.99
WARNING test_process_input_files.py:115 pnetelmw/(powfmw+emultmw 17.9 20.9 16.99
WARNING test_process_input_files.py:115 c242 8.64 7.95 -8.01
WARNING test_process_input_files.py:115 c243 7.08 6.48 -8.52
WARNING test_process_input_files.py:115 pacpmw 607 547 -9.90
WARNING test_process_input_files.py:115 cirpowfr 0.603 0.536 -11.17
WARNING test_process_input_files.py:115 cppa 34.6 28.8 -16.64
WARNING test_process_input_files.py:115 c2262 34.6 28.8 -16.64
WARNING test_process_input_files.py:115 psechtmw 285 226 -20.64
WARNING test_process_input_files.py:115 c226 485 330 -32.11
WARNING test_process_input_files.py:115 c2174 11.6 7.53 -35.03
WARNING test_process_input_files.py:115 cryv 2.52e+04 1.64e+04 -35.03
WARNING test_process_input_files.py:115 c2263 342 192 -43.89
WARNING test_process_input_files.py:115 crymw 103 43.7 -57.79
WARNING test_process_input_files.py:115 crypmw 103 43.7 -57.79
WARNING test_process_input_files.py:115 helpow_+_helpow_cryal/1.0d6 0.21 0.0885 -57.79
WARNING test_process_input_files.py:115 qmisc/1.0d6 0.0651 0.0275 -57.79
WARNING test_process_input_files.py:115 bktcycles 5.87e+04 1.56e+04 -73.48
WARNING test_process_input_files.py:115 qac/1.0d6 0.0874 0.00381 -95.65
WARNING test_process_input_files.py:115 peakpoloidalpower 9.9e+03 180 -98.18
@mkovari @jmorris-uk any thoughts?
Can you send or post the two OUT.DAT files?
All the best, Michael
From: Timothy @.> Sent: Monday, June 3, 2024 09:46 To: ukaea/PROCESS @.> Cc: Kovari, Michael D @.>; Mention @.> Subject: Re: [ukaea/PROCESS] Ensure idempotence of model evaluations (PR #3186)
Thanks @jonmaddockhttps://github.com/jonmaddock.
The large_tokamak_once_through file appears to have some large changes.
WARNING test_process_input_files:test_process_input_files.py:110 Variable Ref New % Change WARNING test_process_input_files:test_process_input_files.py:113 ------------------------------------------------------------ WARNING test_process_input_files:test_process_input_files.py:115 vsbrn 0.382 279 72913.33 WARNING test_process_input_files:test_process_input_files.py:115 pfwpmw 0.0683 1.24 1709.38 WARNING test_process_input_files:test_process_input_files.py:115 tcycle 2.63e+03 9.92e+03 277.05 WARNING test_process_input_files:test_process_input_files.py:115 coe 161 505 214.61 WARNING test_process_input_files:test_process_input_files.py:115 ineq_con016 -0.134 0.0132 109.85 WARNING test_process_input_files:test_process_input_files.py:115 vsstt 279 558 99.65 WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw 346 405 16.99 WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw. 346 405 16.99 WARNING test_process_input_files:test_process_input_files.py:115 c2253 14.6 17.1 16.99 WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw/powfmw 21.2 24.9 16.99 WARNING test_process_input_files:test_process_input_files.py:115 pnetelmw/(powfmw+emultmw 17.9 20.9 16.99 WARNING test_process_input_files:test_process_input_files.py:115 c242 8.64 7.95 -8.01 WARNING test_process_input_files:test_process_input_files.py:115 c243 7.08 6.48 -8.52 WARNING test_process_input_files:test_process_input_files.py:115 pacpmw 607 547 -9.90 WARNING test_process_input_files:test_process_input_files.py:115 cirpowfr 0.603 0.536 -11.17 WARNING test_process_input_files:test_process_input_files.py:115 cppa 34.6 28.8 -16.64 WARNING test_process_input_files:test_process_input_files.py:115 c2262 34.6 28.8 -16.64 WARNING test_process_input_files:test_process_input_files.py:115 psechtmw 285 226 -20.64 WARNING test_process_input_files:test_process_input_files.py:115 c226 485 330 -32.11 WARNING test_process_input_files:test_process_input_files.py:115 c2174 11.6 7.53 -35.03 WARNING test_process_input_files:test_process_input_files.py:115 cryv 2.52e+04 1.64e+04 -35.03 WARNING test_process_input_files:test_process_input_files.py:115 c2263 342 192 -43.89 WARNING test_process_input_files:test_process_input_files.py:115 crymw 103 43.7 -57.79 WARNING test_process_input_files:test_process_input_files.py:115 crypmw 103 43.7 -57.79 WARNING test_process_input_files:test_process_inputfiles.py:115 helpow+_helpow_cryal/1.0d6 0.21 0.0885 -57.79 WARNING test_process_input_files:test_process_input_files.py:115 qmisc/1.0d6 0.0651 0.0275 -57.79 WARNING test_process_input_files:test_process_input_files.py:115 bktcycles 5.87e+04 1.56e+04 -73.48 WARNING test_process_input_files:test_process_input_files.py:115 qac/1.0d6 0.0874 0.00381 -95.65 WARNING test_process_input_files:test_process_input_files.py:115 peakpoloidalpower 9.9e+03 180 -98.18
@mkovarihttps://github.com/mkovari @jmorris-ukhttps://github.com/jmorris-uk any thoughts?
— Reply to this email directly, view it on GitHubhttps://github.com/ukaea/PROCESS/pull/3186#issuecomment-2144634352, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACKEVEGPVRUDR7L7WVY4CBTZFQUQHAVCNFSM6AAAAABH37BT5GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBUGYZTIMZVGI. You are receiving this because you were mentioned.
Well this one is weird. The old version doesn't even add up correctly:
Initial charge time for CS from zero current (s) (tramp) 500.000
Plasma current ramp-up time (s) (tohs) 160.911
Heating time (s) (t_fusion_ramp) 10.000
Burn time (s) (tburn) 7.289E+03 OP
Reset time to zero current for CS (s) (tqnch) 160.911
Time between pulses (s) (tdwell) 1800.000
Total plant cycle time (s) (tcycle) 2.632E+03 OP
Part of the problem is that some of the times are calculated in physics.py
, even though there is a separate routine for pulsed reactors: pulse.run
. This is probably a mistake and makes it hard to get the calculations in the right order.
Anyway, all this just confirms that for the time being we need this revision.
@timothy-nunn I think this is ready for a final review now I've resolved the conflicts.
To try to ensure idempotency in the result of the model evaluations whilst being efficient, the idea is to:
This will at least double model evaluations (and hence code runtime), as a comparison of previous and current results for the same optimisation parameter vector is required in each case.