pernak18 closed 3 years ago
in the output diagnostics netCDF, these are the arrays with the `trial` dimension: `dCost_*` and `trial_total_cost`, with `dCost_*` existing for every component of the cost function. at least one of these guys must not have all of the trials appended to it
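a quick way to find which array is out of step is to compare lengths along the trial axis. this is only a sketch with plain lists standing in for the netCDF variables; the component names (`dCost_flux_net`, `dCost_heating_rate`) and the helper `find_trial_mismatches` are made up for illustration and are not from the notebook:

```python
def find_trial_mismatches(arrays, reference="trial_total_cost"):
    """Return {name: length} for arrays whose trial length differs from the reference."""
    lengths = {name: len(values) for name, values in arrays.items()}
    expected = lengths[reference]
    return {name: n for name, n in lengths.items() if n != expected}


# toy stand-ins for the diagnostics arrays (values and names are hypothetical)
diag = {
    "trial_total_cost": [0.9, 0.7, 0.8],
    "dCost_flux_net": [0.1, -0.2, 0.05],
    "dCost_heating_rate": [0.3, 0.1, 0.2, 0.4],  # one extra trial
}

mismatches = find_trial_mismatches(diag)
print(mismatches)  # {'dCost_heating_rate': 4}
```

anything that shows up in `mismatches` cannot share a `trial` dimension with `trial_total_cost`.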
this looks to be a problem for all subsequent iterations, but only in the diagnostics. using Eli's data, i can try to reproduce the bug:
% pwd
/global/u1/p/pernak18/RRTMGP/g-point-reduction
% rm -rf xsecs-test/ workdir_band_*
% for WD in `ls -d ~emlawer/emlawer-g-point-reduction/workdir_band_*/`; do ln -s $WD; done
% cp $SCRATCH/RRTMGP/LW_cost-optimize-iter148.pickle LW_cost-optimize.pickle
% ln -s ~emlawer/emlawer-g-point-reduction/fullCF_top-layer/
then run the notebook with FL's LW cost function, with `DIAGNOSTICS = True` and `NITER = 149`, and the error can be reproduced.
fixed with c823b550d14004fab0e59fcbb8b21a5429b1e32d
if we reach full reduction in a given band (`nGpt = 1`), we pop out the associated trial from the cost lists and re-evaluate the optimization (otherwise nothing would happen and we would end up in an infinite loop). the bug was that we didn't pop out the trial from the cost components and delta-cost component arrays, but we did for `totalCost`, so there was an inconsistency in the `trial` dimension.
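the idea behind the fix, sketched with plain python containers: whenever a trial is dropped, drop it from every per-trial structure in one place. this is not the code from the commit; `drop_trial`, the dict layout, and the `flux_net` component name are assumptions for illustration:

```python
def drop_trial(i_trial, total_cost, cost_components, d_cost_components):
    """Remove trial i_trial from the total cost and from every component list."""
    total_cost.pop(i_trial)
    for values in cost_components.values():
        values.pop(i_trial)
    for values in d_cost_components.values():
        values.pop(i_trial)


# three trials; trial 1 belongs to a band that is already fully reduced
total_cost = [0.9, 0.7, 0.8]
cost_components = {"flux_net": [1.0, 2.0, 3.0]}
d_cost_components = {"flux_net": [0.1, 0.2, 0.3]}

drop_trial(1, total_cost, cost_components, d_cost_components)

# every per-trial list now has the same length again
n_trials = {len(total_cost)}
n_trials.update(len(v) for v in cost_components.values())
n_trials.update(len(v) for v in d_cost_components.values())
```

keeping the pops together in one helper makes it harder to update one list and forget the others.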
this is the first time we have gotten to the point of full-band reduction, so i would not be surprised if other similar bugs manifest
When working in the LW, FL says:
the code in the notebook runs in this order: cost computation, optimization determination, diagnostics, writing the pickle file for the iteration, then writing the flux and reduced k-distribution. in this case, the cost and optimization were done for iteration 149, but the failure is in the diagnostics, so no diagnostic, flux, or k-distribution netCDFs are written.
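a toy model of that ordering shows why a diagnostics failure blocks everything after it. the step names and the `run_iteration` helper are placeholders, not the notebook's functions; in this sketch anything listed after the failing step is simply never reached:

```python
def run_iteration(steps):
    """Run steps in order; stop at the first failure and report what completed."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as err:
            return completed, f"{name}: {err}"
        completed.append(name)
    return completed, None


def ok():
    pass


def failing_diagnostics():
    raise ValueError("trial dimension mismatch")


steps = [
    ("cost", ok),
    ("optimization", ok),
    ("diagnostics", failing_diagnostics),
    ("pickle", ok),
    ("flux_and_kdist", ok),
]

completed, failure = run_iteration(steps)
print(completed)  # ['cost', 'optimization']
print(failure)    # diagnostics: trial dimension mismatch
```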