admb-project / admb

AD Model Builder
http://admb-project.org
Other
64 stars 19 forks source link

Reduce noise in new console output #248

Closed arni-magnusson closed 2 years ago

arni-magnusson commented 2 years ago

The new console output is fantastic: a compact summary of the estimation phases and final fit, with single-line progress bars for uncertainty calculations at the end. Identifying parameters that have a large gradient or are near the bound is especially useful.

After testing the new console output on a wide variety of examples, my only criticism is that it's slightly too noisy near the end. Even after training my eyes on the new format, there's just too many progress bars and exclamations, making it difficult to read the important bits.

With this pull request, I would like to propose the following three modifications that reduce noise in the new console output:

  1. The warning paragraph about parameters near the bound should only be shown when it occurs. Think of glm() in R, which sometimes shows important warnings. It would be considered unnecessarily verbose to display a line in the midst of the GLM standard output saying "no warnings this time". Simply not showing a warning seems like a good way to show there is no warning.

  2. It seems like it should be enough to have two progress meters for the uncertainty calculations at the end. The tasks of (1) calculating the Hessian and (2) differentiating the derived quantities [Jacobian] can each take about as long as estimating the parameters, so it's insightful to see that breakdown: 10 sec, 9 sec, 11 sec. On the other hand, the tasks of (3) inverting the Hessian and the (4) standard error calculations take a tiny fraction of a second so those progress meters can probably be removed to keep the standard output relevant and to the point.

  3. Finally, a couple of cosmetic changes, removing the second colon from "Warning: the following parameters had issues:" and removing the exclamation marks (!) at the end of the progress meters. It seems that the use of exclamation marks is appropriate when something unusual and bad happens, requiring the immediate attention of the user.

I also include the two commits from bug fix #247 (which should probably be merged first) to avoid merge conflicts.

johnoel commented 2 years ago

Also, instead of using percentages, is incrementing by 5s or 10s better?

Calculating Hessian (28 variables): ...5, ...10, ...15, ...20, ...25, ...28 done

arni-magnusson commented 2 years ago

I'm happy with Calculating Hessian (38 variables): 20%, 40%, 60%, 80%, 100% done (0.019 s)

A problem with incrementing by 5s or 10s is the amount of console output when working with highly parameterized models. As an example, a yellowfin tuna model I've inherited has 11671 estimated parameters :)

Cole-Monnahan-NOAA commented 2 years ago

Hi @arni-magnusson thanks for taking a closer look (you're probably the only one). Frankly, I'm a bit lost in the git history between the forks and branches. I'm working on getting the 13.0 version running locally and will test it then. But some immediate thoughts:

(1) Didn't it already do this? For a model without any issues it says "Checking for estimated parameters on bounds... none found!" At least on my machine.

(2) I disagree with "On the other hand, the tasks of (3) inverting the Hessian and the (4) standard error calculations take a tiny fraction of a second so those progress meters can probably be removed to keep the standard output relevant and to the point." This is highly model dependent. The model may be trivial to evaluate (e.g., iid Z), but have 11k parameters and thus the matrix inversion is the most costly. Likewise, you might project a complex, but small (n=5 pars) model out 100 years and thus have a huge burden to do the delta method. I strongly feel all separate components are worth seeing for some models, and thus should all be there.

(3) Yeah that's a nice cleanup.

Cole-Monnahan-NOAA commented 2 years ago

And yes regarding the percentage vs count, I wanted it all to fit on one line so when there's few parameters I do it by count, otherwise percentile.

johnoel commented 2 years ago

Yes, percentiles are best for large number of parameters, but for less than 100 incrementing seems better. Seems odd to run percentiles on say 11 parameters.

arni-magnusson commented 2 years ago

Hi Cole and Johnoel,

I've only tested ADMB 13.0 in Linux and it builds smoothly. Good thing you guys are testing MinGW64 - better to catch errors now than later. Meanwhile, if you have a Linux machine handy, you can clone https://github.com/arni-magnusson/admb, switch to the admb-13-quiet, build ADMB and run models.

  1. In my opinion, it's not necessary to output "Checking for estimated parameters on bounds... none found!" when no parameters are on the bound. I think it's better to not say anything when this warning does not occur. Cleaner output and more in line with how warnings work in R, for example. If people strongly prefer to keep this 'non-message' I'm of course okay with that :)
  2. Thanks for the feedback on this. My comments about relative time spent on computational tasks were based on a variety of models I was able to find, but your explanation is convincing. I'll adapt the pull request to keep the timing messages of inverting the Hessian and the standard error calculations.
  3. Cool :)
Cole-Monnahan-NOAA commented 2 years ago

Tagging @wStockhausen for his thoughts too.

@arni-magnusson point (1) is fair. I guess I thought of it as a bit of a reassurance. Users may not know that test is being done internally, since it's new, without an explicit message. I'm open to dropping it though. Couldn't you make the same argument for things like the terminal mag? We'd only show it if it were bad (didn't pass some test like <0.001)? Also note that a model can pass bound tests in default mode, but fail when run with -hbf due to transformation differences.

@johnoel egarding the percentiles, yes perhaps it makes sense to switch to when n>100 rather than n>10? Seems an easy update and logical.

wStockhausen commented 2 years ago

Sounds good to me.

William T. Stockhausen

Research Fishery Biologist, Alaska Fisheries Science Center

NOAA Fisheries | U.S. Department of Commerce

Office: (206) 526-4241

www.fisheries.noaa.gov

On Wed, May 11, 2022 at 1:07 PM Cole Monnahan @.***> wrote:

Tagging @wStockhausen https://github.com/wStockhausen for his thoughts too.

@arni-magnusson https://github.com/arni-magnusson point (1) is fair. I guess I thought of it as a bit of a reassurance. Users may not know that test is being done internally, since it's new, without an explicit message. I'm open to dropping it though. Couldn't you make the same argument for things like the terminal mag? We'd only show it if it were bad (didn't pass some test like <0.001)? Also note that a model can pass bound tests in default mode, but fail when run with -hbf due to transformation differences.

@johnoel https://github.com/johnoel egarding the percentiles, yes perhaps it makes sense to switch to when n>100 rather than n>10? Seems an easy update and logical.

— Reply to this email directly, view it on GitHub https://github.com/admb-project/admb/pull/248#issuecomment-1124239718, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABMAD5EEBJXSLLW7ETNVLHLVJQHPPANCNFSM5VCNRCEQ . You are receiving this because you were mentioned.Message ID: @.***>

Cole-Monnahan-NOAA commented 2 years ago

@arni-magnusson @wStockhausen

One thing that is now broken is when function_minimizer::output_flag == 0 it still prints all this stuff. That should be a complete suppression of standard console output (for e.g. giant simulations).

I'm not sure where to make these modifications now since we've got so many forks and branches. @johnoel can you advise? Maybe easiest to do it on Arni's repo here so it goes into this PR?

johnoel commented 2 years ago

Created arni-magnusson-admb-13-quiet branch that has Arni's admb-13-quiet merged with admb-13.0. You can use it for testing and development.

Btw, I changed the command line option from -output to -display to avoid confusion. Mollie thought it changed the output files instead of the just the display.

Cole-Monnahan-NOAA commented 2 years ago

@johnoel Thanks for making that new branch. I was able to pull it and test locally with success. Also a good idea to use "display" instead of "output" -- that's more informative. Do you need me to help write the dox?

@arni-magnusson are you willing to make those updates into the code here so that we can merge in the PR directly?

(1) Put the outputs for hess inversion and Sd calcs back in (2) Probably leave off the message for bound checks, if there are none. (3) ignore my last message about '-display 0' as it now works (I was using 'output')

johnoel commented 2 years ago

Reverted changes for (1) in arni-magnusson-admb-13-quiet branch. Created a new pull #250. Please update and test.

Cole-Monnahan-NOAA commented 2 years ago

Tested and confirmed functionality is good. Ready for this to be merged in.

johnoel commented 2 years ago

Pull #250 will be merged instead of this one.